Migrate to peft from opendelta for parameter efficient tuning methods #434

@jon-tow

Description

🚀 The feature, motivation, and pitch

Let's migrate from opendelta to peft for parameter-efficient tuning.

Tasks

Doing so will require the following updates:

  1. Replace the opendelta setup in the AccelerateBaseTrainer with a peft-backed setup (see the first sketch after this list):

    if self.config.model.delta_kwargs is not None:
        delta_type, delta_kwargs = parse_delta_kwargs(
            model.base_model.config,
            self.config.model.delta_kwargs,
            self.config.model.num_layers_unfrozen,
        )
        delta_model_class = get_delta_model_class(delta_type)
        delta_model = delta_model_class(model.base_model, **delta_kwargs)
        delta_model.freeze_module(exclude=["deltas"], set_state_dict=True)
        if self.accelerator.is_main_process:
            delta_model.log()

  2. Handle fine-grained layer capturing so that only the upper trunk layers of hydra architectures are modified, as currently done below (see the second sketch after this list):

    trlx/trlx/utils/modeling.py

    Lines 414 to 428 in 92b68e4

    def get_delta_modified_modules(
        config: transformers.PretrainedConfig,
        modified_modules: List[str],
        num_layers_unfrozen: int = -1,
    ) -> List[str]:
        """Returns a list of module names to be modified for a given delta method with
        the specified number of learnable layers."""
        unfrozen_layers_pattern = generate_layer_regex(config, num_layers_unfrozen)
        # [r] for regex as per https://github.com/thunlp/OpenDelta/blob/main/opendelta/utils/name_based_addressing.py#L20
        regex_prefix = "[r]"
        # TODO (jon-tow): `decoder.block.` is hardcoded to support T5 layer naming.
        decoder_prefix = "decoder.block." if config.is_encoder_decoder else ""
        module_list = [regex_prefix + decoder_prefix + unfrozen_layers_pattern + module for module in modified_modules]
        return module_list
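
For step 1, a minimal sketch of what a peft-backed replacement could look like, assuming a peft_kwargs config field analogous to delta_kwargs (the names here are illustrative, not a final API):

    from peft import LoraConfig, TaskType, get_peft_model

    if self.config.model.peft_kwargs is not None:
        # Build the LoRA config from user-supplied kwargs (e.g. r, lora_alpha,
        # target_modules); `peft_kwargs` is a hypothetical config field.
        peft_config = LoraConfig(
            task_type=TaskType.CAUSAL_LM,
            **self.config.model.peft_kwargs,
        )
        # Wrapping the base model freezes its weights and injects trainable
        # adapter layers, replacing the opendelta delta model.
        model.base_model = get_peft_model(model.base_model, peft_config)
        if self.accelerator.is_main_process:
            model.base_model.print_trainable_parameters()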
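
For step 2, recent peft versions expose a layers_to_transform argument on LoraConfig that restricts adapters to specific layer indices, which could replace the regex-based module matching in get_delta_modified_modules. A rough sketch, with a hypothetical helper name:

    from typing import List, Optional

    import transformers

    def get_peft_layers_to_transform(
        config: transformers.PretrainedConfig,
        num_layers_unfrozen: int = -1,
    ) -> Optional[List[int]]:
        """Hypothetical replacement for get_delta_modified_modules: return the
        indices of the upper trunk layers to adapt, or None to adapt all layers."""
        if num_layers_unfrozen <= 0:
            return None
        # Decoder-style configs expose `num_hidden_layers`; T5-style configs use `num_layers`.
        num_layers = getattr(config, "num_hidden_layers", None) or config.num_layers
        return list(range(num_layers - num_layers_unfrozen, num_layers))

    # The resulting indices can be passed as LoraConfig(layers_to_transform=...),
    # replacing the "[r]"-prefixed regex module names used with opendelta.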

Motivation

Citing @ethankim00's concerns with opendelta:

  • Importing opendelta can fail due to an unnecessary turtle package import; even when opendelta is pip-installed, users may need sudo privileges to install the underlying base graphics package (ModuleNotFoundError caused by turtle package, thunlp/OpenDelta#47)
  • Doesn't appear to work with DeepSpeed ZeRO 3
  • Additional inference overhead because the LoRA adapter layers are not merged into the base weights
  • Incompatibility with int8 training
  • Less actively maintained than the peft library, which has been growing rapidly
  • Sharing adapter weights on the Hugging Face Hub is less convenient with opendelta

Alternatives

No response

Additional context

No response
