
Conversation

haofanwang

What does this PR do?

Supports the feature requested in huggingface/diffusers#2469. Stable Diffusion uses the CLIP text encoder, which does not yet support adding LoRA layers. This PR adds that support (`attn_processors` / `set_attn_processor`) in a way that closely mirrors `UNet2DConditionModel`.

What to expect after this PR?

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer
from diffusers.models.cross_attention import LoRACrossAttnProcessor

tokenizer = CLIPTokenizer.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="text_encoder")
text_encoder.requires_grad_(False)

# add LoRA layers (the text encoder only has self-attention, so cross_attention_dim stays None)
lora_attn_procs = {}
for name in text_encoder.attn_processors.keys():
    cross_attention_dim = None if name.endswith("self_attn.processor") else text_encoder.config.hidden_size
    hidden_size = text_encoder.config.hidden_size
    lora_attn_procs[name] = LoRACrossAttnProcessor(
        hidden_size=hidden_size, cross_attention_dim=cross_attention_dim
    )
text_encoder.set_attn_processor(lora_attn_procs)

inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")
outputs = text_encoder(**inputs)

# only the added LoRA weights require gradients
for name, param in text_encoder.named_parameters():
    print(name, param.requires_grad)
```
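For context, the low-rank update a LoRA attention processor applies to a frozen projection can be sketched in plain PyTorch. This is not code from the PR, just a hypothetical `LoRALinear` wrapper illustrating the idea: y = Wx + (alpha/r) * B(A(x)), with the up-projection zero-initialized so training starts from the pretrained behavior.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA wrapper: frozen base layer plus a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        self.base.requires_grad_(False)  # pretrained weight stays frozen
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)   # zero init: update starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

proj = nn.Linear(768, 768)
lora = LoRALinear(proj)
x = torch.randn(2, 77, 768)
```

With the zero-initialized up-projection, `lora(x)` initially matches `proj(x)` exactly, and only the `down`/`up` matrices carry gradients.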

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Feb 23, 2023

The documentation is not available anymore as the PR was closed or merged.

@sgugger
Collaborator

sgugger commented Feb 24, 2023

The support for LoRA should be done using our new peft library. We won't change Transformers models directly. cc @pacman100 @patrickvonplaten

@haofanwang
Author

Sure, that makes sense to me. Good to know. I'll open a new PR in diffusers instead.
