[examples] add train flux-controlnet scripts in example. #9324


Merged: 68 commits, Sep 27, 2024
8ab9b5b
add train flux-controlnet scripts in example.
PromeAIpro Aug 30, 2024
4a53573
fix error
PromeAIpro Aug 30, 2024
14e9970
fix subfolder error
PromeAIpro Sep 1, 2024
3bb431c
Merge branch 'main' into flux-controlnet-train
yiyixuxu Sep 4, 2024
973c6fb
fix preprocess error
PromeAIpro Sep 4, 2024
599c984
Merge branch 'flux-controlnet-train_x' into flux-controlnet-train
PromeAIpro Sep 4, 2024
24b58f8
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 4, 2024
22a3e10
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 5, 2024
32eb1ef
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 13, 2024
57d143b
Update examples/controlnet/README_flux.md
PromeAIpro Sep 13, 2024
af1b7a5
Update examples/controlnet/README_flux.md
PromeAIpro Sep 13, 2024
d19b101
fix readme
Sep 13, 2024
64251ac
fix note error
Sep 14, 2024
c98d43f
add some Tutorial for deepspeed
Sep 14, 2024
569e0de
fix some Format Error
Sep 14, 2024
916fd80
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 14, 2024
67deb7a
add dataset_path example
Sep 15, 2024
76bcf5a
Merge branch 'flux-controlnet-train' of https://github.com/PromeAIpro…
Sep 15, 2024
32fbeac
remove print, add guidance_scale CLI, readable apply
Sep 15, 2024
b03cb01
Update examples/controlnet/README_flux.md
PromeAIpro Sep 15, 2024
7b98459
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 15, 2024
443f251
update,push_to_hub,save_weight_dtype,static method,clear_objs_and_ret…
Sep 16, 2024
bc68f1a
add push to hub in readme
Sep 16, 2024
fe2a587
apply weighting schemes
Sep 16, 2024
3dc16ca
add note
Sep 16, 2024
aff0951
Update examples/controlnet/README_flux.md
PromeAIpro Sep 16, 2024
b858507
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 16, 2024
7bdf9e3
make code style and quality
Sep 19, 2024
ba45495
Merge branch 'flux-controlnet-train' of https://github.com/PromeAIpro…
Sep 19, 2024
c862d39
fix some unnoticed error
Sep 19, 2024
4b979e0
make code style and quality
Sep 19, 2024
0655a75
Merge branch 'main' into flux-controlnet-train
sayakpaul Sep 19, 2024
90badc2
add example controlnet in readme
Sep 19, 2024
4755557
Merge branch 'flux-controlnet-train' of https://github.com/PromeAIpro…
Sep 19, 2024
e3d10bc
add test controlnet
Sep 19, 2024
f9400a6
rm Remove duplicate notes
Sep 19, 2024
192bbee
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 19, 2024
de06965
Fix formatting errors
Sep 19, 2024
8ee2daf
Merge branch 'flux-controlnet-train' of https://github.com/PromeAIpro…
Sep 19, 2024
17fc1ee
add new control image
Sep 20, 2024
213faf9
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 20, 2024
b533cae
add model cpu offload
Sep 23, 2024
be965f0
Merge branch 'flux-controlnet-train' of https://github.com/PromeAIpro…
Sep 23, 2024
a2daa9f
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 23, 2024
4d7c1af
update help for adafactor
Sep 23, 2024
a11219c
Merge branch 'main' into flux-controlnet-train
sayakpaul Sep 24, 2024
49a1492
make quality & style
Sep 24, 2024
6169b61
Merge branch 'main' into flux-controlnet-train
sayakpaul Sep 24, 2024
d895b8f
make quality and style
Sep 24, 2024
395d2f7
Merge branch 'flux-controlnet-train' of https://github.com/PromeAIpro…
Sep 24, 2024
b6a9021
rename flux_controlnet_model_name_or_path
Sep 24, 2024
66dfdbe
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 25, 2024
b097d0d
fix back src/diffusers/pipelines/flux/pipeline_flux_controlnet.py
Sep 25, 2024
49787e3
fix dtype error by pre calculate text emb
Sep 26, 2024
eb64557
Merge branch 'main' into flux-controlnet-train
PromeAIpro Sep 26, 2024
e9d3e04
rm image save
Sep 26, 2024
7245c75
Merge branch 'main' into flux-controlnet-train
sayakpaul Sep 26, 2024
25fc313
quality fix
Sep 26, 2024
c2b44d3
Merge branch 'flux-controlnet-train' of https://github.com/PromeAIpro…
Sep 26, 2024
bc2ea9e
Merge branch 'main' into flux-controlnet-train
sayakpaul Sep 26, 2024
2ee67c4
fix test
Sep 27, 2024
7ab1b80
Merge branch 'flux-controlnet-train' of https://github.com/PromeAIpro…
Sep 27, 2024
56cd984
Merge branch 'main' into flux-controlnet-train
sayakpaul Sep 27, 2024
7cedfb1
fix tiny flux train error
Sep 27, 2024
ee6ca90
Merge branch 'flux-controlnet-train' of https://github.com/PromeAIpro…
Sep 27, 2024
dcac1b0
change report to to tensorboard
Sep 27, 2024
89a1f35
fix save name error when test
Sep 27, 2024
6ccd3e4
Fix shrinking errors
Sep 27, 2024
177 changes: 177 additions & 0 deletions examples/controlnet/README_flux.md
# ControlNet training example for FLUX

The `train_controlnet_flux.py` script shows how to implement the ControlNet training procedure and adapt it for [FLUX](https://github.com/black-forest-labs/flux).

Training script provided by LibAI, an institution dedicated to the progress and achievement of artificial general intelligence. LibAI is the developer of [cutout.pro](https://www.cutout.pro/) and [promeai.pro](https://www.promeai.pro/).

## Running locally with PyTorch

### Installing the dependencies

Before running the scripts, make sure to install the library's training dependencies:

**Important**

To make sure you can successfully run the latest versions of the example scripts, we highly recommend **installing from source** and keeping the install up to date as we update the example scripts frequently and install some example-specific requirements. To do this, execute the following steps in a new virtual environment:

```bash
git clone https://github.com/huggingface/diffusers
cd diffusers
pip install -e .
```

Then cd into the `examples/controlnet` folder and run
```bash
pip install -r requirements_flux.txt
```

And initialize an [🤗Accelerate](https://github.com/huggingface/accelerate/) environment with:

```bash
accelerate config
```

Or for a default accelerate configuration without answering questions about your environment

```bash
accelerate config default
```

Or if your environment doesn't support an interactive shell (e.g., a notebook)

```python
from accelerate.utils import write_basic_config
write_basic_config()
```

When running `accelerate config`, specifying torch compile mode as True can give dramatic speedups.

## Custom Datasets

We support importing data from a JSON Lines file (`xxx.jsonl`); here is a brief example:
```json
{"image_path": "xxx", "caption": "xxx", "control_path": "xxx"}
{"image_path": "xxx", "caption": "xxx", "control_path": "xxx"}
```
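A file in this layout can be generated with a few lines of standard-library Python; the record contents below are hypothetical placeholders for your own image, caption, and conditioning-image paths:

```python
import json

# Hypothetical records: in practice these paths point at your target images,
# captions, and conditioning (control) images.
records = [
    {"image_path": "images/0001.png", "caption": "a red circle", "control_path": "control/0001.png"},
    {"image_path": "images/0002.png", "caption": "a cyan circle", "control_path": "control/0002.png"},
]

# Write one JSON object per line (the JSONL convention).
with open("train.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Reading it back: parse each line independently.
with open("train.jsonl") as f:
    loaded = [json.loads(line) for line in f]

assert loaded == records
```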


## Training

Our training examples use two test conditioning images. They can be downloaded by running

```sh
wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png
wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png
```

Then run `huggingface-cli login` to log into your Hugging Face account. This is needed to be able to push the trained ControlNet parameters to Hugging Face Hub.

We can set `num_double_layers` and `num_single_layers`, which determine the size of the ControlNet (default values are `num_double_layers=4`, `num_single_layers=10`).


```bash
export MODEL_DIR="black-forest-labs/FLUX.1-dev"
export OUTPUT_DIR="path to save model"
export TRAIN_JSON_FILE="path to your jsonl file"


accelerate launch train_controlnet_flux.py \
  --pretrained_model_name_or_path=$MODEL_DIR \
  --conditioning_image_column=control_path \
  --image_column=image_path \
  --caption_column=caption \
  --output_dir=$OUTPUT_DIR \
  --jsonl_for_train=$TRAIN_JSON_FILE \
  --mixed_precision="bf16" \
  --resolution=512 \
  --learning_rate=1e-5 \
  --max_train_steps=15000 \
  --validation_steps=100 \
  --checkpointing_steps=200 \
  --validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
  --validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --report_to="tensorboard" \
  --num_double_layers=4 \
  --num_single_layers=0 \
  --seed=42
```
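With `--train_batch_size=1` and `--gradient_accumulation_steps=4`, the effective batch size per optimizer step is their product times the number of processes `accelerate` launches. A quick sanity check of that arithmetic:

```python
# Effective batch size = per-device batch size
#   * gradient accumulation steps
#   * number of processes (GPUs) launched by accelerate.
def effective_batch_size(train_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_processes: int = 1) -> int:
    return train_batch_size * gradient_accumulation_steps * num_processes

# The flags used in the command above, on a single GPU:
print(effective_batch_size(1, 4, 1))  # 4
```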

To better track our training experiments, we're using the following flags in the command above:

* `report_to="tensorboard"` will ensure the training runs are tracked on TensorBoard.
* `validation_image`, `validation_prompt`, and `validation_steps` to allow the script to do a few validation inference runs. This allows us to qualitatively check if the training is progressing as expected.

Our experiments were conducted on a single 80GB A100 GPU.
**Member:** Wow, 40GB A100 seems doable.

**Contributor Author:** Sorry, this is actually an 80GB A100 (I wrote it wrong). I did a lot of extra work to get it to train with DeepSpeed ZeRO-3 on a 40GB A100, but I don't think that setup is suitable for everyone.

**Member:** Not at all. I think it would still be nice to include the changes you had to make in the form of notes in the README. Does that work?

**Contributor Author:** I'll see if I can add it later.

**Contributor Author:** @sayakpaul We added a tutorial on configuring DeepSpeed in the README.

**Contributor:** There are some tricks to lower GPU memory usage:

1. `gradient_checkpointing`
2. bf16 or fp16 mixed precision
3. batch size 1, with `gradient_accumulation_steps` above 1

With 1, 2, and 3, can this be trained under 40GB?
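Those three tricks map onto launch flags roughly as follows. This is a sketch: `--gradient_checkpointing` is assumed to be supported here because it is standard across the diffusers training examples, and the dataset flags are elided for brevity.

```bash
accelerate launch train_controlnet_flux.py \
  --pretrained_model_name_or_path=$MODEL_DIR \
  --output_dir=$OUTPUT_DIR \
  --jsonl_for_train=$TRAIN_JSON_FILE \
  --mixed_precision="bf16" \
  --gradient_checkpointing \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4
```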

**Contributor Author (@PromeAIpro, Sep 14, 2024):** In my experience, DeepSpeed ZeRO-3 must be used. @linjiapro, your settings cost about 70GB at 1024 resolution with batch size 1, or at 512 with batch size 3.

**Commenter:** Sorry to bother you: have you ever tried caching the text-encoder and VAE latents to run with lower GPU memory? @PromeAIpro @linjiapro

**Contributor Author:** Text-encoder caching is already available in this script (saving about 10GB of GPU memory on T5). For caching VAE latents, see the DeepSpeed instructions in the README, which include VAE caching.

**Contributor:** FYI, you can also reduce memory usage by using optimum-quanto and qint8-quantizing all of the modules except the ControlNet (weight quantization only, not activation quantization). I ran some experiments on this with my own ControlNet training script and it seems to work just fine.


### Inference

Once training is done, we can perform inference like so:

```python
import torch
from diffusers.utils import load_image
from diffusers.pipelines.flux.pipeline_flux_controlnet import FluxControlNetPipeline
from diffusers.models.controlnet_flux import FluxControlNetModel

base_model = "black-forest-labs/FLUX.1-dev"
controlnet_model = "path to controlnet"
controlnet = FluxControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)
pipe = FluxControlNetPipeline.from_pretrained(
    base_model, controlnet=controlnet, torch_dtype=torch.bfloat16
)
pipe.to("cuda")

control_image = load_image("./conditioning_image_1.png").resize((1024, 1024))
prompt = "pale golden rod circle with old lace background"

image = pipe(
    prompt,
    control_image=control_image,
    controlnet_conditioning_scale=0.6,
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("./output.png")
```

## Notes

### T5 does not support bf16 autocast (cause unknown) and will produce black images

```diff
if is_final_validation or torch.backends.mps.is_available():
    autocast_ctx = nullcontext()
else:
    # T5 does not seem to support autocast (cause unknown)
+    autocast_ctx = nullcontext()
-    autocast_ctx = torch.autocast(accelerator.device.type)
```
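The selection logic after the fix can be sketched without torch at all, since both branches now return `nullcontext`; the function name below is illustrative, not from the script:

```python
from contextlib import nullcontext

def pick_autocast_ctx(is_final_validation: bool, mps_available: bool):
    """Mirror of the selection above with the fix applied: always fall back
    to nullcontext, because T5 produces black images under bf16 autocast."""
    if is_final_validation or mps_available:
        return nullcontext()
    # Fix: do NOT use torch.autocast(accelerator.device.type) here.
    return nullcontext()

ctx = pick_autocast_ctx(is_final_validation=False, mps_available=False)
with ctx:
    pass  # validation inference would run inside this context
```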

### To fix the following error

```bash
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and BFloat16
```

#### Change the following code in `diffusers/src/diffusers/pipelines/flux/pipeline_flux_controlnet.py` to ensure a consistent dtype

```diff
noise_pred = self.transformer(
hidden_states=latents,
# YiYi notes: divide it by 1000 for now because we scale it by 1000 in the transformer model (we should not keep it but I want to keep the inputs same for the model for testing)
timestep=timestep / 1000,
guidance=guidance,
pooled_projections=pooled_prompt_embeds,
encoder_hidden_states=prompt_embeds,
- controlnet_block_samples=controlnet_block_samples,
- controlnet_single_block_samples=controlnet_single_block_samples,
+ controlnet_block_samples=[sample.to(dtype=latents.dtype) for sample in controlnet_block_samples] if controlnet_block_samples is not None else None,
+ controlnet_single_block_samples=[sample.to(dtype=latents.dtype) for sample in controlnet_single_block_samples] if controlnet_single_block_samples is not None else None,
txt_ids=text_ids,
img_ids=latent_image_ids,
joint_attention_kwargs=self.joint_attention_kwargs,
return_dict=False,
)[0]
```
9 changes: 9 additions & 0 deletions examples/controlnet/requirements_flux.txt
accelerate>=0.16.0
torchvision
transformers>=4.25.1
ftfy
tensorboard
Jinja2
datasets
wandb
SentencePiece