
Conversation

@naykun
Contributor

@naykun naykun commented Dec 17, 2025

🚀 This PR introduces Qwen Image Layered—a groundbreaking vision model that dissects images into rich, structured layers (think foreground, background, objects, and more)!

By unlocking pixel-perfect, semantically aware decomposition, we’re not just enabling smarter image editing—we’re igniting a whole new playground for creators, developers, and AI artists. Imagine remixing scenes like multitrack audio, editing objects in isolation, or generating dynamic compositions with unprecedented control.

The future of image generation is layered, modular, and collaborative—and it starts right here. Let’s build it together! 🎨✨

cc @sayakpaul @yiyixuxu

Member

@sayakpaul sayakpaul left a comment


Thanks a lot for this PR! My comments should mostly be minor.

LMK if anything is unclear.

Let's also update with a test and the docs?

Comment on lines 926 to 932
image = torch.cat(image, dim=2) # b c f h w
image = image.permute(0, 2, 3, 4, 1) # b f h w c
image = (image * 0.5 + 0.5).clamp(0, 1).cpu().float().numpy()
image = (image * 255).round().astype("uint8")
images = []
for layers in image:
    images.append([Image.fromarray(layer) for layer in layers])
Member


Wondering if we can leverage self.image_processor.postprocess here? This is what we usually follow for the pipelines. If we need a different postprocess method, we could implement a separate ImageProcessor within the qwenimage module and use it.

Example:
https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/hunyuan_video1_5/image_processor.py

Collaborator


We don't need a separate image processor. I think we can probably use the existing one by looping over the layers and processing one layer at a time. If not, it's okay to keep the code here.
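The loop-over-layers idea could be sketched roughly like this. Note this is a hypothetical sketch: `simple_postprocess` stands in for `self.image_processor.postprocess` and returns uint8 numpy arrays where the real diffusers helper would return PIL images.

```python
import torch

# Hypothetical stand-in for self.image_processor.postprocess:
# takes a (n, c, h, w) tensor in [-1, 1], returns per-image uint8 arrays.
def simple_postprocess(batch):
    batch = (batch * 0.5 + 0.5).clamp(0, 1)
    arr = (batch.permute(0, 2, 3, 1).cpu().float().numpy() * 255).round().astype("uint8")
    return list(arr)

def decode_layers(image):
    # image: (b, c, f, h, w), where f is the number of layers per sample
    images = []
    for sample in image:                     # sample: (c, f, h, w)
        layers = sample.permute(1, 0, 2, 3)  # (f, c, h, w) acts as a batch
        images.append(simple_postprocess(layers))
    return images
```

Looping per sample keeps the layer dimension intact while still reusing one generic batch postprocess call, which is the spirit of the suggestion above.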

Contributor Author


Thanks for the advice, I've got a better implementation in the new commit.

@sayakpaul sayakpaul requested review from DN6 and yiyixuxu December 17, 2025 07:45
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@yiyixuxu yiyixuxu left a comment


Thanks! Looks really good to me, and we can merge this soon!

@naykun naykun requested review from sayakpaul and yiyixuxu December 17, 2025 10:15
Member

@sayakpaul sayakpaul left a comment


LGTM on my end. Maybe we could just update https://huggingface.co/docs/diffusers/main/en/api/pipelines/qwenimage before we merge the PR?

I will try to add tests in a follow-up and ask for your reviews.

Comment on lines +152 to +153
if use_additional_t_cond:
    self.addition_t_embedding = nn.Embedding(2, embedding_dim)
Member


Much better.
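For context on the hunk above, a 2-entry embedding like this is typically indexed by a per-sample 0/1 flag and added to the timestep embedding. The sketch below illustrates that pattern only; the module and argument names (`TimestepConditioner`, `cond_flag`) are illustrative, not the actual Qwen Image Layered code.

```python
import torch
import torch.nn as nn

class TimestepConditioner(nn.Module):
    # Illustrative wrapper: a binary condition flag selects one of two
    # learned vectors, which is added to the timestep embedding.
    def __init__(self, embedding_dim, use_additional_t_cond=True):
        super().__init__()
        self.use_additional_t_cond = use_additional_t_cond
        if use_additional_t_cond:
            self.addition_t_embedding = nn.Embedding(2, embedding_dim)

    def forward(self, t_emb, cond_flag=None):
        # t_emb: (b, d) timestep embedding; cond_flag: (b,) long tensor of 0/1
        if self.use_additional_t_cond and cond_flag is not None:
            t_emb = t_emb + self.addition_t_embedding(cond_flag)
        return t_emb
```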

Comment on lines +870 to +877
image = self.vae.decode(latents, return_dict=False)[0] # (b f) c 1 h w

image = image.squeeze(2)

image = self.image_processor.postprocess(image, output_type=output_type)
images = []
for bidx in range(b):
    images.append(image[bidx * f : (bidx + 1) * f])
Member


This is clean! Thanks for this :)

@sayakpaul
Member

@bot /style

@github-actions
Contributor

github-actions bot commented Dec 17, 2025

Style bot fixed some files and pushed the changes.

@sayakpaul
Member

@naykun I opened naykun#1 to help fix the consistency problems. Could you please check and merge?

@sayakpaul
Member

sayakpaul commented Dec 17, 2025

Failing tests are unrelated. Looking forward to seeing Qwen Image Layered become a grand success! Also, hopefully this kind of starts a new paradigm in image generation!

@sayakpaul sayakpaul merged commit f9c1e61 into huggingface:main Dec 17, 2025
10 of 11 checks passed
@gluttony-10

@naykun Although image is an optional input in the pipeline parameters, the pipeline cannot run successfully without one. Looking forward to a fix.

Best regards!
