Make tensors in ResNet contiguous for Hunyuan VAE #10309

a-r-r-o-w · 2024-12-19T22:13:20Z

The following error is raised on Hopper GPUs if the tensors are non-contiguous:

traceback

ERROR:finetrainers:An error occurred during training: Unsupported memory format for group normalization: ChannelsLast3d
ERROR:finetrainers:Traceback (most recent call last):
  File "/fsx/aryan/finetrainers/train.py", line 34, in main
    trainer.train()
  File "/fsx/aryan/finetrainers/finetrainers/trainer.py", line 375, in train
    latent_conditions = self.model_config["prepare_latents"](
  File "/fsx/aryan/finetrainers/finetrainers/hunyuan_video/hunyuan_video_lora.py", line 168, in prepare_latents
    latents = vae.encode(image_or_video).latent_dist.sample(generator=generator)
  File "/fsx/aryan/diffusers/src/diffusers/utils/accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "/fsx/aryan/diffusers/src/diffusers/models/autoencoders/autoencoder_kl_hunyuan_video.py", line 900, in encode
    h = self._encode(x)
  File "/fsx/aryan/diffusers/src/diffusers/models/autoencoders/autoencoder_kl_hunyuan_video.py", line 871, in _encode
    return self._temporal_tiled_encode(x)
  File "/fsx/aryan/diffusers/src/diffusers/models/autoencoders/autoencoder_kl_hunyuan_video.py", line 1098, in _temporal_tiled_encode
    tile = self.tiled_encode(tile)
  File "/fsx/aryan/diffusers/src/diffusers/models/autoencoders/autoencoder_kl_hunyuan_video.py", line 1007, in tiled_encode
    tile = self.encoder(tile)
  File "/fsx/aryan/nightly-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1740, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/fsx/aryan/nightly-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _call_impl
    return forward_call(*args, **kwargs)
  File "/fsx/aryan/diffusers/src/diffusers/models/autoencoders/autoencoder_kl_hunyuan_video.py", line 571, in forward
    hidden_states = self.mid_block(hidden_states)
  File "/fsx/aryan/nightly-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1740, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/fsx/aryan/nightly-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _call_impl
    return forward_call(*args, **kwargs)
  File "/fsx/aryan/diffusers/src/diffusers/models/autoencoders/autoencoder_kl_hunyuan_video.py", line 297, in forward
    hidden_states = resnet(hidden_states)
  File "/fsx/aryan/nightly-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1740, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/fsx/aryan/nightly-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _call_impl
    return forward_call(*args, **kwargs)
  File "/fsx/aryan/diffusers/src/diffusers/models/autoencoders/autoencoder_kl_hunyuan_video.py", line 173, in forward
    hidden_states = self.norm1(hidden_states)
  File "/fsx/aryan/nightly-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1740, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/fsx/aryan/nightly-venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _call_impl
    return forward_call(*args, **kwargs)
  File "/fsx/aryan/nightly-venv/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 313, in forward
    return F.group_norm(input, self.num_groups, self.weight, self.bias, self.eps)
  File "/fsx/aryan/nightly-venv/lib/python3.10/site-packages/torch/nn/functional.py", line 2965, in group_norm
    return torch.group_norm(
RuntimeError: Unsupported memory format for group normalization: ChannelsLast3d

This does not seem to happen on the other GPUs I tested (A100, L40, 4090), so I think it is Hopper-specific.
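As a rough illustration of what "making tensors contiguous" means here, the sketch below puts a 5D tensor into the `channels_last_3d` layout named in the error, then converts it back to the default layout before group normalization. The helper name `group_norm_contiguous` and the tensor shape are assumptions made for this example and are not taken from the diffusers source; the error itself may not reproduce on CPU or on non-Hopper GPUs, so the snippet only demonstrates the layout check and the conversion.

```python
import torch
import torch.nn as nn


def group_norm_contiguous(norm: nn.GroupNorm, hidden_states: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: force the default (NCDHW) memory layout before normalizing.
    # .contiguous() returns the tensor unchanged if it is already contiguous,
    # otherwise it returns a copy with default strides.
    hidden_states = hidden_states.contiguous()
    return norm(hidden_states)


# A video-like tensor (batch, channels, frames, height, width) stored in channels_last_3d layout.
x = torch.randn(1, 8, 2, 4, 4).to(memory_format=torch.channels_last_3d)
norm = nn.GroupNorm(num_groups=4, num_channels=8)

# Inspect the layout: default-contiguous vs. channels_last_3d-contiguous.
print(x.is_contiguous())
print(x.is_contiguous(memory_format=torch.channels_last_3d))

out = group_norm_contiguous(norm, x)
print(out.shape)
```

Calling `.contiguous()` costs a copy only when the input is not already in the default layout, so a call like this should be cheap on the GPUs where the problem does not occur.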

HuggingFaceDocBuilderDev · 2024-12-19T22:19:59Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

contiguous tensors in resnet Co-authored-by: YiYi Xu <[email protected]>

fc7ad39

a-r-r-o-w requested a review from yiyixuxu December 19, 2024 22:13

Merge branch 'main' into contiguous-hunyuan-resnet

76a829c

yiyixuxu approved these changes Dec 20, 2024

Merge branch 'main' into contiguous-hunyuan-resnet

6cce571

a-r-r-o-w merged commit 151b74c into main Dec 20, 2024
15 checks passed

a-r-r-o-w deleted the contiguous-hunyuan-resnet branch December 20, 2024 06:15

Foundsheep pushed a commit to Foundsheep/diffusers that referenced this pull request Dec 23, 2024

Make tensors in ResNet contiguous for Hunyuan VAE (huggingface#10309)

6ac6118

contiguous tensors in resnet Co-authored-by: YiYi Xu <[email protected]>

sayakpaul pushed a commit that referenced this pull request Dec 23, 2024

Make tensors in ResNet contiguous for Hunyuan VAE (#10309)

7ffa043

contiguous tensors in resnet Co-authored-by: YiYi Xu <[email protected]>
