
Conversation


@pftq pftq commented Apr 23, 2025

Either (1) merge chaojie's PR first and then this one, or (2) merge this one only, since it already includes chaojie's changes as a PR-merge into this PR. What won't work is merging this one first and then chaojie's afterwards.

I updated the generate_video files to support the following:

  • Added seed synchronization code to allow a random seed with multi-GPU (Multi-GPU Results in Static Noise After Initial Frames #24); see the first sketch after this list.
  • Reduced the 20-min+ multi-GPU load time to ~8 min by fixing contention (all GPUs loading models at once); see the second sketch after this list. This indirectly also resolved the CPU RAM spike during multi-GPU (>200 GB on 4 GPUs) (Multi-GPU Initialization Takes 20 Min on 8 GPUs (Fix Provided) #28).
  • Fixed the CuSolver error that occasionally comes up in multi-GPU by presetting the linear algebra library, also shown in the second sketch below (Occasional CuSolver Error on Multi-GPU - Bug and Fix #37).
  • Removed a duplicate model-loading line in the I2V pipeline.
  • Added a batch_size parameter to allow generating multiple videos without reloading the model; reloading takes about 20 min on multi-GPU, so this saves a lot of time.
  • Added a preserve_image_aspect_ratio parameter to preserve the original image aspect ratio.
  • Fixed the DF script not resize-cropping the input image (the I2V script does this, but the DF script was missing the code).
  • Exposed negative_prompt so it can be changed/overridden.
  • Friendlier output filenames, with the date, seed, cfg, steps, and other details up front.
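
For reference, the seed synchronization boils down to something like this (a minimal sketch; the function name and exact placement are illustrative, not necessarily what this PR does):

import torch
import torch.distributed as dist

def synchronize_seed(seed=None):
    # Rank 0 draws (or keeps) the seed; every other rank receives the same
    # value, so all GPUs start from identical noise instead of diverging
    # into static after the first frames.
    if seed is None:
        seed = int(torch.randint(0, 2**31 - 1, (1,)).item())
    if dist.is_available() and dist.is_initialized():
        seed_tensor = torch.tensor([seed], dtype=torch.int64, device="cuda")
        dist.broadcast(seed_tensor, src=0)
        seed = int(seed_tensor.item())
    torch.manual_seed(seed)
    return seed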
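
Similarly, the load-contention fix and the linear-algebra preset amount to roughly the following (an illustrative sketch with hypothetical names; which backend is actually pinned here is an assumption):

import torch
import torch.distributed as dist

# Pin the linear-algebra backend up front so PyTorch doesn't switch into
# CuSolver mid-run (assumption: "magma" as the pinned choice; see #37).
torch.backends.cuda.preferred_linalg_library("magma")

def load_models_staggered(load_fn):
    # Let ranks load one at a time instead of all GPUs hitting the disk and
    # CPU RAM at once, which is what stretched startup past 20 minutes.
    rank = dist.get_rank() if dist.is_initialized() else 0
    world_size = dist.get_world_size() if dist.is_initialized() else 1
    model = None
    for turn in range(world_size):
        if rank == turn:
            model = load_fn()  # e.g. the pipeline constructor for this rank
        if dist.is_initialized():
            dist.barrier()  # next rank waits until this one finishes
    return model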

This also includes and cleanly integrates chaojie's fork (#12):

  • Prompt travel: pass multiple text strings to the --prompt parameter to guide the video differently for each chunk of base_num_frames (see the sketch after this list).
  • Video input via the --video parameter, allowing you to continue/extend from an existing video.
  • Partially complete videos are written out as each chunk of base_num_frames finishes. In combination with the --video parameter, this lets you effectively resume from a previous render, or abort mid-render if the video takes a turn you don't like. Extremely useful for saving time and "watching" as the render progresses rather than committing the full time up front.
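
As a rough illustration of how a list of --prompt strings maps onto chunks (a sketch with a hypothetical helper, not the fork's exact logic):

def prompt_for_chunk(prompts, chunk_index):
    # Each chunk of base_num_frames advances to the next prompt string;
    # once the list runs out, the last prompt keeps applying.
    return prompts[min(chunk_index, len(prompts) - 1)]

prompts = ["The first thing he does",
           "The second thing he does.",
           "The third thing he does."]
print(prompt_for_chunk(prompts, 0))  # first chunk uses the first prompt
print(prompt_for_chunk(prompts, 4))  # later chunks reuse the last prompt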

Let me know if there is anything you want changed for the PR. I still think you have the best open-source model so far; it's just really hard for an average user to get good results without a lot of debugging, so I'm happy to help out.

Multi-GPU with video input and prompt travel, batch of 10, preserving aspect ratio.
Change --video "video.mp4" to --image "image.jpg" if you want to load a starting image instead.

model_id=Skywork/SkyReels-V2-DF-14B-540P
gpu_count=2
torchrun --nproc_per_node=${gpu_count} generate_video_df.py \
  --model_id ${model_id} \
  --resolution 540P \
  --ar_step 0 \
  --base_num_frames 97 \
  --num_frames 257 \
  --overlap_history 17 \
  --inference_steps 50 \
  --guidance_scale 6 \
  --batch_size 10 \
  --preserve_image_aspect_ratio \
  --video "video.mp4" \
  --prompt "The first thing he does" \
  "The second thing he does." \
  "The third thing he does." \
  --negative_prompt "Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards" \
  --addnoise_condition 20 \
  --use_usp \
  --offload

Single GPU with video input and prompt travel, batch of 10, preserving aspect ratio.
Change --video "video.mp4" to --image "image.jpg" if you want to load a starting image instead.

model_id=Skywork/SkyReels-V2-DF-14B-540P
python3 generate_video_df.py \
  --model_id ${model_id} \
  --resolution 540P \
  --ar_step 0 \
  --base_num_frames 97 \
  --num_frames 257 \
  --overlap_history 17 \
  --inference_steps 50 \
  --guidance_scale 6 \
  --batch_size 10 \
  --preserve_image_aspect_ratio \
  --video "video.mp4" \
  --prompt "The first thing he does" \
  "The second thing he does." \
  "The third thing he does." \
  --negative_prompt "Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards" \
  --addnoise_condition 20 \
  --offload

chaojie and others added 29 commits April 22, 2025 03:44
python3 generate_video_df.py   --model_id ${model_id}   --resolution 540P   --ar_step 0   --base_num_frames 97   --num_frames 177   --overlap_history 17    --addnoise_condition 20   --offload --prompt 'A woman in a leather jacket and sunglasses riding a vintage motorcycle through a desert highway at sunset, her hair blowing wildly in the wind as the motorcycle kicks up dust, with the golden sun casting long shadows across the barren landscape.' 'A woman flies into space'
Added batch mode, added option to keep original aspect ratio, synchronized seeds on multi-gpu.
…nized randomized seeds on multi-gpu, exposed negative_prompt option.
@pftq pftq changed the title Batch Mode + Maintain Aspect Ratio + Multi-GPU Random Seed + Fixed Multi-GPU CuSolver Error + Fixed 20-min Load Time Batch Mode + Maintain Aspect Ratio + Multi-GPU Random Seed + Fixed Multi-GPU CuSolver Error + Fixed 20-min Load Time + Video Input May 3, 2025
@pftq pftq mentioned this pull request May 6, 2025

innokria commented May 6, 2025

I have the latest code; however, I still can't generate a video with multiple prompts in sequence:
https://huggingface.co/spaces/rahul7star/Skyreel-V2-Enchance

Are there any extra params I need to pass?


pftq commented May 6, 2025

What is your command-line prompt, and what is the error when you try to run?


innokria commented May 6, 2025

> What is your command-line prompt, and what is the error when you try to run?

No error, just that the video is not rendering the second part.

Hmm, I am running it via Gradio:
https://huggingface.co/spaces/rahul7star/Skyreel-V2-Enchance/blob/main/app.py

'A woman in a leather jacket and sunglasses riding a vintage motorcycle through a desert highway at sunset, her hair blowing wildly in the wind as the motorcycle kicks up dust, with the golden sun casting long shadows across the barren landscape.' 'A woman flies into space'

Let me run generate_video_df.py directly and see, rather than going through my custom Gradio wrapper.


pftq commented May 6, 2025

I'm not familiar with Gradio. Have you tried running it via the command line like the example in the readme? A lot of this fork's changes are in the generate py file, so you'd have to replicate those in your custom code.

What are your num_frames and base_num_frames? It'd be good to know your full parameter list. Each additional prompt is assigned to a chunk, with the number of chunks roughly num_frames / base_num_frames, so you need to make sure you have enough chunks to reach the next prompt.

If you're running the fork directly via the command line, you can also check the debug output, which says which prompt/chunk it is currently on.
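
For a rough sense of the numbers, here is an estimate of the chunk count using the example settings from this PR (an approximation that assumes each chunk after the first reuses overlap_history frames; the script's debug output is the authoritative count):

import math

num_frames = 257        # total frames requested
base_num_frames = 97    # frames generated per chunk
overlap_history = 17    # frames carried over between chunks

# First chunk, plus however many more chunks are needed to cover the rest at
# (base_num_frames - overlap_history) new frames each.
chunks = 1 + math.ceil((num_frames - base_num_frames) / (base_num_frames - overlap_history))
print(chunks)  # 3, so there is room for up to three prompts in the list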


innokria commented May 6, 2025

> I'm not familiar with Gradio. Have you tried running it via the command line like the example in the readme? A lot of this fork's changes are in the generate py file, so you'd have to replicate those in your custom code.
>
> What are your num_frames and base_num_frames? It'd be good to know your full parameter list. Each additional prompt is assigned to a chunk, with the number of chunks roughly num_frames / base_num_frames, so you need to make sure you have enough chunks to reach the next prompt.
>
> If you're running the fork directly via the command line, you can also check the debug output, which says which prompt/chunk it is currently on.

Alright, going to use exactly your settings. My video length was 5 sec, so probably too few frames; I will set the video to 10 sec. Thanks for your help, I will keep messing with it :)


qiwang1996 commented May 6, 2025

Thanks for your good work! I tested your code on 4x 4090 (48 GB) and set the CPU offload option. However, there is still a CPU RAM spike during multi-GPU (>200 GB on 4 GPUs). I wonder if I'm doing something wrong.

model_id="./SkyReels-V2-DF-14B-540P"
gpu_count=4
torchrun --nproc_per_node=${gpu_count} generate_video_df.py \
  --model_id ${model_id} \
  --resolution 540P \
  --ar_step 0 \
  --base_num_frames 97 \
  --num_frames 289 \
  --overlap_history 17 \
  --inference_steps 50 \
  --guidance_scale 6 \
  --batch_size 1 \
  --preserve_image_aspect_ratio \
  --prompt  "A graceful white swan with a curved neck and delicate feathers swimming in a serene lake at dawn, its reflection perfectly mirrored in the still water as mist rises from the surface, with the swan occasionally dipping its head into the water to feed." \
  --negative_prompt "Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards" \
  --addnoise_condition 20 \
  --use_ret_steps \
  --teacache_thresh 0.0 \
  --use_usp \
  --offload

My script is above.


pftq commented May 6, 2025

Thanks. It's hard to be sure on that issue since I didn't directly set out to solve it. I just know my RunPod instance with 4x A40s was crashing previously, and afterwards it did not. I wonder if it's partly because you still have RAM to spare, so maybe the system is not as stringent about clearing the memory.

@qiwang1996

@pftq Thanks for your fast reply. I guess the difference in RAM management strategy between RunPod and AutoDL, where I run my code, caused it.


innokria commented May 7, 2025

> Thanks for your good work! I tested your code on 4x 4090 (48 GB) and set the CPU offload option. However, there is still a CPU RAM spike during multi-GPU (>200 GB on 4 GPUs). I wonder if I'm doing something wrong.
>
> My script is above.

I even got: "Script timed out after 15 minutes. Try reducing frame count or prompt complexity."


pftq commented May 8, 2025

That's not an error message from anywhere in this code repo. If you are embedding this in a custom script or environment, you would need to look there for the issue. Additionally, that is the multi-GPU code, which is quite complex, so I don't recommend embedding it in another wrapper.
