Demo img2img webcam browser #57

Conversation
We will make a noise scheduler function and expose parameters such as noise strength (0.0-1.0) and num_denoising_steps (1-50). We will also review the PR soon!
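A minimal sketch of what such a parametric scheduler could look like, assuming strength selects how early in a 50-step schedule denoising starts; the function name and the exact mapping are illustrative, not from the repo:

```python
def make_t_index_list(strength: float, num_denoising_steps: int,
                      total_steps: int = 50) -> list[int]:
    """Map a 0-1 strength and a step count onto scheduler timestep indices.

    Higher strength starts denoising earlier in the schedule (more noise),
    so the output departs further from the input frame.
    """
    assert 0.0 <= strength <= 1.0
    assert 1 <= num_denoising_steps <= total_steps
    # strength=1.0 starts at index 0 (full denoising); strength->0.0
    # starts near the end of the schedule (almost no change).
    start = round((1.0 - strength) * (total_steps - num_denoising_steps))
    span = total_steps - start
    # Spread the requested steps evenly over the remaining indices.
    return [start + round(i * span / num_denoising_steps)
            for i in range(num_denoising_steps)]

# make_t_index_list(strength=0.3, num_denoising_steps=2) -> [34, 42],
# in the same ballpark as the t_index_list=[35, 45] used in this demo
```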
Looks cool, nice work @radames! I've been tinkering with a strikingly similar set of changes, but using a canvas with drawing tools instead of webcam input.
If you and the team don't mind unsolicited feedback, I'll leave a review and share a few suggestions/thoughts that I hope you find helpful:
- **Batch inference**: It appears image frames are processed one at a time in this demo, but batching multiple frames together for higher throughput (and FPS) should result in a smoother experience, at the expense of increased latency (see the sketch after this list).
- **Circular buffer & continuous streaming**: It looks like the server requests the client to send a frame. Instead of this request/response cycle, the client could continuously stream image frames to the server, which would maintain a circular buffer that can then be used to perform batch inference. Notably, `examples/screen/main.py` uses that approach.
- **No separate endpoint/stream for returning generated images**: The generated image could be returned to the client via the websocket connection, instead of via a separate API endpoint. This would be a minor code simplification, and notably would sidestep the linked Chromium bug (so we could avoid sending frames twice to every browser that isn't Firefox).
- **Return raw pixels in RGBA format**: Generated images can be returned to the client in raw RGBA pixel format and written directly to the canvas. This may be a relatively minor optimization, but it avoids the overhead of transforming to and from JPEG format and any associated lossy compression.
- **Integrate `wrapper.py` modifications**: The modifications to accept a prompt for img2img and to support `engine_dir` are great and seem well contained; they ought to be integrated into the canonical `wrapper.py` so there is no unnecessary duplication of code or maintenance burden.
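A minimal sketch of the first, second, and fourth points combined, assuming a FastAPI websocket server; the endpoint name, buffer size, and the two placeholder functions are illustrative assumptions, not the actual demo code:

```python
import asyncio
from collections import deque

import numpy as np
from fastapi import FastAPI, WebSocket

app = FastAPI()
frames: deque = deque(maxlen=4)  # circular buffer: newest frames evict oldest

def decode_frame(data: bytes) -> np.ndarray:
    """Placeholder: decode an incoming frame to an HxWx3 uint8 array.

    Here the client is assumed to send raw 512x512 RGB bytes; a real demo
    might decode JPEG instead.
    """
    return np.frombuffer(data, dtype=np.uint8).reshape(512, 512, 3)

def run_batch_inference(batch: np.ndarray) -> np.ndarray:
    """Placeholder for the batched diffusion call; returns (N, H, W, 3) uint8."""
    return batch

async def receive_frames(ws: WebSocket) -> None:
    """The client streams frames continuously; no per-frame request/response."""
    while True:
        data = await ws.receive_bytes()
        frames.append(decode_frame(data))

@app.websocket("/ws")  # hypothetical endpoint name
async def ws_endpoint(ws: WebSocket) -> None:
    await ws.accept()
    recv = asyncio.create_task(receive_frames(ws))
    try:
        while True:
            if len(frames) == frames.maxlen:
                batch = np.stack(list(frames))  # (N, H, W, 3) batched inference
                outputs = run_batch_inference(batch)
                for img in outputs:
                    alpha = np.full((*img.shape[:2], 1), 255, dtype=np.uint8)
                    rgba = np.concatenate([img, alpha], axis=-1)
                    # raw RGBA pixels back over the same websocket:
                    # no separate endpoint, no JPEG round-trip
                    await ws.send_bytes(rgba.tobytes())
            await asyncio.sleep(0.001)  # yield control to the receive task
    finally:
        recv.cancel()
```

On the client side, the returned bytes can be wrapped in a `Uint8ClampedArray`/`ImageData` and drawn with `ctx.putImageData`, which is what makes the raw RGBA format convenient.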
Perhaps these ideas could be addressed here or in future PRs (or not at all); either way, I'd be happy to discuss or collaborate further on the details - feel free to reach out.
Hi @GradientSurfer, thanks for the detailed response! I really appreciate the feedback. I'm happy to address some of your points in this PR, and if you're interested in collaborating, please send edits or commits. Do you have PR edit access?
Addressing number 2 here, we can try a batching approach!
Ohh yes, that makes a lot of sense. In my original demo with LCM, I did use an async
Yes, you're right. However, I used the canvas to normalize the webcam image, cropping it to the desired size; this could also be done on the backend, whichever is faster.
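For reference, a sketch of doing that normalization server-side with Pillow; the function name and the 512-pixel target size are illustrative assumptions:

```python
from PIL import Image

def normalize_frame(img: Image.Image, size: int = 512) -> Image.Image:
    """Center-crop to a square, then resize to the model's input size."""
    w, h = img.size
    side = min(w, h)
    left, top = (w - side) // 2, (h - side) // 2
    cropped = img.crop((left, top, left + side, top + side))
    return cropped.resize((size, size), Image.LANCZOS)
```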
@radames I do not have PR edit access here. @cumulo-autumn, perhaps you would consider granting collaborator access?
Hi @cumulo-autumn, I think it's good now. I've fixed a couple of uncaught exceptions. One important note: while the server and the client were designed to accept multiple queued connections, the wrapper and
Hi @GradientSurfer. Thank you for your valuable PR submissions in the past, and for your many meaningful suggestions this time as well! Regarding PR edit access: currently we are keeping it within a group of acquaintances, so please allow us to hold off on adding new collaborators for now. However, we are very much open to more discussions and PRs in the future, so we definitely want to continue those! (I apologize for the late response; it has been a busy end-of-year period. I also really appreciate your prompt and valuable feedback on this PR.) We will reconsider our policy on granting PR edit access in the future!
Hi @radames. Thank you for the update! It works perfectly in my environment too! I am going to merge it.
@cumulo-autumn I'm not seeing a `denoising_strength` param in the latest repo. How do we set it? Thanks,
Demo Notes
This demo adds support for img2img and `engine_dir`, allowing me to specify the directory and reuse the compiled model in the Docker environment. All the StreamDiffusion code is in `img2img.py`. Please feel free to add any speedup suggestions.
I'm using `t_index_list=[35, 45]`. Is there a way to provide a `strength` on a 0-1 scale?
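For context, a hypothetical sketch of how the demo's wrapper might be instantiated; `engine_dir` and `t_index_list` come from the discussion above, while the import path, model id, and remaining arguments are assumptions rather than the demo's actual code:

```python
from utils.wrapper import StreamDiffusionWrapper  # path assumed from this PR

stream = StreamDiffusionWrapper(
    model_id_or_path="KBlueLeaf/kohaku-v2.1",  # placeholder model id
    t_index_list=[35, 45],  # later indices = less noise = closer to the input
    mode="img2img",
    engine_dir="engines",   # reuse compiled TensorRT engines across Docker runs
)
```

A `strength` on a 0-1 scale would presumably be a mapping onto `t_index_list`, along the lines of the scheduler sketch earlier in this thread.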