Conversation

radames (Collaborator) commented Dec 26, 2023

Demo Notes

  • Frontend: Svelte
  • Backend: FastAPI / WebSocket / MJPEG stream
  • The wrapper is copied and modified here to accept a prompt for img2img and an engine_dir, letting me specify the directory and reuse the compiled model in the Docker environment.

All the StreamDiffusion code is in img2img.py. Please feel free to add any speedup suggestions.
I'm using t_index_list=[35, 45]. Is there a way to provide a strength on a 0-1 scale?

cumulo-autumn (Owner) commented Dec 26, 2023

We will add a noise scheduler function to support parametric controls such as noise strength 0.0-1.0 and num_denoising_steps 1-50. We will also review the PR soon!
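For illustration, a 0-1 strength could map onto t_index_list roughly as below. This is only a sketch assuming a 50-step scheduler; strength_to_t_index_list is a hypothetical helper, not the planned API:

```python
# Hypothetical helper (illustration only): map a diffusers-style
# img2img strength (0-1) onto a StreamDiffusion t_index_list.
# Higher strength -> start denoising earlier, replacing more of the input.

def strength_to_t_index_list(strength: float,
                             num_denoising_steps: int = 2,
                             num_train_steps: int = 50) -> list[int]:
    assert 0.0 < strength <= 1.0
    start = int(num_train_steps * (1.0 - strength))  # e.g. 0.3 -> 35
    stop = num_train_steps - 1                       # last usable index
    step = max((stop - start) // num_denoising_steps, 1)
    return [start + i * step for i in range(num_denoising_steps)]

print(strength_to_t_index_list(0.3))  # [35, 42], close to the [35, 45] used here
```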

GradientSurfer (Contributor) left a comment

Looks cool, nice work @radames! I've been tinkering with a strikingly similar set of changes, but using a canvas with drawing tools instead of webcam input.

If you and the team don't mind unsolicited feedback, I'll leave a review and share a few suggestions/thoughts that I hope you find helpful:

  1. Batch inference
    It appears image frames are processed one at a time in this demo, but batching multiple frames together for higher throughput (& FPS) should result in a smoother experience (at the expense of increased latency).

  2. Circular buffer & continuous streaming
    It looks like the server requests the client to send a frame - instead of this request/response cycle, the client could continuously stream image frames to the server which would maintain a circular buffer that can then be used to perform batch inference. Notably examples/screen/main.py uses that approach.

  3. No separate endpoint/stream for returning generated images
    The generated image could be returned to the client via the websocket connection, instead of via a separate API endpoint. This could be a minor code simplification, and notably would sidestep the linked chromium bug (so we could avoid sending frames twice to every browser that isn't firefox).

  4. Return raw pixels in RGBA format
    Generated images can be returned to the client in raw RGBA pixel format and then directly written to the canvas. This may be a relatively minor optimization, but it avoids the overhead of transforming to & from JPEG format and any associated lossy compression.

  5. Integrate wrapper.py modifications
    The modifications to accept a prompt for img2img and to support engine_dir are great and seem well contained, those ought to be integrated into the canonical wrapper.py so there is no unnecessary duplication of code or maintenance burden.

Perhaps these ideas could be addressed here or in future PRs (or not at all), either way I'd be happy to discuss or collaborate further on details - feel free to reach out.

radames (Collaborator, Author) commented Dec 27, 2023

Hi @GradientSurfer, thanks for the detailed response! I really appreciate the feedback, and I'm happy to address some of your points on this PR. If you're interested in collaborating, please send edits/commits. Do you have PR edit access?

  1. Batch inference
    It appears image frames are processed one at a time in this demo, but batching multiple frames together for higher throughput (& FPS) should result in a smoother experience (at the expense of increased latency).

Addressing this together with number 2 below: we can try a batching approach!

  2. Circular buffer & continuous streaming
    It looks like the server requests the client to send a frame - instead of this request/response cycle, the client could continuously stream image frames to the server which would maintain a circular buffer that can then be used to perform batch inference. Notably examples/screen/main.py uses that approach.

Ohh yes, that makes a lot of sense. In my original demo with LCM, I did use an async queue, but back then inference was slow and the result was a lagged video, so I decided to switch to the ping/pong approach.
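For reference, the server side of the circular-buffer idea could look roughly like the sketch below. This is illustrative only, assuming FastAPI websockets; run_batch_img2img is a hypothetical stand-in for a batched wrapper call, not the actual API. Returning results over the same websocket, as done here, would also cover point 3:

```python
# Sketch: the client streams frames continuously; the server keeps only
# the newest frames in a fixed-size deque (circular buffer) and runs
# batch inference over a snapshot of it.
import asyncio
from collections import deque

BATCH_SIZE = 4
frame_buffer: deque = deque(maxlen=BATCH_SIZE)  # old frames drop off automatically

def run_batch_img2img(frames: list[bytes]) -> list[bytes]:
    """Hypothetical stand-in for a batched img2img call."""
    raise NotImplementedError

async def receive_frames(websocket):
    # No request/response round trip: the client just keeps pushing frames.
    while True:
        frame_buffer.append(await websocket.receive_bytes())

async def inference_worker(websocket):
    while True:
        if len(frame_buffer) == BATCH_SIZE:
            batch = list(frame_buffer)  # snapshot of the newest frames
            images = await asyncio.to_thread(run_batch_img2img, batch)
            for image_bytes in images:
                await websocket.send_bytes(image_bytes)
        else:
            await asyncio.sleep(0.001)  # let the buffer fill
```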

  3. No separate endpoint/stream for returning generated images
    The generated image could be returned to the client via the websocket connection, instead of via a separate API endpoint. This could be a minor code simplification, and notably would sidestep the linked chromium bug (so we could avoid sending frames twice to every browser that isn't firefox).

Yes, that's a great point. The MJPEG stream seems a bit awkward and is buggy on Chrome. Ideally it would use WebRTC, but I was aiming for performance and simplicity: streaming JPEGs over an open socket looked faster to me than sending blobs over the websocket, which needs extra decoding to get the bytes into the <img>. However, this demo seems very fast and it does blob over websockets -> <img>: https://www.fal.ai/camera
  4. Return raw pixels in RGBA format
    Generated images can be returned to the client in raw RGBA pixel format and then directly written to the canvas. This may be a relatively minor optimization, but it avoids the overhead of transforming to & from JPEG format and any associated lossy compression.

Yes, you're right. However, I used the canvas to normalize the webcam image, cropping it to the desired size; this could also be done on the backend, whichever is faster.
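On the server side, returning raw RGBA could be as small as the sketch below. It assumes the pipeline output is a PIL image; the width/height header is just one illustrative way to frame the raw bytes for the client:

```python
# Sketch: send raw RGBA pixels instead of JPEG, so the client can blit
# them straight into a canvas (no lossy re-encode/decode round trip).
import struct
from PIL import Image

def to_rgba_message(image: Image.Image) -> bytes:
    rgba = image.convert("RGBA")
    # Prefix width/height so the client can rebuild an ImageData of the
    # right shape from the pixel bytes that follow.
    header = struct.pack("!II", rgba.width, rgba.height)
    return header + rgba.tobytes()

# e.g. await websocket.send_bytes(to_rgba_message(generated_image))
```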

  5. Integrate wrapper.py modifications
    The modifications to accept a prompt for img2img and to support engine_dir are great and seem well contained, those ought to be integrated into the canonical wrapper.py so there is no unnecessary duplication of code or maintenance burden.

Done in PR "wrapper.py: pass prompt to img2img and optional engine_dir arg" #66; once it's merged, I can update it here.


GradientSurfer (Contributor) commented

@radames I do not have PR edit access here, @cumulo-autumn perhaps you would consider granting collaborator access?

radames (Collaborator, Author) commented Dec 30, 2023

Hi @cumulo-autumn, I think it's good now. I've fixed a couple of uncaught exceptions. One important note: while the server and the client were designed to accept multiple queued connections, the wrapper and StreamDiffusionWrapper don't work well in that regard, i.e. the buffer behind stream.stream is shared, so if you open multiple browser tabs and switch the prompt and webcams, you'll notice images leaking across tabs. For comparison, when using a diffusers pipe(...) it's possible to queue calls, as long as inference is quick; example here.
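To illustrate the diffusers-style queuing I mean, a minimal sketch: a single pipe guarded by an asyncio.Lock, so concurrent sessions queue up instead of interleaving. This serializes short calls but would not fix StreamDiffusion's shared rolling buffer, which would need per-connection stream state:

```python
# Sketch: serialize access to one shared pipeline so concurrent
# websocket sessions don't interleave each other's state.
import asyncio

pipe_lock = asyncio.Lock()

async def predict(pipe, image, prompt: str):
    async with pipe_lock:  # one inference at a time; callers queue up
        # Run the blocking pipeline call off the event loop thread.
        return await asyncio.to_thread(pipe, prompt=prompt, image=image)
```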
PS: please pull and test again on Windows if you can.

cumulo-autumn (Owner) commented


Hi @GradientSurfer. Thank you for your valuable PR submissions in the past, and for your many meaningful suggestions this time as well! Regarding PR edit access: currently we are keeping it within a group of acquaintances, so please allow us to hold off on adding new collaborators for now. However, we are very much open to more discussions and PRs in the future, so we definitely want to continue those! (I apologize for the late response; it has been a busy end-of-year period. I really appreciate your prompt and valuable feedback on this PR.) We will revisit our policy on granting PR edit access in the future!

cumulo-autumn (Owner) commented


Hi @radames. Thank you for the update! It works perfectly in my environment too! I am going to merge it.

cumulo-autumn merged commit a3d01c4 into cumulo-autumn:main on Dec 30, 2023
radames deleted the dev/demo-img2img branch on Dec 31, 2023
openSourcerer9000 commented
@cumulo-autumn I'm not seeing a denoising_strength param in the latest repo. How do we set it? Thanks!
