Description
Is there an existing issue for this?
- I have searched the existing issues and checked the recent builds/commits
What happened?
During model loading, alphas_cumprod is downcast to half precision along with the rest of the model. However, samplers always use those values as full-precision floats. Not only does this save no memory, it also noticeably changes results due to the loss of precision, and it may in fact be one of the largest factors behind the differences between mixed-precision and full-precision inference. I consider this a bug, since it serves no clear optimization purpose and alters results in a way that is less aligned with model training. I have also tested a crude fix for the issue and found that it does not significantly impact generation speed.
Perhaps more importantly, this problem is an obstacle to implementing zero terminal SNR noise schedules like in #13052, since downcasting the values that we derive SNR from will result in sampler sigmas being rounded to infinity shortly after they pass 65000. It is possible to use a zero terminal SNR noise schedule that even plays nicely with k-diffusion samplers by setting the last alpha_bar value to a very small number like 4.8973451890853435e-08, but in fp16 format this number will be rounded down to zero, which will break things.
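To make the precision loss concrete, here is a minimal sketch that uses only the Python standard library to push individual alpha_bar values through an fp16 round trip and derive the k-diffusion style sigma, sqrt((1 - alpha_bar) / alpha_bar), from them. The helper names are hypothetical and not part of the webui code. Note that Python's struct module rounds to nearest and keeps subnormals, so the exact stored value for a tiny input may differ from a runtime that flushes subnormals to zero; in either case the original value is not preserved.

```python
import math
import struct


def fp16_round_trip(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value and return it as a float.

    Hypothetical helper for illustration only.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        # struct refuses to pack finite values beyond the fp16 range;
        # tensor libraries would produce +/-inf here instead.
        return math.copysign(math.inf, x)


def sigma_from_alpha_bar(alpha_bar: float) -> float:
    """Sigma as derived by k-diffusion: sqrt((1 - alpha_bar) / alpha_bar)."""
    if alpha_bar == 0.0:
        return math.inf  # zero alpha_bar means infinite noise level
    return math.sqrt((1.0 - alpha_bar) / alpha_bar)


if __name__ == "__main__":
    # The terminal value quoted above, plus an arbitrary example near 1.
    for alpha_bar in (4.8973451890853435e-08, 0.9991):
        stored = fp16_round_trip(alpha_bar)
        print(f"alpha_bar={alpha_bar!r} -> after fp16 round trip: {stored!r}")
        print(f"  sigma (full precision): {sigma_from_alpha_bar(alpha_bar)!r}")
        print(f"  sigma (from fp16 copy): {sigma_from_alpha_bar(stored)!r}")

    # A sigma beyond the fp16 range cannot be stored as a finite number.
    print(fp16_round_trip(70000.0))
```

Running the script prints the full-precision and round-tripped values side by side, which makes it easy to see how far the derived sigmas drift once the schedule has been stored in half precision.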
Steps to reproduce the problem
- Load a model
- model.half() gets called in modules/sd_models.py, causing model.alphas_cumprod to be downcast
- Observe the loss of precision
- Inspect the dtype of alphas_cumprod values (or values directly derived from them, like the sigma schedule) as used within a k-diffusion sampler
- Observe that they are being used in full-precision float format
What should have happened?
Downcasting the model to half precision should be implemented in a way that leaves alphas_cumprod alone. Since this fix will change the output for existing seeds, a compatibility option should be added that downcasts and then upcasts a copy of alphas_cumprod right before sampling in order to simulate the old behavior. Implementing that compatibility option would also provide all of the infrastructure needed to cleanly implement an option for a zero terminal SNR noise schedule, so #13052 may as well be implemented at the same time.
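The proposed compatibility option could look roughly like the sketch below. It is written against plain Python floats so that it runs without any dependencies; the function name and the emulate_fp16_rounding flag are hypothetical, not existing webui settings. With PyTorch tensors the same round trip would presumably be a .half().float() on a copy of the schedule.

```python
import struct
from typing import List, Sequence


def _to_fp16_and_back(x: float) -> float:
    """Round a value in [0, 1] to the nearest fp16 value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def alphas_cumprod_for_sampling(
    alphas_cumprod_fp32: Sequence[float],
    emulate_fp16_rounding: bool = False,
) -> List[float]:
    """Return the schedule a sampler should use.

    The stored full-precision schedule is never modified. When the
    (hypothetical) compatibility flag is set, a copy is rounded through
    fp16 to reproduce the values the old code path would have produced.
    """
    values = list(alphas_cumprod_fp32)  # work on a copy
    if emulate_fp16_rounding:
        values = [_to_fp16_and_back(v) for v in values]
    return values


if __name__ == "__main__":
    schedule = [0.9991, 0.5, 0.0047]  # made-up values for illustration
    print(alphas_cumprod_for_sampling(schedule))
    print(alphas_cumprod_for_sampling(schedule, emulate_fp16_rounding=True))
```

Because the rounding happens on a copy right before sampling, the same hook is also a natural place to substitute a modified schedule, such as one with a rescaled terminal value.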
Sysinfo
What browsers do you use to access the UI?
Mozilla Firefox
Console logs
This is an internal issue, logs won't help here.
Additional information
No response