
Conversation

wwelder commented Mar 21, 2024

Description

  • fixed a bug that could cause sd_model not to be sent to the CPU as expected

Screenshots/videos:

Checklist:

wwelder requested a review from AUTOMATIC1111 as a code owner March 21, 2024 22:44
AUTOMATIC1111 (Owner) commented

How to reproduce?

wwelder (Author) commented Mar 24, 2024

Let me know if I am wrong here, but the original code looks like a typo to me: we send sd_model (the currently loaded model) to the CPU multiple times in this for loop.
This PR applies the sd_checkpoints_keep_in_cpu setting so that all models loaded on the GPU are sent to the CPU:

```python
def reuse_model_from_already_loaded(sd_model, checkpoint_info, timer):
    already_loaded = None
    for i in reversed(range(len(model_data.loaded_sd_models))):
        loaded_model = model_data.loaded_sd_models[i]
        ...
        if shared.opts.sd_checkpoints_keep_in_cpu:
            send_model_to_cpu(sd_model)
            timer.record("send model to cpu")
```
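
For illustration only (a sketch, not the actual diff of this PR, which also touched the function's arguments as discussed below), the per-model behaviour described above would mean moving the model the loop is visiting rather than the loop-invariant sd_model:

```python
# Sketch only (names taken from the quoted snippet): move the model the loop is
# visiting to the CPU, instead of moving the current sd_model on every iteration.
for i in reversed(range(len(model_data.loaded_sd_models))):
    loaded_model = model_data.loaded_sd_models[i]
    ...
    if shared.opts.sd_checkpoints_keep_in_cpu:
        send_model_to_cpu(loaded_model)
        timer.record("send model to cpu")
```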

AUTOMATIC1111 (Owner) commented

Well, while I agree that this shouldn't be done in a loop, the original logic was to send the current model, not all of them. I guess your situation is that we have like 5 models loaded, then we enable the sd_checkpoints_keep_in_cpu setting, and after that only one of them is sent to CPU?

In any case I'd rather not change the arguments of the function because there may be someone using it, and it kind of looks like things will break if send_model_to_cpu is called on the model that send_model_to_trash has been called on.

wwelder (Author) commented Mar 25, 2024

> I guess your situation is that we have like 5 models loaded, then we enable the sd_checkpoints_keep_in_cpu setting, and after that only one of them is sent to CPU?

Yeah, this is what I have seen, but it doesn't cause OOM, and all loaded models gradually end up on the CPU after a few more model switches, so I think it is fine for most people.

> In any case I'd rather not change the arguments of the function because there may be someone using it, and it kind of looks like things will break if send_model_to_cpu is called on the model that send_model_to_trash has been called on.

Good point. I reverted the argument change and moved send_model_to_cpu out of the loop.
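
Roughly, a minimal sketch of that shape (not the exact diff; names taken from the quoted snippet above):

```python
# Sketch only: keep the loop for its bookkeeping, and hoist the CPU transfer out
# of it so the current sd_model is moved at most once per call.
for i in reversed(range(len(model_data.loaded_sd_models))):
    loaded_model = model_data.loaded_sd_models[i]
    ...

if shared.opts.sd_checkpoints_keep_in_cpu:
    send_model_to_cpu(sd_model)
    timer.record("send model to cpu")
```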

AUTOMATIC1111 (Owner) commented

This presents a problem of needlessly sending the model to CPU when it's the one we want to use.

wwelder (Author) commented Mar 25, 2024

Updated again; let me know if it meets the requirement now.
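
For context, one plausible shape of that update (a sketch under the assumption that the guard simply skips the transfer when the model being reused is the current one; the merged commit may differ):

```python
# Sketch only: after the loop, skip the CPU transfer when the currently loaded
# sd_model is exactly the model we are about to reuse.
if shared.opts.sd_checkpoints_keep_in_cpu and already_loaded is not sd_model:
    send_model_to_cpu(sd_model)
    timer.record("send model to cpu")
```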

AUTOMATIC1111 merged commit 8bebfde into AUTOMATIC1111:dev Mar 30, 2024