On error, the framework reloads the model even if it was already loaded

In `_run_single_test`, the model is loaded on every call:
```
        model, tokenizer = load_model(
            self.model_path,
            num_gpus=self.num_gpus,
            device=self.device,
            debug=self.debug,
        )
```

`_run_single_test` is called from a retry loop in `generation_results`:

```
        for attempt in range(max_retries):
            try:
                state = self._run_single_test()
                if state:
                    print(f"Test function successful on attempt {attempt + 1}")
                    return state
            except Exception as e:
                print(f"Test function failed on attempt {attempt + 1}")
                import traceback
                traceback.print_exc()
                print(f"Retrying in {retry_interval} seconds...")
                time.sleep(retry_interval)
```

So if a test fails after the model has been loaded, the next attempt calls `load_model` again without freeing the memory held by the first copy, which often causes OOM errors. The model should be stored in a class property so that later attempts can reuse it if it was already loaded onto the GPU.
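A minimal sketch of what I mean, not a patch against the actual code. The class name `Runner`, the helper `_get_model`, the `_attempts` counter, and the stubbed `load_model` are all made up for illustration; only `_run_single_test`, `generation_results`, and the `load_model` arguments come from the snippets above. The stub counts its calls so the caching is visible without a GPU.

```python
import time

LOAD_CALLS = []  # records each load so the caching is observable in this sketch


def load_model(model_path, num_gpus=1, device="cuda", debug=False):
    """Stand-in for the framework's load_model (assumption: returns a pair)."""
    LOAD_CALLS.append(model_path)
    return object(), object()


class Runner:  # hypothetical name for the class that owns _run_single_test
    def __init__(self, model_path, num_gpus=1, device="cuda", debug=False):
        self.model_path = model_path
        self.num_gpus = num_gpus
        self.device = device
        self.debug = debug
        self._model = None      # cached across retries
        self._tokenizer = None
        self._attempts = 0      # only used here to simulate one failure

    def _get_model(self):
        """Load the model on first use and reuse it afterwards."""
        if self._model is None:
            self._model, self._tokenizer = load_model(
                self.model_path,
                num_gpus=self.num_gpus,
                device=self.device,
                debug=self.debug,
            )
        return self._model, self._tokenizer

    def _run_single_test(self):
        model, tokenizer = self._get_model()
        self._attempts += 1
        if self._attempts == 1:
            # Simulate an error that happens after the model is loaded.
            raise RuntimeError("simulated failure after model load")
        return {"ok": True}

    def generation_results(self, max_retries=3, retry_interval=0):
        for attempt in range(max_retries):
            try:
                state = self._run_single_test()
                if state:
                    return state
            except Exception:
                print(f"Test function failed on attempt {attempt + 1}")
                time.sleep(retry_interval)
        return None


runner = Runner("dummy-path", device="cpu")
result = runner.generation_results()
```

With this shape, the first attempt fails after loading, the second attempt reuses the cached model, and `load_model` runs only once. If a failure could leave the model in a bad state, the `except` branch could instead drop the cached reference and free GPU memory before the next load, but that is a separate design choice.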

On error, the framework reloads the model even if it was already loaded #47
