Added mixed_precision_dtype argument to HFLM to enable autocasting #3138
This PR adds a new `mixed_precision_dtype: Optional[Union[str, torch.dtype]]` parameter to `HFLM`. When specified (either as a string or a `torch.dtype`), all internal calls to `self.model` occur inside `torch.autocast` regions for the model's device and the given dtype. The wrapping is applied in `self._model_call` and `self._model_generate`, which affects `loglikelihood`, `loglikelihood_rolling`, `generate_until`, and automatic batch size detection. Passing `None` preserves the original behaviour (the `torch.autocast` context manager becomes a no-op).

This feature behaves slightly differently from explicitly loading the model with a specified `dtype`, as it relies on torch's internal op selection and autocasting behaviour. It is primarily useful when working with multi-dtype models (e.g. VLMs, PEFT models) in the CLI, or when invoking the harness from the Python API during training, where a model may still be in full precision.

### Usage
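Conceptually, the new argument turns each model call into an autocast region. The following is a simplified sketch of that behaviour, not the actual `HFLM` internals; `autocast_context` is a hypothetical helper:

```python
# Sketch: how a mixed_precision_dtype argument can translate into a
# torch.autocast context (simplified; not the exact HFLM implementation).
from contextlib import nullcontext
from typing import Optional, Union

import torch


def autocast_context(
    device_type: str,
    mixed_precision_dtype: Optional[Union[str, torch.dtype]],
):
    """Return a torch.autocast context, or a no-op when dtype is None."""
    if mixed_precision_dtype is None:
        # None -> original behaviour: no autocasting at all.
        return nullcontext()
    if isinstance(mixed_precision_dtype, str):
        # Accept "float16" / "bfloat16" strings as well as torch.dtype.
        mixed_precision_dtype = getattr(torch, mixed_precision_dtype)
    return torch.autocast(device_type=device_type, dtype=mixed_precision_dtype)


# A full-precision layer: under autocast, its matmul runs in bfloat16
# even though the weights remain float32.
model = torch.nn.Linear(4, 4)
x = torch.randn(2, 4)

with autocast_context("cpu", "bfloat16"):
    out = model(x)
print(out.dtype)  # torch.bfloat16
```

Note that the weights themselves are untouched; autocast only selects lower-precision kernels per op, which is why this differs from loading the model with an explicit `dtype`.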
Although any `torch.dtype` is accepted, in practice `"float16"` and `"bfloat16"` are the only sensible choices when using this feature.

### Docs

No documentation has been added, as the `HFLM` class does not have dedicated documentation, and the argument name and typing are self-explanatory.
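For reference, a hypothetical CLI invocation; this assumes the new argument is plumbed through `--model_args` like other `HFLM` keyword arguments, and the model and task names are placeholders:

```shell
lm_eval --model hf \
    --model_args pretrained=gpt2,mixed_precision_dtype=bfloat16 \
    --tasks lambada_openai
```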