
Disable kernels during calibration (and tracing) #1454


Merged: 5 commits merged into main from kylesayrs/disable-kernels on May 22, 2025


kylesayrs
Collaborator

Purpose

  • Guarantee that module hooks trigger by disabling kernel acceleration

Background

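  • As of transformers>=4.52, model forward functions may be overwritten with hub kernels
  • Kernel execution can be disabled through the model config (source: https://github.com/huggingface/transformers/blob/main/src/transformers/integrations/hub_kernels.py#L84)
  • HF appears to intend this disabling feature as the supported way to keep regular execution available. This gives us some confidence that they will not override the cls init function (https://github.com/huggingface/transformers/blob/main/src/transformers/integrations/hub_kernels.py#L78) in a way that prevents execution with the standard forward definition
  • This only affects users who have the HF kernels library installed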
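
To illustrate why a replaced forward function can matter for calibration, here is a toy, pure-Python sketch (not transformers or PyTorch code). It assumes one plausible failure mode: a fused forward computes the same result without calling the submodule that carries a hook. The names `ToyLinear`, `ToyBlock`, and `fused_forward` are hypothetical stand-ins.

```python
class ToyLinear:
    """Stand-in for a submodule that supports forward hooks."""

    def __init__(self):
        self.hooks = []

    def __call__(self, x):
        out = 2 * x
        # Hooks only run when this submodule is actually called
        for hook in self.hooks:
            hook(self, x, out)
        return out


class ToyBlock:
    """Stand-in for a layer whose forward may be swapped for a kernel."""

    def __init__(self):
        self.proj = ToyLinear()

    def forward(self, x):
        # Standard definition: goes through the hooked submodule
        return self.proj(x) + 1

    def __call__(self, x):
        return self.forward(x)


def fused_forward(self, x):
    # Hypothetical "kernel": same arithmetic, but never calls self.proj
    return 2 * x + 1


observed = []
block = ToyBlock()
block.proj.hooks.append(lambda module, inp, out: observed.append(inp))

standard_result = block(3)
calls_with_standard = len(observed)  # hook fired once

ToyBlock.forward = fused_forward  # class-level override of forward
fused_result = block(3)
calls_with_fused = len(observed)  # unchanged: the hook was bypassed
```

Both calls return the same value, but only the standard forward triggers the hook, which is the situation this PR guards against during calibration.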

Changes

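  • Implement a disable_hf_kernels context manager and add it to calibration_forward_context
  • Remove the parentheses around the with statement that uses calibration_forward_context in order to support Python 3.9 (parenthesized context managers are only officially supported from Python 3.10)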
  • Apply style to tests
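
The changes above can be sketched roughly as follows. This is a minimal sketch and not the merged implementation: the config flag name `disable_custom_kernels`, the helper `patch_attr`, and the `_sketch` function names are assumptions made for illustration, and a `SimpleNamespace` stands in for a real model and config. The composed context uses the comma form of `with`, which also works on Python 3.9.

```python
import contextlib
from types import SimpleNamespace

_MISSING = object()


@contextlib.contextmanager
def patch_attr(obj, name, value):
    """Temporarily set obj.<name> to value, restoring the old state on exit."""
    original = getattr(obj, name, _MISSING)
    setattr(obj, name, value)
    try:
        yield
    finally:
        if original is _MISSING:
            delattr(obj, name)
        else:
            setattr(obj, name, original)


@contextlib.contextmanager
def disable_hf_kernels_sketch(model):
    """Set the (assumed) kernel-disabling flag on model.config while active."""
    config = getattr(model, "config", None)
    if config is None:
        # Nothing to patch for objects without a config
        yield
        return
    with patch_attr(config, "disable_custom_kernels", True):
        yield


@contextlib.contextmanager
def calibration_forward_context_sketch(model):
    """Compose contexts without parentheses so the syntax is valid on 3.9."""
    with contextlib.nullcontext(), disable_hf_kernels_sketch(model):
        yield


model = SimpleNamespace(config=SimpleNamespace())

with calibration_forward_context_sketch(model):
    flag_inside = model.config.disable_custom_kernels

flag_present_after = hasattr(model.config, "disable_custom_kernels")
```

The `nullcontext()` entry is only a placeholder for whatever other contexts the real `calibration_forward_context` combines. Restoring the flag in a `finally` block keeps the config unchanged even if calibration raises.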

Signed-off-by: Kyle Sayers <[email protected]>

dsikka previously approved these changes on May 20, 2025
dsikka added the "ready" label (When a PR is ready for review) on May 20, 2025
dsikka enabled auto-merge (squash) on May 21, 2025, 17:11
kylesayrs dismissed stale reviews from brian-dellabetta and dsikka via ef0ec36 on May 22, 2025, 16:08
dsikka merged commit dc063de into main on May 22, 2025
11 checks passed
dsikka deleted the kylesayrs/disable-kernels branch on May 22, 2025, 17:20
aireilly pushed a commit to aireilly/llm-compressor that referenced this pull request Jul 30, 2025
Signed-off-by: Kyle Sayers <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>