Drop flash_attn skip for quantizing_moe example tests #1396
SUMMARY:
Drop the skip requiring `flash_attn` to be installed in the tests for the `quantizing_moe` examples. Recent CI failures involving this package and its CUDA compatibility with the recently released PyTorch 2.7.0 have led to the finding that it is not required for these tests.
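For context, the skip being dropped is presumably a pytest `skipif`-style guard along the following lines. This is a minimal hypothetical sketch; the helper and test names are illustrative, not the repository's actual code.

```python
import importlib.util

import pytest


def has_flash_attn() -> bool:
    """Return True if the optional flash_attn package is importable."""
    return importlib.util.find_spec("flash_attn") is not None


# Before this change, the quantizing_moe example tests carried a marker
# like this one, skipping whenever flash_attn was not installed:
requires_flash_attn = pytest.mark.skipif(
    not has_flash_attn(), reason="flash_attn is not installed"
)


@requires_flash_attn  # dropped by this PR: the test passes without flash_attn
def test_quantizing_moe_example():  # hypothetical test name
    ...
```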
TEST PLAN:
An internal test run that drops the installation of `flash-attn` and runs the changes on this branch indicates that the tests will pass (one successful so far; the PR will be marked ready once the run completes and the remaining tests show the expected results). Specific relevant output (will update with other tests' results):