Drop flash_attn skip for quantizing_moe example tests (#1396)
SUMMARY:
Drop the skip requiring that `flash_attn` be installed in the tests for
the `quantizing_moe` examples. Recent CI failures involving this
package's CUDA compatibility with the newly released PyTorch 2.7.0 led
to the finding that it is not required for these tests.
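For context, the guard being removed is of this general shape (a sketch
only; the actual marker and helper names in the test file may differ):
```python
import pytest

# Hypothetical sketch of the flash_attn-based skip this PR removes; the
# real tests' guard may use a different helper or marker name.
try:
    import flash_attn  # noqa: F401

    HAS_FLASH_ATTN = True
except ImportError:
    HAS_FLASH_ATTN = False


@pytest.mark.skipif(not HAS_FLASH_ATTN, reason="flash_attn is not installed")
def test_deepseek_example_script():
    ...
```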
TEST PLAN:
An [internal test run][1] that drops the installation of `flash-attn`
and runs the changes on this branch indicates that the tests will pass
(one has succeeded so far; the PR will be marked ready once the run
completes and the remaining tests show the expected results).
Specific relevant output so far (will be updated with the other tests'
results):
```
tests/examples/test_quantizing_moe.py::TestQuantizingMOE::test_deepseek_example_script[deepseek_moe_w8a8_int8.py] PASSED
tests/examples/test_quantizing_moe.py::TestQuantizingMOE::test_deepseek_example_script[deepseek_moe_w8a8_fp8.py] PASSED
```
[1]:
https://github.com/neuralmagic/llm-compressor-testing/actions/runs/14712618904
Signed-off-by: Domenic Barbuzzi <[email protected]>