
Commit 564140d

Drop flash_attn skip for quantizing_moe example tests (#1396)
SUMMARY: Drop the skip related to requiring `flash_attn` be installed in the tests for the `quantizing_moe` examples. Recent CI failures related to this package and its CUDA compatibility with the recently released PyTorch 2.7.0 have led to the finding that it is not required for these tests.

TEST PLAN: An [internal test run][1] that drops the installation of `flash-attn` and runs the changes on this branch indicates that the tests will pass (one successful so far; the PR will be marked as ready once the run completes and the remaining tests show the expected results). Specific relevant output (to be updated with the other tests' results):

```
tests/examples/test_quantizing_moe.py::TestQuantizingMOE::test_deepseek_example_script[deepseek_moe_w8a8_int8.py] PASSED
tests/examples/test_quantizing_moe.py::TestQuantizingMOE::test_deepseek_example_script[deepseek_moe_w8a8_fp8.py] PASSED
```

[1]: https://github.com/neuralmagic/llm-compressor-testing/actions/runs/14712618904

Signed-off-by: Domenic Barbuzzi <[email protected]>
1 parent ab43c11 commit 564140d

File tree

1 file changed: +0 −6 lines

tests/examples/test_quantizing_moe.py

Lines changed: 0 additions & 6 deletions

```diff
@@ -11,12 +11,6 @@
     requires_gpu_count,
 )
 
-# flash_attn module is required. It cannot safely be specified as a dependency because
-# it rqeuires a number of non-standard packages to be installed in order to be built
-# such as pytorch, and thus cannot be installed in a clean environment (those
-# dependencies must be installed prior to attempting to install flash_attn)
-pytest.importorskip("flash_attn", reason="flash_attn is required")
-
 
 @pytest.fixture
 def example_dir() -> str:
```
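For reference, `pytest.importorskip` is the mechanism being removed here: it attempts the import at collection time and skips every test in the module when the package is absent. A minimal sketch of the pattern as it appeared before this commit (the test body below is illustrative, not taken from the repository):

```python
import pytest

# Attempt to import flash_attn at collection time; if the import fails,
# all tests in this module are skipped with the given reason. On success,
# the imported module object is returned for direct use.
flash_attn = pytest.importorskip("flash_attn", reason="flash_attn is required")


def test_flash_attn_importable():
    # Illustrative only: runs solely when the import above succeeded.
    assert flash_attn is not None
```

Dropping the call means the example tests now run in environments without `flash-attn` installed, which is the behavior the internal run above verifies.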