Commit 4232c89

Address review comments from @kylesayrs

Signed-off-by: Rahul Tuli <[email protected]>
1 parent: cccb284

File tree

1 file changed: +1 −1

README.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ Big updates have landed in LLM Compressor! Check out these exciting new features
 
 * **Axolotl Sparse Finetuning Integration:** Easily finetune sparse LLMs through our seamless integration with Axolotl. [Learn more here](https://docs.axolotl.ai/docs/custom_integrations.html#llmcompressor).
 * **AutoAWQ Integration:** Perform low-bit weight-only quantization efficiently using AutoAWQ, now part of LLM Compressor. *Note: This integration should be considered experimental for now. Enhanced support, including for MoE models and improved handling of larger models via layer sequential pipelining, is planned for upcoming releases.* [See the details](https://github.com/vllm-project/llm-compressor/pull/1177).
-* **Day 0 Llama 4 Support:** Meta utilized LLM Compressor to create the [FP8-quantized Llama-4-Maverick-17B-128E](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8), optimized for vLLM inference using our compressed-tensors format.
+* **Day 0 Llama 4 Support:** Meta utilized LLM Compressor to create the [FP8-quantized Llama-4-Maverick-17B-128E](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8), optimized for vLLM inference using [compressed-tensors](https://github.com/neuralmagic/compressed-tensors) format.
 
 ### Supported Formats
 * Activation Quantization: W8A8 (int8 and fp8)
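For context on the Day 0 bullet above: producing an FP8 checkpoint in the compressed-tensors format follows LLM Compressor's one-shot quantization flow. Below is a minimal sketch modeled on the project's documented FP8 example; the import paths and the `FP8_DYNAMIC` scheme name are taken from those docs and may shift between versions, and the smaller Llama 3 model is a stand-in for Maverick, which needs multi-GPU handling.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor.transformers import oneshot

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"  # stand-in for a larger model

# Load the model in its original precision.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# FP8 dynamic quantization: weights are quantized ahead of time and
# activations at runtime, so no calibration dataset is required.
recipe = QuantizationModifier(targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"])

# Apply the recipe in a single one-shot pass.
oneshot(model=model, recipe=recipe)

# Saving writes the checkpoint in the compressed-tensors format that vLLM loads.
SAVE_DIR = MODEL_ID.split("/")[1] + "-FP8-Dynamic"
model.save_pretrained(SAVE_DIR)
tokenizer.save_pretrained(SAVE_DIR)
```

The saved directory can then be served directly with vLLM, e.g. `vllm serve Meta-Llama-3-8B-Instruct-FP8-Dynamic`.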
