
Commit 640e8b6

Small changes for docs build
Signed-off-by: Aidan Reilly <[email protected]>
1 parent dddc2f8

File tree:

- .gitignore
- docs/Makefile
- docs/README.md
- docs/index.md

4 files changed: +65, −8 lines

.gitignore

Lines changed: 1 addition & 3 deletions

```diff
@@ -93,9 +93,6 @@ instance/
 # Scrapy stuff:
 .scrapy
 
-# Sphinx documentation
-docs/_build/
-
 # PyBuilder
 target/
 
@@ -129,6 +126,7 @@ venv.bak/
 
 # mkdocs documentation
 /site
+docs/.cache/
 
 # mypy
 .mypy_cache/
```

docs/Makefile

Lines changed: 26 additions & 0 deletions

```diff
@@ -0,0 +1,26 @@
+# Minimal mkdocs makefile
+
+PYTHON := python3
+MKDOCS_CMD := mkdocs
+MKDOCS_CONF := ../mkdocs.yml
+
+.PHONY: help install serve build clean
+
+help:
+	@echo "Available targets:"
+	@echo "  install    Install dependencies globally"
+	@echo "  serve      Serve docs locally"
+	@echo "  build      Build static site"
+	@echo "  clean      Remove build artifacts"
+
+install:
+	pip install -e "../[dev]"
+
+serve:
+	$(MKDOCS_CMD) serve --livereload -f $(MKDOCS_CONF)
+
+build:
+	$(MKDOCS_CMD) build -f $(MKDOCS_CONF)
+
+clean:
+	rm -rf site/ .cache/
```
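
For reference, the Makefile targets above reduce to the following direct commands. A minimal sketch, assuming they are run from the `docs/` directory:

```bash
# Direct equivalents of the Makefile targets (run from docs/)
pip install -e "../[dev]"                    # make install
mkdocs serve --livereload -f ../mkdocs.yml   # make serve
mkdocs build -f ../mkdocs.yml                # make build
rm -rf site/ .cache/                         # make clean
```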

docs/README.md

Lines changed: 25 additions & 0 deletions

````diff
@@ -0,0 +1,25 @@
+# Getting started with LLM Compressor docs
+
+```bash
+cd docs
+```
+
+- Install the dependencies:
+
+```bash
+make install
+```
+
+- Clean the previous build (optional but recommended):
+
+```bash
+make clean
+```
+
+- Serve the docs:
+
+```bash
+make serve
+```
+
+This will start a local server at http://localhost:8000. You can now open your browser and view the documentation.
````
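
Beyond the live-reload workflow above, the `build` target produces a deployable static site. A minimal preview sketch, assuming the output lands in `docs/site/` (as the Makefile's `clean` target suggests; `mkdocs.yml` may configure a different `site_dir`):

```bash
# Build the static site, then preview it without live reload
make build
python3 -m http.server --directory site 8000
```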

docs/index.md

Lines changed: 13 additions & 5 deletions

```diff
@@ -1,11 +1,19 @@
 # Home
 
-!!! info "New Feature: Axolotl Sparse Finetuning Integration"
-    Easily finetune sparse LLMs through our seamless integration with Axolotl.
-    [Learn more](https://docs.axolotl.ai/docs/custom_integrations.html#llmcompressor).
+!!! info "Llama4 Quantization Support"
+    Quantize a Llama4 model to [W4A16](examples/quantization_w4a16.md) or [NVFP4](examples/quantization_w4a4_fp4.md). The checkpoint produced can seamlessly run in vLLM.
 
-!!! info "New Feature: AutoAWQ Integration"
-    Perform low-bit weight-only quantization efficiently using AutoAWQ, now part of LLM Compressor. [Learn more](https://github.com/vllm-project/llm-compressor/pull/1177).
+!!! info "Large Model Support with Sequential Onloading"
+    As of llm-compressor>=0.6.0, you can quantize very large language models on a single GPU. Models are broken into disjoint layers, which are then onloaded to the GPU one layer at a time. For more information on sequential onloading, see [Big Modeling with Sequential Onloading](examples/big_models_with_sequential_onloading.md) as well as the [DeepSeek-R1 Example](examples/quantizing_moe.md).
+
+!!! info "Preliminary FP4 Quantization Support"
+    Quantize weights and activations to FP4 and seamlessly run the compressed model in vLLM. Model weights and activations are quantized following the NVFP4 [configuration](https://github.com/neuralmagic/compressed-tensors/blob/f5dbfc336b9c9c361b9fe7ae085d5cb0673e56eb/src/compressed_tensors/quantization/quant_scheme.py#L104). See examples of [weight-only quantization](examples/quantization_w4a16_fp4.md) and [FP4 activation support](examples/quantization_w4a4_fp4.md). Support is currently preliminary; additional support for MoEs will be added.
+
+!!! info "Updated AWQ Support"
+    Improved support for MoEs, with better handling of larger models.
+
+!!! info "Axolotl Sparse Finetuning Integration"
+    Seamlessly finetune sparse LLMs with our Axolotl integration. Learn how to create [fast sparse open-source models with Axolotl and LLM Compressor](https://developers.redhat.com/articles/2025/06/17/axolotl-meets-llm-compressor-fast-sparse-open). See also the [Axolotl integration docs](https://docs.axolotl.ai/docs/custom_integrations.html#llmcompressor).
 
 ## <div style="display: flex; align-items: center;"><img alt="LLM Compressor Logo" src="assets/llmcompressor-icon.png" width="40" style="vertical-align: middle; margin-right: 10px" /> LLM Compressor</div>
 
```