@@ -65,13 +65,14 @@ pip install --upgrade torchao transformers
 </hfoption>
 <hfoption id="PyTorch Index">
 Stable Release from the PyTorch index
+
 ```bash
 pip install torchao --index-url https://download.pytorch.org/whl/cu126 # options are cpu/cu118/cu126/cu128
 ```
 </hfoption>
 </hfoptions>
 
-If your torcha version is below 0.10.0, you need to upgrade it, please refer to the [deprecation notice](#deprecation-notice) for more details.
+If your torchao version is below 0.10.0, you need to upgrade it; refer to the [deprecation notice](#deprecation-notice) for more details.
 
 ## Quantization examples
 
@@ -88,6 +89,7 @@ We'll show examples for recommended quantization methods based on hardware, e.g
 ### H100 GPU
 <hfoptions id="examples-H100-GPU">
 <hfoption id="float8-dynamic-and-weight-only">
+
 ```py
 import torch
 from transformers import TorchAoConfig, AutoModelForCausalLM, AutoTokenizer
@@ -148,6 +150,7 @@ print(tokenizer.decode(output[0], skip_special_tokens=True))
 ### A100 GPU
 <hfoptions id="examples-A100-GPU">
 <hfoption id="int8-dynamic-and-weight-only">
+
 ```py
 import torch
 from transformers import TorchAoConfig, AutoModelForCausalLM, AutoTokenizer
@@ -215,6 +218,7 @@ print(tokenizer.decode(output[0], skip_special_tokens=True))
 ### CPU
 <hfoptions id="examples-CPU">
 <hfoption id="int8-dynamic-and-weight-only">
+
 ```py
 import torch
 from transformers import TorchAoConfig, AutoModelForCausalLM, AutoTokenizer
@@ -385,13 +389,15 @@ To avoid arbitrary user code execution, torchao sets `weights_only=True` in [tor
 
 <hfoptions id="serialization-examples">
 <hfoption id="save-locally">
+
 ```py
 # don't serialize model with Safetensors
 output_dir = "llama3-8b-int4wo-128"
 quantized_model.save_pretrained("llama3-8b-int4wo-128", safe_serialization=False)
 ```
 </hfoption>
 <hfoption id="push-to-huggingface-hub">
+
 ```py
 # don't serialize model with Safetensors
 USER_ID = "your_huggingface_user_id"