Add support for block-wise quant from bf16 #925
Add a native BF16-to-FP8 block-wise quantization utility for the NV E4M3 data type.
Currently, many toolkits do not support block-wise quantization (see the recently raised llmcompressor issue PR#1475 and related torch ao topics).
Since the DeepSeek-V3 release (03-24), more and more teams have been working on alignment of DeepSeek-V3, and BF16 training is well supported in the major training frameworks.
It is natural to host this processing script here so that people can immediately quantize the model from BF16.
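For illustration, here is a minimal sketch of the idea, not the PR's actual implementation: split the weight into 128×128 tiles, scale each tile so its absolute maximum maps onto the E4M3 limit of 448, and store the per-tile inverse scale for dequantization (following the `weight_scale_inv` convention of the DeepSeek-V3 FP8 checkpoints). The function name `blockwise_bf16_to_fp8` is an assumption of this sketch; it requires a PyTorch build with `torch.float8_e4m3fn` support.

```python
import torch

def blockwise_bf16_to_fp8(weight: torch.Tensor, block_size: int = 128):
    """Quantize a 2-D BF16 tensor to FP8 (E4M3) in block_size x block_size
    blocks. Returns the FP8 tensor plus a per-block float32 inverse scale
    (the factor needed to dequantize each block). Illustrative sketch only."""
    assert weight.dim() == 2 and weight.dtype == torch.bfloat16
    rows, cols = weight.shape
    fp8_max = torch.finfo(torch.float8_e4m3fn).max  # 448.0 for E4M3
    q = torch.empty(rows, cols, dtype=torch.float8_e4m3fn)
    n_row_blocks = (rows + block_size - 1) // block_size
    n_col_blocks = (cols + block_size - 1) // block_size
    scale_inv = torch.empty(n_row_blocks, n_col_blocks, dtype=torch.float32)
    for i in range(0, rows, block_size):
        for j in range(0, cols, block_size):
            block = weight[i:i + block_size, j:j + block_size].float()
            amax = block.abs().max().clamp(min=1e-12)  # guard all-zero blocks
            scale = fp8_max / amax                     # map amax onto 448
            q[i:i + block_size, j:j + block_size] = (block * scale).to(torch.float8_e4m3fn)
            scale_inv[i // block_size, j // block_size] = 1.0 / scale
    return q, scale_inv
```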
How to use it
Please see inference/README.md.
Correctness verification (H800 2×8 setup)
Set up the model:
Send a query:
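Separately from the end-to-end run, the quantizer itself can be sanity-checked numerically by dequantizing and comparing against the BF16 original. A minimal round-trip check, using the hypothetical `blockwise_bf16_to_fp8` sketch above:

```python
import torch

# Round-trip check: quantize, dequantize, and measure the relative error.
w = torch.randn(512, 1024, dtype=torch.bfloat16)
q, scale_inv = blockwise_bf16_to_fp8(w, block_size=128)

# Expand the per-block inverse scales back to element granularity.
s = scale_inv.repeat_interleave(128, dim=0).repeat_interleave(128, dim=1)
w_hat = q.float() * s[:512, :1024]

rel_err = (w.float() - w_hat).abs().mean() / w.float().abs().mean()
print(f"mean relative error: {rel_err.item():.4f}")  # a few percent is expected for E4M3
```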