Pull requests: NVIDIA/TensorRT-Model-Optimizer
Pull requests list
[5455919] Fix Q/DQ/Cast placement in 'FP32 required' custom ops
#554 opened Nov 13, 2025 by gcunhase
llama converter is self-contained now (no dependency on internal nvidia code)
#552 opened Nov 13, 2025 by danielkorzekwa
AutoQuantize minor improvement: limit grad enabled parameters, limit cpu-gpu sync during scoring
#551 opened Nov 13, 2025 by realAsma
Make all tensors on same device for svdquant with cpu-offloading
#550 opened Nov 13, 2025 by vishalpandya1990
Specdec Bench: vLLM reqid, SGL path, conc > 1 metric fix
#541 opened Nov 12, 2025 by IzzyPutterman
[OMNIML-3015]Add per tensor/per channel MSE calibrator
#540 opened Nov 12, 2025 by Fridah-nv
2 tasks
Use ONNX DQ node instead of DQ custom-op for activation dequantization in nvfp4
#536 opened Nov 11, 2025 by vishalpandya1990
[OMNIML-2852] [2/n] Add Core Sparse Attention Infrastructure
#527 opened Nov 7, 2025 by kaix-nv
[Draft] [5526696] Add kv cache quantization support for onnx quantization
#486 opened Oct 31, 2025 by zhanghaoc