Edwardf0t1
Collaborator

@Edwardf0t1 Edwardf0t1 commented Sep 4, 2025

This is the second PR in a three-part series to enable native ModelOpt quantization in SGLang. It includes changes from the first PR (#7149) and will be rebased once the first PR is merged.

Motivation

We aim to enhance SGLang's quantization capabilities, making ModelOpt integration more robust and user-friendly while providing checkpoint persistence for better performance in production environments.

Modifications

  • Created _setup_modelopt_quantization() and added calibration functionality.
  • Added modelopt_checkpoint_restore_path and modelopt_checkpoint_save_path parameters to both ModelConfig and ServerArgs. These allow users to save and restore quantized checkpoints, avoiding re-quantization on subsequent runs.
  • Improved error handling during the ModelOpt quantization process.
  • Added more unit tests in test_modelopt_loader.py to verify the ModelOpt functionality.
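
For readers without the full context, the save/restore flow described in the list can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: only the two field names come from the description, while `CheckpointPaths`, `load_or_quantize`, `demo`, and the fallback to quantization when the restore path does not exist are assumptions made for the example. It does not import SGLang or ModelOpt.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class CheckpointPaths:
    # Hypothetical stand-in for the config: only the two field names
    # below are taken from the PR description.
    modelopt_checkpoint_restore_path: Optional[str] = None
    modelopt_checkpoint_save_path: Optional[str] = None


def load_or_quantize(
    paths: CheckpointPaths, quantize: Callable[[], bytes]
) -> Tuple[bytes, str]:
    """Restore a quantized checkpoint if one exists, else quantize and save."""
    restore = paths.modelopt_checkpoint_restore_path
    if restore and os.path.exists(restore):
        # Fast path: reuse the checkpoint written by an earlier run.
        with open(restore, "rb") as f:
            return f.read(), "restored"

    # Slow path (assumed behavior): run quantization, then persist the result.
    state = quantize()
    save = paths.modelopt_checkpoint_save_path
    if save:
        with open(save, "wb") as f:
            f.write(state)
    return state, "quantized"


def demo() -> Tuple[List[str], int]:
    """Run the flow twice against the same path and count quantization calls."""
    calls = []

    def fake_quantize() -> bytes:
        # Placeholder for the expensive calibration and quantization step.
        calls.append(1)
        return b"quantized-weights"

    with tempfile.TemporaryDirectory() as tmp:
        ckpt = os.path.join(tmp, "model.ckpt")
        paths = CheckpointPaths(ckpt, ckpt)
        statuses = [load_or_quantize(paths, fake_quantize)[1] for _ in range(2)]
    return statuses, len(calls)
```

In this sketch the first call quantizes and writes the checkpoint, and the second call restores it without invoking the quantization step again, which is the re-quantization saving the new parameters are meant to provide.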

The third PR is also ready for review: #10154

Accuracy Tests

Benchmarking and Profiling

Checklist

@zhyncs
Member

zhyncs commented Sep 8, 2025

hi @Edwardf0t1 can you help fix the conflicts? thanks

@Edwardf0t1
Collaborator Author

> hi @Edwardf0t1 can you help fix the conflicts? thanks

@zhyncs Just rebased and resolved the conflicts. Could you or @Qiaolin-Yu help review the PR? Thanks.

@jingyu-ml
Contributor

I think we should add example code in this PR to demonstrate how to use modelopt_checkpoint_restore_path or other new functions, so users can understand without needing deep context.

@Edwardf0t1 Edwardf0t1 force-pushed the zhiyu/modelopt-sglang-api-2 branch from f074579 to c13b457 Compare September 18, 2025 06:21
@Edwardf0t1
Collaborator Author

> I think we should add example code in this PR to demonstrate how to use modelopt_checkpoint_restore_path or other new functions, so users can understand without needing deep context.

The usage is covered in unit tests: test/srt/model_loader/test_modelopt_loader.py

@Edwardf0t1 Edwardf0t1 force-pushed the zhiyu/modelopt-sglang-api-2 branch 2 times, most recently from c118561 to e75fbf3 Compare September 29, 2025 23:56
Collaborator

@Qiaolin-Yu Qiaolin-Yu left a comment

LGTM

@Edwardf0t1
Collaborator Author

> LGTM

Thanks @Qiaolin-Yu for the review and approval.
The failed CI tests seem unrelated to this PR. Could we unblock the merge? @zhyncs @Ying1123 @merrymercy

@Edwardf0t1 Edwardf0t1 force-pushed the zhiyu/modelopt-sglang-api-2 branch from 40fefb3 to 9bc99e7 Compare October 11, 2025 03:13
@Edwardf0t1 Edwardf0t1 enabled auto-merge (squash) October 11, 2025 03:14
@Edwardf0t1 Edwardf0t1 merged commit 129d299 into sgl-project:main Oct 11, 2025
84 of 98 checks passed
4 participants