Collaborator

@adarshxs adarshxs commented Apr 24, 2025

Motivation

Requested in #5653.

Adds /tokenize and /detokenize endpoints to the OpenAI-compatible API server.

This allows users to directly access the server's tokenizer to encode text into token IDs and decode token IDs back into text.

  • Implements TokenizeRequest/Response and DetokenizeRequest/Response Pydantic models.
  • Adds handler logic in adapter.py utilizing the TokenizerManager.
  • Routes are added in http_server.py.
  • Supports single/batch inputs and options like add_special_tokens/skip_special_tokens.
  • Includes /v1/ path aliases.
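As a rough illustration of how a client might call these endpoints, here is a minimal sketch using only the Python standard library. The JSON field names (`prompt`, `tokens`), the helper names, and the base URL are assumptions made for this example and are not confirmed by the PR description; only the `/tokenize` and `/detokenize` paths, the `/v1/` aliases, single/batch inputs, and the `add_special_tokens`/`skip_special_tokens` options come from the summary above.

```python
import json
import urllib.request
from typing import List, Union

# Assumed address of a locally running server; adjust for your deployment.
BASE_URL = "http://localhost:30000"


def build_tokenize_payload(
    prompt: Union[str, List[str]],
    add_special_tokens: bool = True,
) -> dict:
    """Build a JSON body for /tokenize.

    `prompt` may be a single string or a list of strings (batch).
    The field name "prompt" is a guess, not taken from the PR.
    """
    return {"prompt": prompt, "add_special_tokens": add_special_tokens}


def build_detokenize_payload(
    tokens: Union[List[int], List[List[int]]],
    skip_special_tokens: bool = True,
) -> dict:
    """Build a JSON body for /detokenize.

    `tokens` may be one list of IDs or a list of lists (batch).
    The field name "tokens" is a guess, not taken from the PR.
    """
    return {"tokens": tokens, "skip_special_tokens": skip_special_tokens}


def post_json(path: str, payload: dict, base_url: str = BASE_URL) -> dict:
    """POST a JSON payload to the server and return the decoded JSON reply."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, a call such as `post_json("/tokenize", build_tokenize_payload("Hello world"))` or `post_json("/v1/detokenize", build_detokenize_payload([1, 2, 3]))` would exercise the two routes; the shape of the response depends on the actual Pydantic models in the PR.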

Works like this:
[image]

Checklist

Contributor

@merrymercy merrymercy left a comment

Can you add a test case?

@adarshxs
Collaborator Author

Done @merrymercy

@adarshxs adarshxs requested a review from merrymercy April 27, 2025 15:33
@halaction

Any updates on this PR? The test failures don't seem to be related to this change, and the added test cases pass, so I think it's due for a review. This feature would be really helpful.

@adarshxs
Collaborator Author

@zhyncs @merrymercy @zhaochenyang20 please review

@FeliceSchena

Is this PR still under review? It would be helpful to have it merged.

@thigger

thigger commented Jul 15, 2025

Unfortunately there have been a few breaking changes to the OAI code so I've had to downgrade to 0.4.6-post5 to use this (and a tokenize endpoint is really useful!)

@adarshxs
Collaborator Author

adarshxs commented Jul 15, 2025

Yeah, the entire OpenAI API interface has been refactored. I'll try implementing these endpoints for the new refactor ASAP and will close this PR soon.

@thigger

thigger commented Jul 15, 2025

Thanks - your patch is the main reason I've been able to use sglang for my use case!

@Zyann7

Zyann7 commented Jul 16, 2025

Thanks for your work, this will be really helpful! Is there an estimated timeline for merge or release?

@adarshxs
Collaborator Author

cc @thigger @Zyann7 @FeliceSchena @halaction Apologies for the delay. I've opened a new PR with this feature that is compatible with the new OpenAI interface. I'm hoping to have it merged soon, so I'll be closing this one.

@adarshxs adarshxs closed this Aug 23, 2025
@thigger

thigger commented Aug 23, 2025

Thanks! I've been holding off upgrading sglang for this reason and will pull your new patch

@adarshxs adarshxs deleted the tok branch October 8, 2025 18:08
6 participants