[Feature] Add /tokenize and /detokenize OpenAI-compatible endpoints #9545
Conversation
Summary of Changes
Hello @adarshxs, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates new OpenAI-compatible `/tokenize` and `/detokenize` endpoints into the API server. This enhancement allows users to programmatically convert text into token IDs and token IDs back into human-readable text, addressing a common need for interacting with large language models. The changes involve adding new API routes, defining request/response protocols, implementing the core tokenization/detokenization logic, and providing comprehensive unit tests to ensure reliability.
Highlights
- New API Endpoints: This PR introduces two new OpenAI-compatible API endpoints: `/tokenize` and `/detokenize`. These endpoints allow users to convert text into token IDs and vice versa, providing essential utilities for working with language models.
- Core Logic Implementation: The core logic for handling tokenization and detokenization requests is implemented in the new `serving_tokenize.py` file. This includes robust handling for various input formats (single strings, lists of strings/tokens) and options for managing special tokens.
- API Protocol Definition: New data models (`TokenizeRequest`, `TokenizeResponse`, `DetokenizeRequest`, `DetokenizeResponse`) have been added to `protocol.py` to define a clear structure for requests and responses, ensuring compatibility and ease of use; a sketch of these models follows the list.
- Thorough Unit Testing: Comprehensive unit tests have been added to `test_srt_endpoint.py` to validate the functionality of the new endpoints, covering various scenarios including valid inputs, edge cases, and error handling for invalid data.
Code Review
This pull request introduces `/tokenize` and `/detokenize` endpoints to the OpenAI-compatible API, which is a valuable feature enhancement. The implementation is well-structured, following the existing patterns for API endpoints, and includes a comprehensive set of unit tests covering various input types and edge cases. My review includes one suggestion to improve the robustness of input validation in the detokenization logic to handle malformed requests more gracefully.
We need to discuss further whether to add these endpoints.
Understood. It was initially requested in #5653, and we received a lot of requests in this thread as well: #5711. Additionally, similar endpoints are supported in vLLM: https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html?h=%2Ftokenize#tokenizer-api_1
Thanks! We plan to merge this PR; please resolve the comments ~
"""Request schema for the /tokenize endpoint.""" | ||
|
||
model: str = DEFAULT_MODEL_NAME | ||
prompt: Union[str, List[str]] |
Shall we keep the batched option? cc @slin1237 @CatherineSue
I think we can keep it, as this is not an official OpenAI endpoint, and it directly uses the tokenizer, so there are no performance or compatibility concerns.
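For reference, a batched request under the `Union[str, List[str]]` prompt type would look something like this (the payload shape is inferred from the protocol excerpt above; the values are illustrative):

```python
# Batched tokenize request: one entry per prompt in the list.
payload = {
    "model": "default",
    "prompt": ["first prompt", "second prompt"],
}
```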
I think these lines could be removed, as we already routed them.
Yeah, my bad, I forgot to remove that. Updated the same.
@CatherineSue, could you please check the OpenAI endpoint-related modifications?
```python
and request.tokens
and isinstance(request.tokens[0], int)
):
    if not all(isinstance(t, int) for t in request.tokens):
```
nit: Would this be better moved to `_validate_request`?
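For illustration, a minimal sketch of what hoisting this check into a `_validate_request` method could look like. Only the method name comes from the comment above; the signature and body are assumptions, reusing the `DetokenizeRequest` sketch from earlier:

```python
from typing import Optional


# Assumed to be a method on the serving class handling /detokenize.
def _validate_request(self, request: DetokenizeRequest) -> Optional[str]:
    """Return an error message for malformed input, or None if the request is valid."""
    tokens = request.tokens
    # Accept either a flat list of ints or a batch (list of lists of ints).
    batches = tokens if tokens and isinstance(tokens[0], list) else [tokens]
    for batch in batches:
        if not all(isinstance(t, int) for t in batch):
            return "Invalid input: 'tokens' must be a list of integers."
    return None
```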
```python
    return self.create_error_response(
        "Invalid input: 'tokens' must be a list of integers."
    )
tokens_to_decode = [int(t) for t in request.tokens]
```
nit: Why do we need `int(t)` here? I assume the `if` check above already makes sure the tokens are ints?
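If the `isinstance` check above already guarantees integer tokens, the cast could indeed be dropped; a sketch of the simplification:

```python
# The all(isinstance(t, int) ...) check above already rejects non-int
# entries, so the per-element int(t) cast is redundant.
tokens_to_decode = list(request.tokens)
```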
Overall LGTM. Left some nit comments.
@ispobock @slin1237 @merrymercy Could you please review this PR when you have a moment? This is important for our workflow. Thanks!
@adarshxs Could you check the failed CI test? https://github.com/sgl-project/sglang/actions/runs/18284367523/job/52107776495?pr=9545#step:5:14449
@ispobock Should be fixed, thanks!
Motivation
Multiple users have requested tokenize/detokenize endpoints: #5711 (comment)
Adds `/tokenize` and `/detokenize` endpoints to the OpenAI-compatible API server. It works like this:

Checklist