
[Feature] Tokenizer endpoint in server mode #5653

@junjzhang

Description

Motivation

Using server mode to generate rollouts in agentic RL training is a natural and often necessary approach. However, agent scaffolds are typically designed only against the OpenAI-compatible API, which makes it difficult to collect token IDs (information that is essential for training) at the scaffold level. Additionally, tokenization is tightly coupled to the served model, so it is logical to let the inference engine handle tokenization.
Thus, a tokenize (and detokenize) endpoint is needed.

Related resources

vLLM's tokenizer API may serve as a reference: https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#tokenizer-api
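
As a concrete sketch of the desired interface, the server could mirror vLLM's `/tokenize` route. The client call below is hypothetical: the server address, the served model name, and the request/response fields (`prompt`, `add_special_tokens`, `tokens`, `count`) are assumptions borrowed from vLLM's tokenizer API, not a confirmed design for this project.

```python
import requests

# Hypothetical request against a vLLM-style /tokenize endpoint.
# Field names are borrowed from vLLM's tokenizer API and are only
# an assumption about what this project's endpoint could accept.
resp = requests.post(
    "http://localhost:8000/tokenize",   # assumed server address
    json={
        "model": "my-model",            # assumed served model name
        "prompt": "Hello, world!",
        "add_special_tokens": True,     # mirrors vLLM's request field
    },
)
resp.raise_for_status()
body = resp.json()
print(body["tokens"])  # token IDs the scaffold can log for training
print(body["count"])   # number of tokens in the prompt
```

With such an endpoint, an agent scaffold that only speaks the OpenAI-compatible API could recover exact token IDs for each rollout without loading the tokenizer locally.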
