🪙 feat: Add check_embedding_ctx_length Flag #161

Merged
merged 2 commits into danny-avila:main on Jul 4, 2025

Conversation

@Fjf (Contributor) commented Jun 20, 2025

When using an OpenAI-compatible endpoint such as vLLM, the referenced model may not be one of the default OpenAI models. Langchain's OpenAIEmbeddings will then pre-tokenize the input with a local tiktoken tokenizer before sending it to the endpoint as token IDs.
This of course breaks when the vLLM model's tokenizer differs from the OpenAI tokenizer.

Disabling this check sends the raw input directly to the given endpoint without tokenizing it locally first.
I have verified with vLLM that input longer than the context length does not cause an error: it is simply truncated and an embedding is returned as expected.
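For reference, a minimal sketch of what the flag controls on the langchain side. The environment variable name used here is illustrative only; the actual variable introduced by this PR is documented in the README change, and the vLLM model name and endpoint URL are placeholders:

```python
import os

from langchain_openai import OpenAIEmbeddings

# Hypothetical environment variable name, for illustration only.
check_ctx = os.getenv("CHECK_EMBEDDING_CTX_LENGTH", "true").lower() != "false"

embeddings = OpenAIEmbeddings(
    model="intfloat/e5-mistral-7b-instruct",      # example model served by vLLM
    openai_api_base="http://localhost:8000/v1",   # vLLM's OpenAI-compatible endpoint
    openai_api_key="EMPTY",
    # When False, langchain sends the raw strings instead of tiktoken-encoded
    # token IDs, so the server tokenizes with the model's own tokenizer.
    check_embedding_ctx_length=check_ctx,
)

vectors = embeddings.embed_documents(["some text to embed"])
```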

@danny-avila (Owner) commented
@Fjf can you also edit the README to include this new variable?

@Fjf (Contributor, Author) commented Jul 4, 2025

@danny-avila updated

@danny-avila danny-avila changed the title Add environment variable for disabling 'check_embedding_ctx_length'. 🪙 feat: Add check_embedding_ctx_length Flag Jul 4, 2025
@danny-avila danny-avila merged commit 87396d6 into danny-avila:main Jul 4, 2025
1 check passed