
Conversation

Contributor

@shyamnamboodiripad shyamnamboodiripad commented Sep 22, 2025

EquivalenceEvaluator was specifying MaxOutputTokens = 1 since its prompt instructs the LLM to produce a response (score) that is a single digit (between 1 and 5).

It turns out that while this works for most models (including the OpenAI models that were used to test the prompt), some models require more than one token for this. For example, it looks like Claude requires two tokens to produce the score (see #6814).

This PR bumps the MaxOutputTokens to 5 to address the above issue.
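
For context, here is a minimal sketch of the kind of ChatOptions change being described; the exact code surrounding this option inside EquivalenceEvaluator may differ.

```csharp
using Microsoft.Extensions.AI;

// Options passed to the LLM when requesting the equivalence score.
var chatOptions = new ChatOptions
{
    // Was 1. The prompt asks for a single digit between 1 and 5, but some
    // models (e.g. Claude; see #6814) need more than one output token to
    // emit that digit, so a limit of 1 truncates the response.
    MaxOutputTokens = 5
};
```

A limit of 5 leaves headroom for tokenizers that split the digit across tokens, while still keeping the response constrained to a short answer.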

Fixes #6814

@shyamnamboodiripad shyamnamboodiripad requested a review from a team as a code owner September 22, 2025 21:15
@Copilot Copilot AI review requested due to automatic review settings September 22, 2025 21:15
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Increases the maximum output token limit for EquivalenceEvaluator from 1 to 5 tokens to accommodate different LLM tokenization behaviors while maintaining the single-digit score output requirement.

  • Bumps MaxOutputTokens from 1 to 5 in ChatOptions configuration
  • Adds explanatory comment linking to the GitHub issue for context

@shyamnamboodiripad shyamnamboodiripad merged commit 6fb8ab7 into dotnet:main Sep 23, 2025
7 checks passed
@shyamnamboodiripad shyamnamboodiripad deleted the tokenlimit branch September 23, 2025 17:43
Labels
area-ai-eval Microsoft.Extensions.AI.Evaluation and related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[AI Evaluation] EquivalenceEvaluator is not producing an answer
2 participants