Increase output token limit for EquivalenceEvaluator #6835

shyamnamboodiripad · 2025-09-22T21:15:09Z

EquivalenceEvaluator was specifying MaxOutputTokens = 1 because its prompt instructs the LLM to produce a response (score) that is a single digit between 1 and 5.

It turns out that while this works for most models (including the OpenAI models that were used to test the prompt), some models require more than one token for this. For example, Claude appears to require two tokens (see #6814).

This PR bumps MaxOutputTokens to 5 to address this issue.

Fixes #6814
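
To illustrate why a one-token cap can lose the score, here is a toy Python simulation. It is only a sketch: it does not call any real model or the evaluator itself, and the token lists (including the assumption that the two-token case starts with a newline token) and the helper names `truncate` and `parse_score` are hypothetical, invented for this illustration.

```python
from typing import List, Optional


def truncate(tokens: List[str], max_output_tokens: int) -> str:
    """Keep at most max_output_tokens tokens, as an output token cap would."""
    return "".join(tokens[:max_output_tokens])


def parse_score(text: str) -> Optional[int]:
    """Return the score if text is a single digit from 1 to 5, else None."""
    stripped = text.strip()
    if len(stripped) == 1 and stripped in "12345":
        return int(stripped)
    return None


# Hypothetical tokenizations of the same answer "4".
one_token_model = ["4"]        # the digit arrives as a single token
two_token_model = ["\n", "4"]  # assumed: another token precedes the digit

for limit in (1, 5):
    print(limit,
          parse_score(truncate(one_token_model, limit)),
          parse_score(truncate(two_token_model, limit)))
```

With a cap of 1, the two-token case is cut off before the digit and no score can be parsed. With a cap of 5, both cases yield the score, and the cap still keeps the response short.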

Copilot

Pull Request Overview

Increases the maximum output token limit for EquivalenceEvaluator from 1 to 5 tokens to accommodate different LLM tokenization behaviors while maintaining the single-digit score output requirement.

- Bumps MaxOutputTokens from 1 to 5 in ChatOptions configuration
- Adds explanatory comment linking to the GitHub issue for context

shyamnamboodiripad requested a review from a team as a code owner September 22, 2025 21:15

Copilot AI review requested due to automatic review settings September 22, 2025 21:15

Copilot AI reviewed Sep 22, 2025

github-actions bot added the area-ai-eval Microsoft.Extensions.AI.Evaluation and related label Sep 22, 2025

dotnet-policy-service bot assigned shyamnamboodiripad Sep 22, 2025

shyamnamboodiripad enabled auto-merge (squash) September 22, 2025 21:15

shyamnamboodiripad mentioned this pull request Sep 22, 2025

[AI Evaluation] EquivalenceEvaluator is not producing an answer #6814

Closed

peterwald approved these changes Sep 23, 2025

shyamnamboodiripad merged commit 6fb8ab7 into dotnet:main Sep 23, 2025
7 checks passed

shyamnamboodiripad deleted the tokenlimit branch September 23, 2025 17:43
