
Conversation

@jjmachan (Member) commented Sep 23, 2025

Summary

This PR enhances the metric decorator system (discrete_metric, numeric_metric, ranking_metric) with improved input validation, better error messages, and support for both plain value returns and MetricResult objects.

Key Changes

1. Automatic MetricResult Wrapping

  • Metric functions can now return plain values (strings, floats, lists), which are automatically wrapped in MetricResult objects
  • Functions can still return MetricResult objects directly for cases where custom reasons are needed
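
For illustration, here is a minimal sketch of both return styles. The import path and the decorator arguments (name, allowed_values) are assumptions made for the example, not copied from this diff:

```python
from ragas.metrics import discrete_metric, MetricResult  # assumed import path

@discrete_metric(name="exact_match", allowed_values=["pass", "fail"])
def exact_match(response: str, expected: str):
    # Plain return value: the decorator wraps it in a MetricResult automatically.
    return "pass" if response == expected else "fail"

@discrete_metric(name="verbose_match", allowed_values=["pass", "fail"])
def verbose_match(response: str, expected: str):
    # Returning a MetricResult directly still works when a custom reason is needed.
    value = "pass" if response == expected else "fail"
    return MetricResult(value=value, reason=f"exact comparison gave {value!r}")
```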

2. Enhanced Input Validation with Pydantic

  • Added Pydantic-based validation for metric function parameters
  • Provides clear, user-friendly error messages for:
    • Type mismatches
    • Missing required parameters
    • Unknown parameters (warnings)
    • Positional arguments (with helpful correction suggestions)
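
To illustrate the kinds of mistakes this catches, a few calls against the exact_match sketch above (the exact wording of the errors and warnings comes from the new validation layer and may differ):

```python
# Lines marked "error" would raise; they are listed for illustration,
# not meant to be executed in sequence.
exact_match.score(response="Paris", expected="Paris")            # valid call
exact_match.score(response="Paris")                              # missing `expected`: clear error
exact_match.score(response=42, expected="Paris")                 # type mismatch: clear error
exact_match.score(response="Paris", expected="Paris", extra=1)   # unknown parameter: warning
exact_match.score("Paris", "Paris")                              # positional args: error with a keyword-usage hint
```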

3. Improved Async/Sync Handling

  • Simplified async execution by reusing the existing ragas.async_utils.run helper
  • Proper detection and handling of event loops
  • Both sync and async functions work seamlessly
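
A sketch of how an async metric might look; the numeric_metric arguments and import path are assumptions, the point being that .score() is called the same way for sync and async functions:

```python
import asyncio

from ragas.metrics import numeric_metric  # assumed import path

@numeric_metric(name="length_ratio", allowed_values=(0.0, 1.0))  # illustrative arguments
async def length_ratio(response: str, expected: str) -> float:
    await asyncio.sleep(0)  # stand-in for a real awaited call, e.g. an LLM request
    return min(len(response), len(expected)) / max(len(response), len(expected), 1)

# .score() detects whether an event loop is already running and delegates to
# ragas.async_utils.run, so this plain synchronous call works from a script:
result = length_ratio.score(response="short answer", expected="a longer reference answer")
```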

4. Direct Callable Support

  • Decorated metrics can be called directly like the original function
  • metric() returns the raw function result
  • metric.score() returns a MetricResult object with validation
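
A small usage sketch, reusing the exact_match metric from the first example:

```python
raw = exact_match(response="Paris", expected="Paris")            # -> "pass" (plain string)
scored = exact_match.score(response="Paris", expected="Paris")   # -> MetricResult with validation
assert raw == "pass" and scored.value == "pass"
```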

5. Better MetricResult Representation

  • Improved __repr__ to show both value and reason when present
  • Cleaner string representation for debugging

Testing

  • Comprehensive test suite added (test_metric_decorators.py) covering:
    • Plain value returns vs MetricResult returns
    • Sync and async functions
    • Input validation edge cases
    • Direct callable functionality
    • All three metric types (discrete, numeric, ranking)
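
A sketch in the style of those tests, reusing the exact_match metric from the first example (the actual test names and assertions in test_metric_decorators.py may differ):

```python
import pytest

def test_plain_value_return_is_wrapped():
    result = exact_match.score(response="Paris", expected="Paris")
    assert result.value == "pass"

def test_missing_parameter_raises():
    # The real suite may assert a more specific exception type and message.
    with pytest.raises(Exception):
        exact_match.score(response="Paris")
```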

Breaking Changes

None; the changes are backward compatible. Existing code that returns MetricResult objects directly will continue to work.

Benefits

  • Better Developer Experience: Clear error messages guide users to correct usage
  • More Flexible: Functions can return simple values without wrapping in MetricResult
  • Type Safety: Pydantic validation ensures type correctness at runtime
  • Cleaner Code: Users can write simpler metric functions when custom reasons aren't needed

🤖 Generated with Claude Code

@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Sep 23, 2025
@jjmachan changed the title from "cleanup discrete metric creation" to "fix: improve metric decorators with better validation and error handling" on Sep 23, 2025
@jjmachan jjmachan requested a review from anistark September 23, 2025 02:42
@anistark (Contributor) left a comment

LGTM

However, this might break the custom LLM-based metrics. The validation might break.

@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Sep 23, 2025

openhands-ai bot commented Sep 23, 2025

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • CI

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #2302 at branch `fix/new-metrics`

Feel free to include any additional details that might help me get this PR into a better state.


@jjmachan jjmachan merged commit 4678acf into main Sep 23, 2025
8 checks passed
@jjmachan jjmachan deleted the fix/new-metrics branch September 23, 2025 18:59