Better names for scorer variations

Currently, when a task uses several instances of the same scorer, Inspect gives each one a unique metric name by appending an index (in [metrics_unique_key](https://github.com/UKGovernmentBEIS/inspect_ai/blob/fab509736be20d1b489f7eefea450bf40df277b8/src/inspect_ai/_eval/task/results.py#L538)).

In the example below, the scores show up as something like `model_graded_qa` and `model_graded_qa2`, which makes it difficult to tell which grader model each one refers to.
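```python
from inspect_ai import Task
from inspect_ai.scorer import model_graded_qa
from inspect_ai.solver import generate, system_message

# `dataset` and `SYSTEM_MESSAGE` are assumed to be defined elsewhere.
Task(
    dataset=dataset,
    solver=[
        system_message(SYSTEM_MESSAGE),
        generate(),
    ],
    scorer=[
        model_graded_qa(model="openai/gpt-4"),
        model_graded_qa(model="google/gemini-2.5-pro"),
    ],
)
```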
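
The naming behavior described above can be illustrated with a simplified sketch. This is not Inspect's actual `metrics_unique_key` implementation; the `unique_key` helper and its exact suffix rule are assumptions chosen to reproduce the `model_graded_qa` / `model_graded_qa2` pattern.

```python
def unique_key(key: str, existing: list[str]) -> str:
    """Return `key`, or `key` plus a numeric suffix if it is already taken.

    Hypothetical helper: it only illustrates index-based de-duplication.
    """
    if key not in existing:
        return key
    index = 2
    while f"{key}{index}" in existing:
        index += 1
    return f"{key}{index}"


# Two scorers created by the same function share the same base name.
names: list[str] = []
for base in ["model_graded_qa", "model_graded_qa"]:
    names.append(unique_key(base, names))

print(names)  # ['model_graded_qa', 'model_graded_qa2']
```

Nothing in the resulting names records the `model` argument, which is the gap this issue is about.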

Ideally there would be an easy way to dynamically set the scorer's name.

Two proposals for how to do this:

1. Some way to override the name of a scorer (and maybe other attributes like metrics): `scorer_with(model_graded_qa(model="openai/gpt-4"), 'model_graded_qa:gpt-4')`
2. Some way to set the name dynamically within the scorer:
```python
@scorer()
def model_graded_qa(model=None):
    async def score(state, target):
        ...  # grading logic elided

    # `override` is the hypothetical helper proposed here.
    name = f"model_graded_qa:{model}" if model else "model_graded_qa"
    return override(score, name=name)
```
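
To make the first proposal more concrete, here is a minimal, self-contained sketch of what a `scorer_with` helper might do. It uses plain Python stand-ins rather than Inspect's scorer registry: `scorer_with`, the `scorer_name` attribute, and the toy `model_graded_qa` factory are all hypothetical and only illustrate the intended behavior.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable, Optional

ScoreFn = Callable[..., Awaitable[Any]]


def scorer_with(score_fn: ScoreFn, name: str) -> ScoreFn:
    """Return a wrapper around `score_fn` that carries an explicit name."""

    @functools.wraps(score_fn)
    async def wrapper(*args: Any, **kwargs: Any) -> Any:
        return await score_fn(*args, **kwargs)

    # Hypothetical attribute that a results writer could read
    # instead of the indexed default name.
    wrapper.scorer_name = name
    return wrapper


def model_graded_qa(model: Optional[str] = None) -> ScoreFn:
    """Toy stand-in for a model-graded scorer factory."""

    async def score(answer: str, target: str) -> bool:
        # A real scorer would ask `model` to grade; this only compares strings.
        return answer.strip() == target.strip()

    return score


gpt4 = scorer_with(model_graded_qa(model="openai/gpt-4"), "model_graded_qa:gpt-4")
gemini = scorer_with(
    model_graded_qa(model="google/gemini-2.5-pro"),
    "model_graded_qa:gemini-2.5-pro",
)

print(gpt4.scorer_name, gemini.scorer_name)
print(asyncio.run(gpt4("42", "42 ")))
```

The wrapper leaves scoring untouched and only attaches a name, so the same idea could extend to other attributes such as metrics if the real API allowed it.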

Better names for scorer variations #2531

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Better names for scorer variations #2531

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions