Currently, when you create several instances of the same scorer, inspect gives each one a unique metric name by appending an index (in metrics_unique_key).
In the example below, the scores will show up as something like model_graded_qa and model_graded_qa2, from which it is rather difficult to work out which model each one refers to.
Task(
    dataset=dataset,
    solver=[
        system_message(SYSTEM_MESSAGE),
        generate()
    ],
    scorer=[
        model_graded_qa(model="openai/gpt-4"),
        model_graded_qa(model="google/gemini-2.5-pro")
    ],
)
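To make the naming problem concrete, here is a rough, self-contained sketch of index-based de-duplication. It is not inspect's actual metrics_unique_key implementation; the unique_names helper is a hypothetical stand-in that only mimics the behaviour described above.

```python
def unique_names(names: list[str]) -> list[str]:
    """Illustrative only: suffix repeated names with a running index."""
    seen: dict[str, int] = {}
    result: list[str] = []
    for name in names:
        count = seen.get(name, 0) + 1
        seen[name] = count
        # The first occurrence keeps its name; later ones get 2, 3, ...
        result.append(name if count == 1 else f"{name}{count}")
    return result


# Both scorers register under the same base name, so the model is lost.
scorer_names = ["model_graded_qa", "model_graded_qa"]
print(unique_names(scorer_names))  # ['model_graded_qa', 'model_graded_qa2']
```

Nothing in the resulting names records which grader model produced which score, which is the gap this issue is about.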
Ideally there would be an easy way to dynamically set the scorer's name.
Two proposals for how to do this:
- Some way to override the name of a scorer (and maybe other attributes like metrics):
  scorer_with(model_graded_qa(model="openai/gpt-4"), 'model_graded_qa:gpt-4')
- Some way to set the name dynamically within the scorer:
  @scorer()
  def model_graded_qa(model=None):
      async def score...
      return override(score, name=f"model_graded_qa:{model}" if model else "model_graded_qa")
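For discussion, the first proposal could behave roughly like the sketch below. Everything here is hypothetical and independent of inspect: NamedScorer, scorer_with, and display_name are illustrative names, not existing API, and a real implementation would need to hook into however inspect registers scorer names.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class NamedScorer:
    """Hypothetical wrapper that carries an explicit display name."""

    scorer: Callable[..., Any]
    name: str

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Delegate to the wrapped scorer unchanged (works for async too,
        # since the coroutine is simply returned to the caller).
        return self.scorer(*args, **kwargs)


def scorer_with(scorer: Callable[..., Any], name: str) -> NamedScorer:
    """Hypothetical helper: attach a name override to an existing scorer."""
    return NamedScorer(scorer=scorer, name=name)


def display_name(scorer: Callable[..., Any], default: str) -> str:
    """Prefer an explicit override, else fall back to the default name."""
    return getattr(scorer, "name", default)


def dummy_scorer(answer: str) -> bool:
    # Stand-in for a real scorer, used only for this illustration.
    return answer == "42"


named = scorer_with(dummy_scorer, "model_graded_qa:gpt-4")
print(display_name(named, "model_graded_qa"))
print(display_name(dummy_scorer, "model_graded_qa"))
```

The point of the design is that the override travels with the scorer instance, so the index suffix would only be needed as a fallback when two scorers still end up with the same name.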