
Evaluation Viewer: Performance metrics: Questions with AI auto-evals and Expert evals #419

@lisafast

Description

I ran into a problem today while analyzing the trial results: a large set of questions had BOTH an auto-eval and an expert eval.
I ended up writing a bunch of formulas to create two new columns, 'Final-score' and 'Score-source', where:
Final-score was blank if there were no scores, equal to the single score if only one was present, and equal to the expert score if both were present. Score-source gave me a way to build a pivot table with separate AI and expert columns, where no question appeared in both.
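
For the performance metrics, the same precedence rule could live in code rather than in spreadsheet formulas. Here is a minimal sketch in TypeScript of that 'expert wins over AI' merge; the interfaces and function name are hypothetical, not taken from the actual codebase:

```typescript
// Hypothetical record shape: one row per question, with optional scores.
interface QuestionEvals {
  questionId: string;
  autoScore?: number;    // AI auto-eval score, if one exists
  expertScore?: number;  // expert eval score, if one exists
}

type ScoreSource = "expert" | "ai" | "none";

interface ResolvedScore {
  questionId: string;
  finalScore: number | null; // blank (null) if no scores at all
  scoreSource: ScoreSource;  // keeps AI and expert pivot columns disjoint
}

// Expert eval wins when both are present; otherwise take whichever exists.
function resolveScore(q: QuestionEvals): ResolvedScore {
  if (q.expertScore !== undefined) {
    return { questionId: q.questionId, finalScore: q.expertScore, scoreSource: "expert" };
  }
  if (q.autoScore !== undefined) {
    return { questionId: q.questionId, finalScore: q.autoScore, scoreSource: "ai" };
  }
  return { questionId: q.questionId, finalScore: null, scoreSource: "none" };
}
```

If something like this ran at query time in the performance metrics, each question would count exactly once, and nothing would need to be deleted from the database.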

The same conflict can happen in the performance metrics too, so this is really two issues in one:

  • @ryanhyma did you decide how these should be handled in the database? Should the auto-eval be deleted automatically when an expert adds an expert eval? Deleting at write time would be an alternative to resolving the conflict later in the performance metrics.
  • When an expert loads a chat for expert eval, shouldn't they see the existing auto-eval, @anniecrombie?

Thoughts?
