Description
A pretty common use case for mir_eval is to iteratively call `mir_eval.[task].evaluate(...)` on a sequence of pairs of reference and estimated annotations, then collect the results afterward into a dataframe for subsequent analysis.
This pattern is so common, in fact, that it may be worth providing some scaffolding to streamline and standardize it.
What I'm thinking is something like the following pattern:

```python
df = mir_eval.collections.evaluate(generator, task='beat', **kwargs)
```

where `generator` is a generator that yields dictionaries containing the fields required as input by the given task's evaluator (e.g., `ref_intervals` and `est_intervals`, or whatever), optionally an `id` field (otherwise a counter index is constructed while iterating), and `kwargs` are additional keyword arguments for the evaluator.

The resulting `df` would have as columns all of the fields produced by the task's evaluator function, and an index column keyed on the provided (or generated) `id`.
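
For concreteness, here is a minimal sketch of how such a helper might work, assuming the generator yields dicts whose keys match the evaluator's parameter names. Nothing below exists in mir_eval yet; the function body is just one possible shape:

```python
import pandas as pd
import mir_eval


def evaluate(generator, task='beat', **kwargs):
    """Sketch of a collection-level evaluator (hypothetical, not in mir_eval)."""
    # Resolve the task's evaluator, e.g. mir_eval.beat.evaluate
    evaluator = getattr(mir_eval, task).evaluate

    records = []
    for counter, item in enumerate(generator):
        # Use the provided 'id' if present, otherwise fall back to a counter index
        key = item.pop('id', counter)
        # Remaining fields map onto the evaluator's arguments; extra kwargs
        # (e.g. tolerances) are forwarded to the evaluator as well
        scores = dict(evaluator(**item, **kwargs))
        scores['id'] = key
        records.append(scores)

    return pd.DataFrame.from_records(records).set_index('id')
```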
Caveats
- Doing this would probably necessitate adding pandas as a dependency. I think this is fine in 2025.
- We may need to build out some scaffolding utilities (to be housed under a `collections` module) to make it easier to build these generators. I don't have a great sense of how this would look yet, but it would probably become clear after prototyping the core functionality and using it a bit; a rough example of one such generator is sketched below.
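
As a rough illustration of the kind of generator this scaffolding might help produce, something like the following could work for the beat task; the helper name, file lists, and the use of `mir_eval.io.load_events` for loading are all placeholders:

```python
import mir_eval


def beat_annotation_pairs(ref_files, est_files):
    """Hypothetical helper: yield one input dict per (reference, estimate) pair.

    Keys match the parameter names of mir_eval.beat.evaluate, plus an 'id'
    field keyed on the reference file path.
    """
    for ref_file, est_file in zip(ref_files, est_files):
        yield {
            'id': ref_file,
            'reference_beats': mir_eval.io.load_events(ref_file),
            'estimated_beats': mir_eval.io.load_events(est_file),
        }


# Illustrative file lists; in practice these would come from a dataset layout
ref_files = ['ref/track01.txt', 'ref/track02.txt']
est_files = ['est/track01.txt', 'est/track02.txt']

# Fed to the proposed collection evaluator (hypothetical API):
# df = mir_eval.collections.evaluate(beat_annotation_pairs(ref_files, est_files), task='beat')
```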