Sparsemax not actually used in COMET-KIWI, XCOMET-XL/XXL #195

Hi,
I have been re-implementing some of your models in Marian, and while working through the code I noticed that you are not actually using sparsemax for Comet-KIWI and XCOMET-XL/XXL; instead, you are falling back to softmax.

In both cases the `layer_transformation` parameter is not passed on to the base class:

See here for `UnifiedMetric`
https://github.com/Unbabel/COMET/blob/2bcf66604b30dcde98565854d5f36026c19f580a/comet/models/multitask/unified_metric.py#L106

and here for `XCOMETMetric`
https://github.com/Unbabel/COMET/blob/2bcf66604b30dcde98565854d5f36026c19f580a/comet/models/multitask/xcomet_metric.py#L54

In both cases `layer_transformation` does not appear in the argument list passed to the base class below the linked line, and the base class has `softmax` as its default, so that default is what gets used.
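To illustrate the pattern I mean, here is a minimal sketch. The class names are hypothetical and this is not the actual COMET code; I am also assuming the subclass is configured with `sparsemax`. It only shows how an argument that is accepted but not forwarded silently falls back to the base-class default.

```python
class BaseModel:
    # Hypothetical stand-in for the base class; defaults to softmax.
    def __init__(self, layer_transformation: str = "softmax") -> None:
        self.layer_transformation = layer_transformation


class BuggyMetric(BaseModel):
    # Hypothetical subclass: accepts the argument but never forwards it.
    def __init__(self, layer_transformation: str = "sparsemax") -> None:
        super().__init__()  # layer_transformation is dropped here


class FixedMetric(BaseModel):
    # Same subclass with the argument forwarded to the base class.
    def __init__(self, layer_transformation: str = "sparsemax") -> None:
        super().__init__(layer_transformation=layer_transformation)


buggy = BuggyMetric(layer_transformation="sparsemax")
fixed = FixedMetric(layer_transformation="sparsemax")
print(buggy.layer_transformation)  # base-class default wins
print(fixed.layer_transformation)  # requested value is kept
```

Note that forwarding the argument would change the scores of already released checkpoints if they were in fact trained with softmax, so the fix may need to go into the checkpoint configuration instead of the code.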

In my re-implementation I reproduce your exact numbers for Comet-KIWI with softmax, not with sparsemax, whereas sparsemax works fine for the reference-based COMET-22.
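For reference, the two transformations give clearly different layer weights, which is why the scores only match with one of them. Below is a small pure-Python sketch of both, assuming the standard sparsemax definition (Euclidean projection onto the probability simplex). The input values are made up for illustration and are not taken from any COMET checkpoint.

```python
import math


def softmax(z):
    # Numerically stable softmax: every output is strictly positive.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(z):
    # Sparsemax: project z onto the simplex; small entries become exactly 0.
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    k, support_sum = 0, 0.0
    for j, v in enumerate(z_sorted, start=1):
        cumsum += v
        if 1 + j * v > cumsum:  # index j is still inside the support
            k, support_sum = j, cumsum
    tau = (support_sum - 1) / k  # threshold subtracted from every entry
    return [max(v - tau, 0.0) for v in z]


# Made-up scalar weights for four layers.
layer_scalars = [1.0, 0.5, -1.0, 0.2]
soft = softmax(layer_scalars)
sparse = sparsemax(layer_scalars)
print(soft)    # all four layers get non-zero weight
print(sparse)  # only the two largest layers get weight
```

With softmax every layer contributes to the mix, while sparsemax can switch layers off completely, so the two are not interchangeable at inference time.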

It's not clear to me whether the models were trained with softmax or sparsemax, but you might have a train/inference mismatch here, or at the very least your models are doing something different from what you expected/described.
