
Sparsemax not actually used in COMET-KIWI, XCOMET-XL/XXL #195

@emjotde

Hi,
I have been playing around with re-implementing some of your models in Marian. While working through the code, I noticed that you are not actually using sparsemax for Comet-KIWI and Comet-XL/XXL; instead, you are falling back to a softmax.

In both cases you forgot to pass the layer_transformation parameter on to the base class:

See here for UnifiedMetric

layer_transformation: str = "sparsemax",

and here for XCOMETMetric

layer_transformation: str = "sparsemax",

In both cases, the layer_transformation parameter is not among the arguments passed to the base class further down, and the base class has softmax as its default.
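To make the pattern concrete, here is a minimal sketch of what I mean. The class names `BaseModel`, `BuggyMetric` and `FixedMetric` are made up for illustration and are not the actual COMET classes; only the `layer_transformation` parameter and its two defaults are taken from the code linked above.

```python
class BaseModel:
    # Hypothetical stand-in for the base class: its default is softmax.
    def __init__(self, layer_transformation: str = "softmax"):
        self.layer_transformation = layer_transformation


class BuggyMetric(BaseModel):
    # Hypothetical subclass: declares sparsemax as its default,
    # but never forwards the argument to the base class.
    def __init__(self, layer_transformation: str = "sparsemax"):
        super().__init__()  # base-class default (softmax) wins


class FixedMetric(BaseModel):
    # Same signature, but the argument is forwarded explicitly.
    def __init__(self, layer_transformation: str = "sparsemax"):
        super().__init__(layer_transformation=layer_transformation)


print(BuggyMetric().layer_transformation)
print(FixedMetric().layer_transformation)
```

In this sketch the buggy subclass silently ends up with the base-class default, even when the caller passes `layer_transformation="sparsemax"` explicitly.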

In my re-implementation I am reproducing your exact numbers for Comet-KIWI with a softmax, not with the sparsemax. For the reference-based COMET-22, on the other hand, the sparsemax works fine.
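For anyone wondering why the choice matters for reproducing scores: the two transformations give different weights for the same scalars, so they are not interchangeable. Below is a small pure-Python sketch of the standard definitions. The input values are made up and are not taken from any COMET checkpoint, and this is not the code used in COMET or Marian.

```python
import math


def softmax(z):
    # Numerically stable softmax: every output is strictly positive.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(z):
    # Euclidean projection onto the probability simplex.
    # Sort descending, find the largest k with 1 + k * z_(k) > sum of the top k,
    # then subtract the resulting threshold tau and clip at zero.
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, v in enumerate(z_sorted, start=1):
        cumsum += v
        if 1 + k * v > cumsum:
            tau = (cumsum - 1) / k
    return [max(v - tau, 0.0) for v in z]


scalars = [1.0, 0.8, 0.1]  # made-up layer scalars
print(softmax(scalars))
print(sparsemax(scalars))
```

For these made-up scalars the sparsemax puts roughly 0.6 and 0.4 on the first two entries and exactly zero on the third, while the softmax keeps all three entries non-zero.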

It's not clear to me whether the models were trained with a softmax or a sparsemax, but you might have a train/inference mismatch here, or at the very least your models are doing something different from what you expected and described.

Labels: bug