Hi,
I have been re-implementing some of your models in Marian, and while going through the code I noticed that Comet-KIWI and Comet-XL/XXL do not actually use sparsemax; they fall back to a softmax.
In both cases the `layer_transformation` parameter is not passed on to the base class:
See here for `UnifiedMetric`:
layer_transformation: str = "sparsemax",
and here for `XCOMETMetric`:
layer_transformation: str = "sparsemax",
In both cases the `layer_transformation` parameter is accepted by the subclass but is missing from the arguments passed to the base class further down, and the base class has `softmax` as its default.
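To make the pattern concrete, here is a minimal sketch of what I think is happening. The class names `RegressionBase`, `BuggyMetric` and `FixedMetric` and the `encoder_model` argument are hypothetical stand-ins, not the actual COMET classes; only the `layer_transformation` parameter and its two default values come from the code linked above.

```python
class RegressionBase:
    """Hypothetical stand-in for the shared base class."""

    def __init__(self, encoder_model="hypothetical-encoder",
                 layer_transformation="softmax"):
        self.encoder_model = encoder_model
        # The base class default is what ends up being used if the
        # subclass does not forward its own value.
        self.layer_transformation = layer_transformation


class BuggyMetric(RegressionBase):
    """Accepts layer_transformation but never forwards it."""

    def __init__(self, encoder_model="hypothetical-encoder",
                 layer_transformation="sparsemax"):
        # layer_transformation is silently dropped here.
        super().__init__(encoder_model=encoder_model)


class FixedMetric(RegressionBase):
    """Forwards layer_transformation to the base class."""

    def __init__(self, encoder_model="hypothetical-encoder",
                 layer_transformation="sparsemax"):
        super().__init__(encoder_model=encoder_model,
                         layer_transformation=layer_transformation)


if __name__ == "__main__":
    print(BuggyMetric().layer_transformation)  # base default wins
    print(FixedMetric().layer_transformation)  # subclass default is used
```

With the buggy variant the subclass signature advertises `"sparsemax"`, but the object is configured with the base class default.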
In my re-implementation I reproduce your exact numbers for Comet-KIWI with a softmax, not with the sparsemax, whereas the sparsemax works fine for the reference-based COMET-22.
It's not clear to me if the model was trained with a softmax or sparsemax, but you might either have a train/inference mismatch here or at the very least your models are doing something different than you expected/described.
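For anyone wondering why the choice matters: the two transformations can produce quite different weight vectors from the same scores. The following is a small standalone sketch in plain Python, not the COMET or Marian implementation, and the input scores are made up for illustration. It uses the usual sort-and-threshold formulation of sparsemax.

```python
import math


def softmax(scores):
    """Standard softmax; every output weight is strictly positive."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Sparsemax: Euclidean projection of the scores onto the simplex.

    Unlike softmax, it can assign exactly zero weight to some entries.
    """
    ordered = sorted(scores, reverse=True)
    cumulative = 0.0
    support_size = 0
    support_sum = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        # Entry k stays in the support while 1 + k * z_(k) > sum of top-k.
        if 1.0 + k * value > cumulative:
            support_size = k
            support_sum = cumulative
    tau = (support_sum - 1.0) / support_size
    return [max(s - tau, 0.0) for s in scores]


if __name__ == "__main__":
    layer_scores = [2.0, 1.0, 0.1]  # made-up layer scores
    print("softmax:  ", softmax(layer_scores))
    print("sparsemax:", sparsemax(layer_scores))
```

Both outputs sum to one, but sparsemax can zero out low-scoring entries entirely while softmax always mixes in every entry, so swapping one for the other at inference time changes the layer mix the model sees.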