The model predictions and references can be provided in a wide range of formats (Python lists, NumPy arrays, PyTorch tensors, TensorFlow tensors); the metric object takes care of converting them to a suitable format for temporary storage and computation (as well as bringing them back to the CPU and detaching them from the gradients, in the case of PyTorch tensors).
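For instance, the following sketch feeds the same batch of predictions and references in three equivalent formats (it assumes the ``glue``/``mrpc`` metric script, chosen here purely for illustration because it accepts integer inputs):

.. code-block::

    >>> import numpy as np
    >>> import torch
    >>> import nlp
    >>> metric = nlp.load_metric('glue', 'mrpc')
    >>> # The same batch provided as python lists, numpy arrays and pytorch tensors
    >>> metric.add_batch(predictions=[0, 1, 0], references=[0, 1, 1])
    >>> metric.add_batch(predictions=np.array([0, 1, 0]), references=np.array([0, 1, 1]))
    >>> metric.add_batch(predictions=torch.tensor([0, 1, 0]), references=torch.tensor([0, 1, 1]))
    >>> score = metric.compute()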
The exact format of the inputs is specific to each metric script and can be found in :obj:`nlp.Metric.features`, :obj:`nlp.Metric.inputs_description` and the string representation of the :class:`nlp.Metric` object.
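For instance, a minimal sketch for ``sacrebleu`` (assuming the metric script can be downloaded):

.. code-block::

    >>> metric = nlp.load_metric('sacrebleu')
    >>> print(metric.inputs_description)

which prints something like: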
.. code-block::

    Produces BLEU scores along with its sufficient statistics
    from a source against one or more references.

    Args:
        predictions: The system stream (a sequence of segments)
        references: A list of one or more reference streams (each a sequence of segments)
        smooth: The smoothing method to use
        smooth_value: For 'floor' smoothing, the floor to use
        force: Ignore data that looks already tokenized
        lowercase: Lowercase the data
        tokenize: The tokenizer to use

    Returns:
        'score': BLEU score,
        'counts': Counts,
        'totals': Totals,
        'precisions': Precisions,
        'bp': Brevity penalty,
        'sys_len': predictions length,
        'ref_len': reference length,
Here we can see that the ``sacrebleu`` metric expects a sequence of segments as predictions and a list of one or several sequences of segments as references.
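As a sketch following this signature (the segments below are invented for illustration), two reference streams, each aligned segment-by-segment with the predictions, would be passed as:

.. code-block::

    >>> predictions = ["hello there general kenobi", "on our way to the rebel base"]
    >>> references = [["hello there general kenobi", "on our way to the rebel base"],
    ...               ["hello there !", "on our way to the rebel base !"]]
    >>> score = metric.compute(predictions=predictions, references=references)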
You can find more information on the segments in the description, homepage and publication of ``sacrebleu``, which can be accessed with the respective attributes on the metric:
.. code-block::

    >>> print(metric.description)
    SacreBLEU provides hassle-free computation of shareable, comparable, and reproducible BLEU scores.
    Inspired by Rico Sennrich's `multi-bleu-detok.perl`, it produces the official WMT scores but works with plain text.
    It also knows all the standard test sets and handles downloading, processing, and tokenization for you.

    See the [README.md] file at https://github.com/mjpost/sacreBLEU for more information.

    >>> print(metric.homepage)
    https://github.com/mjpost/sacreBLEU

    >>> print(metric.citation)
    @inproceedings{post-2018-call,
        title = "A Call for Clarity in Reporting {BLEU} Scores",
        author = "Post, Matt",
        booktitle = "Proceedings of the Third Conference on Machine Translation: Research Papers",
        month = oct,
        year = "2018",
        address = "Belgium, Brussels",
        publisher = "Association for Computational Linguistics",