Ergun Bicici
7 years ago
Dear Moses maintainers,
I discovered that the translations obtained differ when the alignment
flags (--mark-unknown --unknown-word-prefix UNK --print-alignment-inf)
are used. A comparison table is attached (en-ru and ru-en are being
recomputed). We expect the translations to be the same, since the
alignment flags only print additional information and are not supposed
to alter decoding. In both cases the same EMS system was re-run, once
with and once without the alignment information flags.
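One way to narrow this down is to compare the two decoder outputs sentence by sentence rather than only through BLEU. The following is a minimal sketch in Python, assuming the two runs have been read into lists of strings with one translation per line. The function name and the toy sentences are hypothetical, and any extra annotation that the flags add to the output would have to be stripped before comparing.

```python
from itertools import zip_longest


def differing_lines(baseline, flagged):
    """Return (1-based line number, baseline line, flagged line) for each mismatch."""
    mismatches = []
    pairs = zip_longest(baseline, flagged, fillvalue="")
    for number, (a, b) in enumerate(pairs, start=1):
        a, b = a.strip(), b.strip()
        if a != b:
            mismatches.append((number, a, b))
    return mismatches


# Toy sentences standing in for decoder output (NOT real Moses output).
baseline = ["this is a test", "a second sentence"]
flagged = ["this is a test", "a second phrase"]

for number, a, b in differing_lines(baseline, flagged):
    print(f"line {number}:")
    print(f"  without flags: {a}")
    print(f"  with flags:    {b}")
```

If the mismatching lines differ only in annotation, the gap is an evaluation artifact; if the words themselves differ, decoding has changed.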
- Average of the absolute difference is 0.0094 BLEU (about 1 BLEU
point).
- Average of the difference is 0.0051 BLEU (about 0.5 BLEU points;
results are better with alignment flags).
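For clarity, the two averages above are the mean signed difference and the mean absolute difference over translation directions, on the 0-1 BLEU scale (multiply by 100 for BLEU points). A minimal sketch of that arithmetic in Python follows. The function name is hypothetical, and the scores are placeholders, not the values from the attached table.

```python
def mean_differences(with_flags, without_flags):
    """Return (mean signed difference, mean absolute difference) over shared keys."""
    directions = sorted(set(with_flags) & set(without_flags))
    diffs = [with_flags[d] - without_flags[d] for d in directions]
    mean_signed = sum(diffs) / len(diffs)
    mean_abs = sum(abs(d) for d in diffs) / len(diffs)
    return mean_signed, mean_abs


# Placeholder scores on the 0-1 BLEU scale (NOT the numbers from the table).
with_flags = {"xx-yy": 0.25, "yy-xx": 0.20}
without_flags = {"xx-yy": 0.24, "yy-xx": 0.22}

signed, absolute = mean_differences(with_flags, without_flags)
print(f"mean difference:          {signed:+.4f} BLEU ({100 * signed:+.2f} points)")
print(f"mean absolute difference: {absolute:.4f} BLEU ({100 * absolute:.2f} points)")
```

A positive signed mean says the runs with the flags score higher on average, while the absolute mean shows how far individual directions move in either direction.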
/opt/Programs/SMT/moses/mosesdecoder/bin/moses --version
Moses code version (git tag or commit hash):
mmt-mvp-v0.12.1-2775-g65c75ff07-dirty
Libraries used:
Boost version 1.62.0
git status
On branch RELEASE-4.0
Your branch is up to date with 'origin/RELEASE-4.0'.
Note: Using alignment information to recase tokens was tried in [1] for
en-fi and en-tr, with positive results claimed. We tried this method in
all translation directions we considered and, as can be seen in the
align row, it only improves the performance for tr-en and en-tr; for
tr-en, Moses provides better translations without the alignment flags.
[1] The JHU Machine Translation Systems for WMT 2016
Shuoyang Ding, Kevin Duh, Huda Khayrallah, Philipp Koehn and Matt Post
http://www.statmt.org/wmt16/pdf/W16-2310.pdf
Best Regards,
Ergun
Ergun Biçici
http://bicici.github.com/ <http://ergunbicici.blogspot.com/>