Supplementary Material for
On the Elements of an Accurate Tree-to-String Machine Translation System
Graham Neubig, Kevin Duh
Proceedings of ACL 2014 (PDF)
This is the supplementary material for the paper "On the Elements of an Accurate Tree-to-String Machine Translation System", presented at ACL 2014. In summary, the paper showed that three elements are very important for obtaining high accuracy with tree-to-string translation systems, based on experiments on patent data for English-Japanese and Japanese-English: alignment, parsing, and search (the same three factors varied in the results table below).
Unfortunately, the data used in the paper is from NTCIR, which requires permission to use (although this permission is free), so I cannot provide the data to reproduce the experiments. I can, however, provide data from the Kyoto Free Translation Task and scripts to achieve similar results. The files and results can be found below:
All of the following results are for English-Japanese with the Travatar decoder. For reference, a phrase-based system built with Moses and using Nile achieved BLEU=22.23, RIBES=68.00.
| Align: GIZA (-)/Nile (+)      | -     | -     | -     | -     | +     | +     | +     | +     |
| Parse: Trees (-)/Forests (+)  | -     | -     | +     | +     | -     | -     | +     | +     |
| Search: CP (-)/HS (+)         | -     | +     | -     | +     | -     | +     | -     | +     |
| BLEU                          | 21.61 | 22.36 | 22.67 | 23.04 | 23.41 | 23.81 | 24.00 | 24.78 |
| RIBES                         | 71.27 | 71.59 | 71.47 | 71.43 | 72.62 | 72.60 | 73.50 | 73.64 |
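
One rough way to read the table is to average the BLEU scores over the four systems with an element switched on and subtract the average over the four systems with it switched off. The Python sketch below does this using the numbers copied from the table. The averaging scheme and the names `SETTINGS`, `BLEU`, and `average_gain` are illustrative assumptions made here; they are not taken from the paper or from the Travatar scripts.

```python
from itertools import product

# Each setting is (align, parse, search); False is the "-" option and
# True is the "+" option, in the same column order as the table.
SETTINGS = list(product([False, True], repeat=3))

# BLEU scores copied from the table, left to right.
BLEU = [21.61, 22.36, 22.67, 23.04, 23.41, 23.81, 24.00, 24.78]

FACTORS = {"align": 0, "parse": 1, "search": 2}


def average_gain(scores, factor):
    """Mean score with the factor on minus mean score with it off."""
    idx = FACTORS[factor]
    on = [s for cfg, s in zip(SETTINGS, scores) if cfg[idx]]
    off = [s for cfg, s in zip(SETTINGS, scores) if not cfg[idx]]
    return sum(on) / len(on) - sum(off) / len(off)


if __name__ == "__main__":
    for name in FACTORS:
        print(f"{name}: {average_gain(BLEU, name):+.3f} BLEU")
    # Gain of the all-"+" system over the all-"-" system.
    print(f"all three: {BLEU[-1] - BLEU[0]:+.2f} BLEU")
```

This is only a summary of the eight reported numbers. It says nothing about statistical significance, and the same function can be applied to the RIBES row by passing a different list of scores.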