Supplementary Material for
On the Elements of an Accurate Tree-to-String Machine Translation System
Graham Neubig, Kevin Duh
Proceedings of ACL 2014 (PDF)

This is the supplementary material for the paper "On the Elements of an Accurate Tree-to-String Machine Translation System" presented at ACL 2014. In summary, the paper showed that three elements are very important for obtaining high accuracy with tree-to-string translation systems, based on experiments on patent data for English-Japanese and Japanese-English: the word alignments, the parser output (trees or forests), and the search algorithm.

Unfortunately, the data used in the paper is from NTCIR, which requires permission to use (although this permission is free), so I cannot provide the data needed to reproduce the experiments. I can, however, provide data from the Kyoto Free Translation Task and scripts that achieve similar results. The files and results can be found below:

Data and (messy) Scripts

Results

All of the following results are for English-Japanese with the Travatar decoder. For reference, a phrase-based system with Moses using Nile got a score of BLEU=22.23, RIBES=68.00.

Align:  GIZA (-)/Nile (+)       -      -      -      -      +      +      +      +
Parse:  Trees (-)/Forests (+)   -      -      +      +      -      -      +      +
Search: CP (-)/HS (+)           -      +      -      +      -      +      -      +
BLEU:                           21.61  22.36  22.67  23.04  23.41  23.81  24.00  24.78
RIBES:                          71.27  71.59  71.47  71.43  72.62  72.60  73.50  73.64
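
As a rough way to read the table above, the following sketch averages the scores over all systems with a given setting at (+) and subtracts the average over the systems with that setting at (-). This averaging is only an illustration added here, not an analysis from the paper, and the names `FACTORS`, `SETTINGS`, and `main_effect` are made up for this example. The score lists are copied from the table in column order.

```python
from itertools import product

# Row labels of the table, in the order they appear.
FACTORS = ("Align", "Parse", "Search")

# All eight -/+ combinations. product() varies the last factor fastest,
# which matches the column order of the table (Align ----++++, etc.).
SETTINGS = list(product("-+", repeat=3))

# Scores copied from the table, one per column.
BLEU = [21.61, 22.36, 22.67, 23.04, 23.41, 23.81, 24.00, 24.78]
RIBES = [71.27, 71.59, 71.47, 71.43, 72.62, 72.60, 73.50, 73.64]


def main_effect(scores, factor):
    """Mean score with `factor` at (+) minus mean score with it at (-)."""
    i = FACTORS.index(factor)
    plus = [s for setting, s in zip(SETTINGS, scores) if setting[i] == "+"]
    minus = [s for setting, s in zip(SETTINGS, scores) if setting[i] == "-"]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


for factor in FACTORS:
    print(
        f"{factor}: BLEU {main_effect(BLEU, factor):+.2f}, "
        f"RIBES {main_effect(RIBES, factor):+.2f}"
    )
```

Averaging this way treats the three settings as independent, so it hides any interaction between them. The individual columns of the table remain the actual results.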