The Kyoto Free Translation Task (KFTT)

by Graham Neubig (my last name at gmail.com)

Welcome to the home of the Kyoto Free Translation Task (KFTT). The KFTT is a task for the evaluation and development of Japanese-English machine translation systems. It is designed to allow for free, simple to start, reproducible, and progressive research. Details about the task's principles, the features of the data being used, how to start, and more are listed below.

Principles
Data Details
Getting Started

Data+System Download
Data Only
Word Segmenter Only
Alignment Data

Tracks (Leader Board)
References/Citations
Version History

Principles

The Kyoto Free Translation Task was designed around four principles: research should be free, simple, reproducible, and progressive. Specifically:

Free is meant in the sense of the open source movement: it costs no money, and anyone can take part at any time. The KFTT is free in both senses, in that we provide the necessary files free of charge, and anyone can report their results and have them published on the official page at any time.
Simple means that you can get started doing research with a minimum of overhead. Modern machine translation systems are complicated, and often require that you go through a variety of steps to get them running. For the KFTT, we will try to make sure that once you have installed all necessary third party software you can prepare data and build a baseline system with only a single script.
Reproducible research means that all of your code is released to be independently confirmed by a third party. All participants in the "open" track of the KFTT will release their code to ensure that gains can be verified. However, there will also be a "closed" track for organizations that are not able to participate in the open track due to legal restrictions, etc.
Progressiveness is the very essence of research: "standing on the shoulders of giants." However, in many cases, research results are reported using baseline methods that are ten years old. In the KFTT, all code in the open track will be released, and can thus be used as a baseline to be compared with or improved upon.

Data Details

Original Data

The Kyoto Free Translation Task is a task for Japanese-English translation that focuses on Wikipedia articles related to Kyoto. The data used was originally prepared by the National Institute of Information and Communications Technology (NICT) and released as the Japanese-English Bilingual Corpus of Wikipedia's Kyoto Articles. (We are simply using the data; NICT does not specifically endorse or sponsor this task.) The features of the data are as follows:

It is Japanese->English data translated and checked by professional translators. The Japanese-English language pair is an interesting and difficult one due to the divergence between the languages. It is one of the language pairs that still needs a lot of work (as of Feb. 2011).
It is encyclopedic text in a specialized domain. As a result, there are a number of interesting and difficult phenomena that need to be tackled such as transliteration and semantic translation of concepts that exist in the source but not the target language.
It is freely distributable under the Creative Commons Attribution-Share-Alike License 3.0. The processed data used by this task is also all distributable under the same conditions.

Data Processing

To make it simpler to perform machine translation on this data, we performed a number of processing steps:

Extracting the data into a format usable by most machine translation systems.
Tokenizing the data into words. English tokenizing was performed using scripts included with the Moses toolkit, and Japanese tokenization was performed using a model for KyTea that was lightly adapted to the Kyoto domain (the model is included in the package).
Separating the data into training, tuning, development, and test sets. The training data should be used for training statistical models, the tuning data for tuning weights, the development data for testing the system during development, and the test data for reporting final results.
Cleaning the training data to remove sentences with fewer than 1 or more than 40 words.
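
The length-based cleaning in the last step can be sketched as follows. This is only an illustration, not the script shipped with the KFTT. It assumes the tokenized sentence pairs are given one per line as tab-separated "japanese<TAB>english" text, and that the 1 to 40 word limit is applied to both sides; the function name clean_pairs and the file names in the usage line are hypothetical.

```shell
# Hypothetical helper (not part of the KFTT package): keep only sentence
# pairs in which BOTH sides have between 1 and 40 whitespace-separated tokens.
# Input and output: tab-separated "japanese<TAB>english" lines.
clean_pairs() {
  awk -F'\t' -v min=1 -v max=40 '
    {
      # split() with a single-space separator counts whitespace-separated
      # tokens and returns 0 for an empty field
      nj = split($1, j, " ")
      ne = split($2, e, " ")
      if (nj >= min && nj <= max && ne >= min && ne <= max) print
    }'
}
```

For example, with hypothetical file names:

$ paste train.ja train.en | clean_pairs > train.clean.tsv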

After this processing, the size of the data is as follows (ver. 1.0):

	Articles	Sentences	Japanese Words	English Words
Train	14126	440k	12.0M	11.5M
Train (clean)	14126	330k	6.09M	5.91M
Tune	15	1235	34.4k	30.8k
Dev	15	1166	26.8k	24.3k
Test	15	1160	28.5k	26.7k

Getting Started

Data + Baseline System

To get started on the Kyoto Free Translation Task, you need to do three things: download the data, install the third-party software that makes it work, and run the script that compiles the data and trains a baseline system. The data and compilation scripts can be downloaded here:

The two current baselines are the KFTT Moses Baseline, v. 1.4, which is a standard Moses setup, and KFTT lader, v. 1.0, which uses lader for pre-ordering. Using lader will give you the best results, but is somewhat computationally intensive.

Previous versions: v. 1.3 v. 1.2 v. 1.1 v. 1.0

Please note that this data is distributed under the Creative Commons Attribution-Share-Alike License 3.0. Next, you need to download and install the third-party software that makes the system work.

GIZA++: A tool for unsupervised word alignment.
Moses: A decoder for phrase-based machine translation.
SRILM: A toolkit for building language models.
KyTea: A word segmenter (tokenizer) for Japanese (v. 0.4.0+).
lader: A pre-orderer that can help fix the reordering problems between English and Japanese.
MT helper scripts: A variety of scripts that are useful for machine translation.

You should also set two environment variables:

SCRIPTS_ROOTDIR must be set to the location of Moses support scripts (reference).
MT_SCRIPTS must be set to the directory containing the MT helper scripts.
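
As a sketch, the two variables could be set in your shell before running the baseline. The directory paths below are hypothetical placeholders; replace them with wherever you actually installed the Moses scripts and the MT helper scripts.

```shell
# Hypothetical install locations; adjust both paths to your own setup.
export SCRIPTS_ROOTDIR="$HOME/tools/moses/scripts"   # Moses support scripts
export MT_SCRIPTS="$HOME/tools/mt-scripts"           # MT helper scripts
```

Adding these lines to your shell startup file keeps them set across sessions.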

Once you have finished installing the third-party software, expand the KFTT's .tar.gz file and run process.sh. Note that if you have installed any of the third-party tools in an unusual place, you can modify process.sh to point to them appropriately. Also, training may take a significant amount of time (perhaps around 24 hours), so try to run it on a fast machine with several GB of memory if possible.

$ tar -xzf kftt-XXX.tar.gz
$ cd kftt-XXX
$ nohup ./process.sh &> process.log &

When the script has finished running, if everything has gone well, you will see a report of the final BLEU scores at the end of process.log, which should match the reported scores. If you get stuck please feel free to contact me (Graham) at any time.
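
One way to pull the score report out of the log is sketched below. It assumes that the report lines contain the string "BLEU" in some capitalization, which is a guess about the log format; the function name show_bleu is hypothetical.

```shell
# Hypothetical helper: print the last few lines of a log file that mention
# BLEU. Assumption: the score report contains "BLEU" (case-insensitive).
show_bleu() {
  log_file=${1:-process.log}          # defaults to process.log
  grep -i 'bleu' "$log_file" | tail -n 5
}
```

Running show_bleu with no arguments in the directory containing process.log prints the matching lines; if nothing is printed, inspect the end of the log directly with tail.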

Data Only

When using the Kyoto Free Translation Task for research, we recommend that you go through the steps to build the full baseline system. But if you only need the data, it can be downloaded via the link below.

Kyoto Free Translation Task (Data Only v. 1.0)

Segmenter Only

If you would just like the word segmenter used for Japanese tokenization, download KyTea (v. 0.4.0+) and use it with the KFTT segmentation model.

Alignment Data

If you would like to test the accuracy of a word alignment method on the KFTT data, or would like to train a supervised alignment system, you can use the following data. All alignments were created by two annotators then checked for consistency over 1235 sentences of the tuning set.

Kyoto Free Translation Task Japanese-English Alignment Data

Tracks (Leader Board)

The goal of this task is to create a standard way for people to compare and improve Japanese-English translation systems. To do so, we use a friendly competition format, where participants compete to improve results. There are two tracks:

For the open track, you must submit a package including any software that you used, along with instructions. I will reproduce your results and, if you approve, post them on the web site. It doesn't matter whether you have published a paper about the results or not. We strongly encourage participants to join this track if possible.
For the closed track, you do not need to make your software public, but the results must be published in a peer-reviewed forum, or your system must be available through a web API.

Results are measured by BLEU score. Starting at version 1.3, both English and Japanese results are measured on lowercased, tokenized text (previously, English results were measured on detokenized, cased text; those older results are shown in light grey).
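
As a rough sketch of the lowercasing applied before scoring, tokenized English text can be lowercased with standard tools. This is an illustration only, not necessarily the command used by process.sh; the function name lowercase_text and the file names in the usage line are hypothetical.

```shell
# Hypothetical helper: lowercase ASCII letters read from standard input.
# With LC_ALL=C, tr works byte by byte, so non-ASCII text (for example
# UTF-8 Japanese) passes through unchanged.
lowercase_text() {
  LC_ALL=C tr 'A-Z' 'a-z'
}
```

For example, with hypothetical file names:

$ lowercase_text < output.tok.en > output.lc.en

The reference translation should be lowercased in the same way before both files are passed to your BLEU scorer.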

Open

Japanese -> English
Date	System Name	Participants	Institution	dev BLEU	test BLEU	Comment
2012-9-2	KyTea/GIZA++/Moses/Lader 1.0	Graham Neubig	NAIST	16.93	19.35	KyTea/GIZA++/Moses with preordering using lader
2012-4-9	KyTea/GIZA++/Moses 1.3/1.4	Graham Neubig	NAIST	15.41	17.68	KyTea v. 0.4.1/GIZA++/Moses baseline
2012-2-4	KyTea/GIZA++/Moses 1.2	Graham Neubig	Kyoto University	9.40	10.53	KyTea v. 0.4.0/GIZA++/Moses baseline
2011-5-16	KyTea/GIZA++/Moses 1.1	Graham Neubig	Kyoto University	8.98	10.58	KyTea v. 0.3.0/GIZA++/Moses baseline

English -> Japanese
Date	System Name	Participants	Institution	dev BLEU	test BLEU	Comment
2012-9-2	KyTea/GIZA++/Moses/Lader 1.0	Graham Neubig	NAIST	21.08	23.15	KyTea/GIZA++/Moses with preordering using lader
2012-4-9	KyTea/GIZA++/Moses v. 1.3/1.4	Graham Neubig	NAIST	19.24	21.03	KyTea v. 0.4.1/GIZA++/Moses baseline
2012-2-4	KyTea/GIZA++/Moses v. 1.2	Graham Neubig	Kyoto University	19.00	20.85	KyTea v. 0.4.0/GIZA++/Moses baseline
2011-5-16	KyTea/GIZA++/Moses v. 1.1	Graham Neubig	Kyoto University	18.70	20.32	KyTea v. 0.3.0/GIZA++/Moses baseline

Closed

Japanese -> English
Date	System Name	Participants	Institution	dev BLEU	test BLEU	Comment
2011-2-18	Google Translate	Graham Neubig	Kyoto University	5.25	5.27	Google Translate results from 2011-2-18.
2011-2-18	Excite Translate	Graham Neubig	Kyoto University	3.83	4.31	Excite Translate results from 2011-2-18.

English -> Japanese
Date	System Name	Participants	Institution	dev BLEU	test BLEU	Comment
2011-2-18	Google Translate	Graham Neubig	Kyoto University	11.43	11.53	Google Translate results from 2011-2-18. Japanese results were re-segmented using KyTea for evaluation.
2011-2-18	Excite Translate	Graham Neubig	Kyoto University	6.40	7.25	Excite Translate results from 2011-2-18. Japanese results were re-segmented using KyTea for evaluation.

References/Citations

This section contains references for this task and the systems performing in it.

Task

If you would like to cite this task when writing a paper, please use the following information:

Graham Neubig, "The Kyoto Free Translation Task," http://www.phontron.com/kftt, 2011.

@misc{neubig11kftt,
	author = {Graham Neubig},
	title = {The {Kyoto} Free Translation Task},
	howpublished = {http://www.phontron.com/kftt},
	year = {2011}
}

Systems

If you perform research using this task, please contact me and I will list it here.

Version History

KyTea/GIZA++/Moses Version 1.4 (2013-5-11)

In version 1.3 some zero-length sentences were extracted from the XML file, so now only sentences with length greater than zero are extracted (thanks to Tetsuo Kiso for the patch!).

KyTea/GIZA++/Moses/Lader Version 1.0 (2012-9-2)

This version introduces the pre-ordering of Neubig et al. (2012) as implemented by lader, which significantly improves over the Moses baseline.

KyTea/GIZA++/Moses Version 1.3 (2012-4-9)

Previous versions evaluated English results using recased and detokenized text, but this led to unstable and extremely low BLEU scores. As a result, versions 1.3 and up use tokenized and uncased text.

KyTea/GIZA++/Moses Version 1.2 (2012-2-4)

Upgraded the task to work with KyTea version 0.4.0.

KyTea/GIZA++/Moses Version 1.1 (2011-5-16)

Version 1.0 didn't work with the latest version of KyTea, so this is fixed (thanks to Atsushi Fujita for pointing this out).

Version 1.0 (2011-2-18)

The initial release of the task.