Advanced Research Seminar I/III
Graduate School of Information Science
Nara Institute of Science and Technology
January/February 2014
Machine translation is a technology for automatically translating from one language to another, and it is becoming more widely used as its accuracy reaches usable levels. In this course I will introduce the basics of machine translation in a hands-on manner. At the end of each class there will be an exercise in which you try out what you learned on real data.
Prerequisites: basic probability, programming skills.
Optional Textbooks:
Schedule: Jan 28, 30 and Feb 4, 6 (9:20-10:50 am), IS Building, Room L2
Results

# | Username | LM Entropy | Method |
---|---|---|---|
1 | clusthermal | 7.476017 | Kneser-Ney Bigram |
2 | Spivak | 7.481626 | Linear Interpolation Trigram |
3 | Colorful black ideas (3p. team) | 7.562 | Modified Absolute Discounting Bigram |
4 | tmrevolutionx/delacroix/abhi | 7.710441 | Witten-Bell Bigram |
5 | yemantu | 7.733526 | Witten-Bell Bigram |
6 | s10018 | 7.747171 | Witten-Bell Bigram |
7 | gacon | 7.750392 | Witten-Bell Bigram |
8 | pongsakorn-u | 7.818588 | Linear Interpolation Bigram (adjusted weights) |
9 | 1351213 | 7.82259 | Linear Interpolation Bigram (adjusted weights) |
9 | Lenessia-Erhart-Cowen | 7.8226 | Linear Interpolation Bigram (adjusted weights) |
11 | Baseline | 7.883958 | Linear Interpolation Bigram |
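
The baseline row in the table above is a linearly interpolated bigram model scored by entropy. As a rough illustration of what that measures, the following is a minimal sketch and not the course's actual baseline code: the function names `train_bigram` and `entropy`, the interpolation weights `lam1` and `lam2`, and the unknown-word vocabulary size `vocab_size` are all assumptions chosen for the example.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Collect unigram, bigram, and context counts from whitespace-tokenized sentences."""
    uni, bi, ctx = Counter(), Counter(), Counter()
    total = 0
    for sent in sentences:
        words = ["<s>"] + sent.split() + ["</s>"]
        for prev, cur in zip(words, words[1:]):
            bi[(prev, cur)] += 1   # count of the word pair
            ctx[prev] += 1         # count of the history word
            uni[cur] += 1          # count of the predicted word
            total += 1
    return uni, bi, ctx, total

def entropy(sentences, model, lam1=0.95, lam2=0.95, vocab_size=1_000_000):
    """Per-word entropy (bits) under a linearly interpolated bigram model.

    lam1, lam2, and vocab_size are illustrative assumptions, not tuned values.
    """
    uni, bi, ctx, total = model
    log_sum, n = 0.0, 0
    for sent in sentences:
        words = ["<s>"] + sent.split() + ["</s>"]
        for prev, cur in zip(words, words[1:]):
            # Unigram probability, interpolated with a uniform unknown-word distribution.
            p_uni = lam1 * uni[cur] / total + (1 - lam1) / vocab_size
            # Maximum-likelihood bigram probability (0 if the history was never seen).
            p_ml = bi[(prev, cur)] / ctx[prev] if ctx[prev] else 0.0
            # Bigram probability, interpolated with the smoothed unigram probability.
            p = lam2 * p_ml + (1 - lam2) * p_uni
            log_sum -= math.log2(p)
            n += 1
    return log_sum / n

model = train_bigram(["a b", "a b"])
print(entropy(["a b"], model))
```

Lower entropy means the model assigns higher probability to the test data. The "adjusted weights" entries in the table correspond to changing the interpolation weights, while the Witten-Bell and Kneser-Ney entries replace the fixed weights with context-dependent smoothing.
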
Results

# | Username | F-Measure | Precision | Recall | Method |
---|---|---|---|---|---|
1 | Colorful black ideas | 0.453471 | 0.747608 | 0.325434 | Dice/Max-Score/Grow-Diag |
2 | s10018 | 0.444331 | 0.772623 | 0.311832 | Dice/Max-Score/Diag |
3 | Lenessia-Erhart-Cowen | 0.429068 | 0.870318 | 0.284717 | Dice+Absolute Discounting/Max-Score/Intersect |
4 | tmrevolutionx/delacroix/abhi | 0.425522 | 0.491179 | 0.375348 | Dice/Competitive Linking |
5 | gacon | 0.424353 | 0.70838 | 0.302903 | Dice/Competitive Linking |
6 | pongsakorn-u | 0.420217 | 0.848739 | 0.279234 | Dice/Max-Score/Diag |
7 | Spivak | 0.420003 | 0.838293 | 0.280193 | Dice/Max-Score+Threshold/Intersect |
8 | Baseline | 0.419585 | 0.857011 | 0.277796 | Dice/Max-Score/Intersect |
9 | clusthermal/1351213 | 0.414811 | 0.779329 | 0.28262 | Model 1/Max-Score/Intersect |
10 | yemantu | 0.394243 | 0.467954 | 0.340594 | Dice/Competitive Linking |
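
The F-Measure column in the table above is consistent with the harmonic mean of the Precision and Recall columns. The short check below is a sketch that recomputes it for two rows copied from the table; the function name `f_measure` is an assumption made for this example.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F-measure), copied from the table above.
rows = {
    "Colorful black ideas": (0.747608, 0.325434, 0.453471),
    "Baseline": (0.857011, 0.277796, 0.419585),
}

for name, (p, r, reported) in rows.items():
    print(name, round(f_measure(p, r), 6), reported)
```

Because the harmonic mean is dominated by the smaller of the two values, the intersection-based entries with high precision but low recall rank below the entries that trade some precision for more recall.
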
Final Results

# | Username | BLEU | Method |
---|---|---|---|
1 | clusthermal/gacon/1351213 | 0.059281 | Grow-Diag Alignment/Trigram LM/Beam Width 200/Distortion Limit 10/Parameter Tuning |
2 | s10018 | 0.056561 | Big Alignment Data/Witten-Bell LM/Phrase Length 6/Parameter Tuning/Beam Width 110/Distortion Limit 10/Forward/Backward Phrase Probabilities |
3 | Colorful black ideas | 0.055404 | 4-gram LM/Distortion Limit 12/Parameter Tuning |
4 | Lenessia-Erhart-Cowen | 0.051152 | Katz-Brown Preordering |
5 | tmrevolutionx/delacroix/abhi | 0.050997 | Grow-Diag Alignment/Translation Parameters |
6 | yemantu | 0.0501 | Distortion Limit 10 |
7 | pongsakorn-u | 0.047973 | Distortion Limit 7/LM Parameters |
8 | Baseline | 0.046933 | |
9 | Spivak | 0 | Trigram LM |