Training Options
The following options are accepted by Travatar's train-travatar.pl script. This documentation may be out of date, so check train-travatar.pl itself to confirm the current set of options.
-work_dir      The working directory to use
-travatar_dir  The directory of Travatar
-bin_dir       A directory for external bin files (mainly GIZA)
-threads       The number of threads to use
-src_file      The source file you want to train on
-src_words     A file of plain-text sentences from the source
-src_format    The source file format (penn/egret)
-trg_file      The target file you want to train on
-trg_words     A file of plain-text sentences from the target
-trg_format    The target file format (word/penn/egret)
-lex_srctrg    Lexical probabilities of the source word given the target, P(f|e)
-lex_trgsrc    Lexical probabilities of the target word given the source, P(e|f)
-align_file    A file containing alignments
-align         The type of alignment to use (giza)
-symmetrize    The type of symmetrization to use (grow)
-normalize     Normalize rule counts to probabilities
-binarize      Binarize trees in a certain direction
-compose       The number of rules to compose
-attach        Where to attach rules
-nonterm_len   The maximum number of non-terminals in a rule
-term_len      The maximum number of terminals in a rule
-nbest_rules   The maximum number of rules for each source
-tm_file       An already created TM file
-lm_file       An already created LM file
-config_file   Where to output the configuration file
-no_lm         Indicates that no LM will be used
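
As an illustration of how the options above might be combined, the following is a minimal sketch of an invocation. Every path in it (the location of train-travatar.pl, the data files, the GIZA directory, the LM file) is a hypothetical placeholder, and the choice of options is an assumption rather than a recommended configuration. By default the sketch only prints the command; set DRY_RUN=0 to actually run it.

```shell
# All paths below are hypothetical placeholders; adjust them to your setup.
TRAVATAR_DIR="${TRAVATAR_DIR:-$HOME/travatar}"
# Assumed location of the training script inside the Travatar directory.
TRAIN_SCRIPT="${TRAIN_SCRIPT:-$TRAVATAR_DIR/script/train/train-travatar.pl}"

# Collect the options as positional parameters so quoting is preserved.
set -- \
  -work_dir "$PWD/train-work" \
  -travatar_dir "$TRAVATAR_DIR" \
  -bin_dir "$HOME/giza-bin" \
  -threads 4 \
  -src_file "$PWD/data/train.src.penn" \
  -src_format penn \
  -trg_file "$PWD/data/train.trg.txt" \
  -trg_format word \
  -lm_file "$PWD/lm/target.lm"

# Show the full command line that would be executed.
CMD="$TRAIN_SCRIPT $*"
echo "$CMD"

# Only run the script when DRY_RUN is explicitly set to 0.
if [ "${DRY_RUN:-1}" = "0" ]; then
  "$TRAIN_SCRIPT" "$@"
fi
```

Here the source side is given as Penn-format trees and the target side as plain words, matching the penn and word values listed for -src_format and -trg_format. To train without a language model, -lm_file could presumably be replaced by -no_lm.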