|
Polibits 2011
Knowledge Expansion of a Statistical Machine Translation System using Morphological Resources
Keywords: machine translation, knowledge, morphological resources.
Abstract: The translation capability of a phrase-based statistical machine translation (PBSMT) system depends mostly on parallel data, and phrases that are not present in the training data are not correctly translated. This paper describes a method that efficiently expands the existing knowledge of a PBSMT system without adding more parallel data, using external morphological resources instead. A set of new phrase associations is added to the translation and reordering models; each of them corresponds to a morphological variation of the source phrase, the target phrase, or both phrases of an existing association. New associations are generated using a string similarity score based on morphosyntactic information. We tested our approach on EN-FR and FR-EN translations, and the results showed improvements in automatic scores (BLEU and METEOR) and a reduction in out-of-vocabulary (OOV) words. We believe that our knowledge expansion framework is generic and could be used to add different types of information to the model.
|