Application Research of Computers (计算机应用研究), 2013
Research on out-of-vocabulary words' recognition in Uyghur-Chinese machine translation
Abstract:
To address the large number of out-of-vocabulary (OOV) words in Uyghur-Chinese machine translation and the scarcity of Uyghur language resources, this paper combines the features of Uyghur with string similarity algorithms and presents an OOV word recognition model for Uyghur-Chinese machine translation based on string similarity. Using the phrase table of a phrase-based model together with an external dictionary, the model computes the string similarity between each OOV word and the Uyghur words in the phrase table and dictionary, selects the Uyghur word with the maximum similarity, and takes that word's translation as the translation of the OOV word. Experiments show that, compared with an OOV word recognition method based on word segmentation, this model better retains the words' information and improves translation quality.
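
The lookup step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which string similarity algorithm the paper uses, so this sketch assumes normalized Levenshtein (edit-distance) similarity. The function names, the `threshold` parameter, and the toy lexicon (Latin-transliterated entries standing in for the merged phrase table and external dictionary) are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means the strings are identical."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def translate_oov(oov: str,
                  lexicon: Dict[str, str],
                  threshold: float = 0.6) -> Optional[Tuple[str, str, float]]:
    """Return (matched source word, its translation, score) for the most
    similar lexicon entry, or None if no entry reaches the threshold.

    `threshold` is an assumed cutoff for this sketch, not a value from the paper.
    """
    best: Optional[Tuple[str, str, float]] = None
    for src, tgt in lexicon.items():
        score = similarity(oov, src)
        if best is None or score > best[2]:
            best = (src, tgt, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


# Toy lexicon standing in for phrase table + external dictionary entries
# (hypothetical, Latin-transliterated examples).
lexicon = {"kitab": "书", "mektep": "学校"}

match = translate_oov("kitabni", lexicon)  # inflected form absent from the lexicon
if match is not None:
    print(match[0], match[1], round(match[2], 3))
```

Here the inflected form `kitabni` is matched to the stem `kitab` because that entry has the highest similarity, so the whole word is kept as a unit rather than being split into morphemes first. A real system would scan a much larger vocabulary and would probably need an index or pruning step rather than a linear pass.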