Unknown Word Detection via Syntax Analyzer

DOI: 10.11591/ij-ai.v2i3.1802

Abstract:

A knowledge resource is the central repository of data for Natural Language Processing (NLP) applications, and the development of such applications depends largely on the coverage of these resources. The multipurpose Myanmar Language Lexico-conceptual Knowledge Resource (ML2KR) and a Myanmar function-tagged corpus were developed as initial resources using a semiautomatic approach. ML2KR consists of Myanmar WordNet, a Myanmar-English bilingual computational lexicon, and a morphological processor. Myanmar is a morphologically rich, agglutinative language, so Myanmar text usually has to be segmented before further processing. Segmentation faces two main problems: word ambiguity, where a word has more than one meaning, and unknown word occurrence, where a word is not in the lexicon. This paper addresses the unknown word problem. To detect new, unrestricted character patterns of words, a character-based parsing syntax analyzer is built using a Context Free Grammar (CFG). First, unknown words are treated as names by Named Entity Recognition with a forward and backward rule-based approach. If a name does not agree with the syntax analyzer, all possible unknown words are verified in order to update the lexicon and Myanmar WordNet.
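
The abstract does not give the grammar or the lexicon, so the following is only a minimal sketch of the general idea, under stated assumptions: a greedy longest-match segmenter flags spans that are missing from a lexicon, and a small character-level CFG recognizer checks whether each flagged span is a well-formed character pattern. The Latin-letter toy lexicon, the `GRAMMAR` rules, and the helper names `segment`, `ends`, and `accepts` are all hypothetical stand-ins, not the paper's resources, and the Named Entity Recognition step is omitted.

```python
# Toy character-level CFG (hypothetical, not the paper's grammar):
#   Word -> Syl Word | Syl
#   Syl  -> C V C | C V
# "C" and "V" are terminal character classes (consonant, vowel).
GRAMMAR = {
    "Word": [["Syl", "Word"], ["Syl"]],
    "Syl": [["C", "V", "C"], ["C", "V"]],
}


def classify(ch):
    """Map a character to its terminal class (toy Latin-letter rule)."""
    return "V" if ch in "aeiou" else "C"


def ends(symbol, tags, start):
    """Return every position where `symbol` can end if it starts at `start`."""
    if symbol not in GRAMMAR:  # terminal: must match the tag at `start`
        if start < len(tags) and tags[start] == symbol:
            return {start + 1}
        return set()
    result = set()
    for production in GRAMMAR[symbol]:
        positions = {start}
        for part in production:
            positions = {e for p in positions for e in ends(part, tags, p)}
        result |= positions
    return result


def accepts(word):
    """True if the whole word is derivable from the start symbol `Word`."""
    tags = [classify(ch) for ch in word]
    return len(tags) in ends("Word", tags, 0)


def segment(text, lexicon):
    """Greedy longest-match segmentation.

    Returns (token, known) pairs; consecutive characters that match no
    lexicon entry are grouped into a single unknown span.
    """
    max_len = max(map(len, lexicon))
    tokens, unknown, i = [], "", 0
    while i < len(text):
        match = None
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                match = text[i:j]
                break
        if match is None:
            unknown += text[i]
            i += 1
            continue
        if unknown:
            tokens.append((unknown, False))
            unknown = ""
        tokens.append((match, True))
        i += len(match)
    if unknown:
        tokens.append((unknown, False))
    return tokens


lexicon = {"kata", "mi"}  # hypothetical lexicon entries
tokens = segment("katabonmi", lexicon)
# Unknown spans that the grammar accepts become candidates for a lexicon update.
candidates = [tok for tok, known in tokens if not known and accepts(tok)]
```

In this sketch, `candidates` holds the unknown spans that pass the grammar check; a real system would use Myanmar character classes and a far larger grammar, and would verify the candidates before adding them to the lexicon.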
