|
An empirical comparative study of medical entity recognition
Une étude comparative empirique sur la reconnaissance des entités médicales
Keywords: Medical Entity Recognition, Information Extraction, Machine Learning
Abstract: Several research efforts have tackled medical entity recognition from text. However, to our knowledge, no comparative study exists of the following two approaches: (i) extracting noun phrases in an independent step before the final categorization step, and (ii) identifying entity boundaries and categories simultaneously. In this paper, we focus on these approaches by experimenting with different methods based on rules and/or machine learning techniques. We compare their performance and evaluate their scalability on two standard medical corpora. The results confirm that machine learning methods are more robust than rule-based ones, provided that a sufficient number of examples is available. They also point out the lack of scalability of such methods across corpora of different genres. Hybrid methods combining statistical and semantic techniques make it possible to improve the performance obtained by machine learning
|