


Text Document Classification: An Approach Based on Indexing

Keywords: Text documents, Representation, Term sequence, Status Matrix, B-Tree, Classification


Abstract:

In this paper we propose a new method of classifying text documents. Unlike conventional vector space models, the proposed method preserves the sequence of term occurrence in a document. The term sequence is preserved with the help of a novel data structure called the 'Status Matrix', and a corresponding classification technique is proposed for efficient classification of text documents. In addition, to avoid sequential matching during classification, we propose to index the terms in a B-tree, an efficient indexing scheme. Each term in the B-tree is associated with a list of class labels of those documents that contain the term. A corresponding classification technique is proposed for this index as well. To corroborate the efficacy of the proposed representation and the status-matrix-based classification, we have conducted extensive experiments on various datasets.
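
The term index described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's implementation: a sorted list searched with `bisect` stands in for the B-tree, the names `TermIndex`, `add_document`, `lookup`, and `classify` are hypothetical, and the simple label voting replaces the status-matrix-based classification, whose details the abstract does not give. The sketch also ignores term order.

```python
from bisect import bisect_left
from collections import Counter


class TermIndex:
    """Sorted term index mapping each term to a set of class labels.

    Assumption: a sorted list with binary search stands in for the B-tree.
    """

    def __init__(self):
        self._terms = []   # distinct terms, kept in sorted order
        self._labels = []  # _labels[i] holds the class labels for _terms[i]

    def add_document(self, terms, label):
        """Record that a training document with this label contains these terms."""
        for term in terms:
            i = bisect_left(self._terms, term)
            if i == len(self._terms) or self._terms[i] != term:
                self._terms.insert(i, term)
                self._labels.insert(i, set())
            self._labels[i].add(label)

    def lookup(self, term):
        """Return the class labels associated with a term (empty set if absent)."""
        i = bisect_left(self._terms, term)
        if i < len(self._terms) and self._terms[i] == term:
            return self._labels[i]
        return set()


def classify(index, terms):
    """Assumed voting rule: each query term votes once for every label it maps to."""
    votes = Counter()
    for term in terms:
        votes.update(index.lookup(term))
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy data, not taken from the paper's experiments.
index = TermIndex()
index.add_document(["stock", "market", "price"], "finance")
index.add_document(["match", "goal", "team"], "sports")
predicted = classify(index, ["goal", "team", "today"])
```

Looking up a term by binary search avoids scanning every training document, which is the role the abstract assigns to the B-tree. A real B-tree would additionally keep insertions logarithmic and suit disk-resident indexes.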

