Acta Electronica Sinica, 2015

On the Relationship Between the Support of a Feature and Its Classification Ability

DOI: 10.3969/j.issn.0372-2112.2015.02.007, PP. 248-254

Keywords: frequent pattern, classification, feature selection, information gain


Abstract:

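Frequent pattern mining is widely used in classification, and a large body of work applies it to feature selection for classification problems. However, why frequent pattern mining yields effective feature selection has not been studied systematically. To provide a theoretical basis for this use, the relationship between a feature's support and its classification ability must be established. This paper takes the information gain of a feature as the measure of its classification ability and examines how it relates to the feature's support. We first prove that information gain is a concave function of feature support. We then prove, for both the two-class and the multi-class cases, that features with low or high support have bounded information gain, that is, limited classification ability. Finally, simulation experiments verify the relationship between support and information gain, providing a theoretical basis for applying frequent pattern mining to classification.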
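The two-class claim can be illustrated numerically. The sketch below is not taken from the paper: it assumes a binary feature X and a binary class C, and the symbols `theta` (support, P(X=1)), `p` (class prior, P(C=1)) and `q` (P(C=1 | X=1)) are names introduced here for illustration. It also assumes that, for fixed `theta` and `p`, the largest information gain is reached at an endpoint of the feasible range of `q`, which follows from the standard convexity of mutual information in the conditional distribution and may differ from the authors' own derivation.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def information_gain(theta, p, q):
    """IG(C; X) = H(C) - H(C | X) for binary X and binary C.

    theta: support P(X=1); p: class prior P(C=1); q: P(C=1 | X=1).
    All three symbols are illustrative assumptions, not the paper's notation.
    """
    if theta <= 0.0 or theta >= 1.0:
        return 0.0  # a constant feature carries no information
    # P(C=1 | X=0) is forced by the constraint p = theta*q + (1-theta)*r.
    r = (p - theta * q) / (1.0 - theta)
    r = min(1.0, max(0.0, r))  # guard against floating-point rounding
    return (binary_entropy(p)
            - theta * binary_entropy(q)
            - (1.0 - theta) * binary_entropy(r))


def max_information_gain(theta, p):
    """Largest IG attainable at support theta, over all feasible q.

    Assumes the maximum lies at an endpoint of the feasible interval of q.
    """
    if theta <= 0.0 or theta >= 1.0:
        return 0.0
    q_lo = max(0.0, (p - (1.0 - theta)) / theta)
    q_hi = min(1.0, p / theta)
    return max(information_gain(theta, p, q_lo),
               information_gain(theta, p, q_hi))


if __name__ == "__main__":
    # Print the attainable information gain across a range of supports.
    for theta in (0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 0.99):
        print(theta, round(max_information_gain(theta, 0.5), 4))
```

With a balanced class prior, the attainable information gain is largest when the support is near the class prior and shrinks toward zero as the support approaches 0 or 1. It also never exceeds the entropy of the feature itself, `binary_entropy(theta)`, which is one simple way to see why very rare and very common features cannot be highly discriminative.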

