2018

Social Media Data Extraction Based on Knowledge Graphs and the LDA Model

DOI: 10.3969/j.issn.1000-5641.2018.05.016

Keywords: social media data, data extraction, Latent Dirichlet Allocation (LDA), knowledge graph


Abstract:

Social media data extraction underpins research and applications in public opinion analysis, news dissemination, corporate brand promotion, and commercial marketing, and accurate extraction results are essential to the validity of the subsequent data analysis. Addressing the unstructured, multi-topic nature of social media data, this paper uses the LDA (Latent Dirichlet Allocation) model to mine the latent topics in the data, then combines the data's feature-word sequences with the entities and inter-entity relationships described by a knowledge graph to extract data for a specific domain. Experiments on news data from Toutiao ("今日头条", rendered "Headline Today" in the original English abstract) and on Sina Weibo data show that the proposed method extracts social media data effectively.
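The abstract gives no implementation details, so the following is only a minimal sketch of how an LDA-plus-knowledge-graph extraction step might look, not the authors' method. It assumes pre-tokenized documents, a small collapsed Gibbs sampler standing in for whatever LDA implementation the paper used, and a toy knowledge graph stored as a dictionary. The function names (`lda_gibbs`, `domain_terms`, `extract`), the relevance rule, the threshold, and the sample data are all hypothetical.

```python
import random


def lda_gibbs(docs, n_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (theta, phi, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    wid = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    ndk = [[0] * n_topics for _ in docs]            # doc-topic counts
    nkw = [[0] * n_words for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                             # tokens per topic
    z = []                                          # topic of every token
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = int(rng.random() * n_topics)        # random initial topic
            z[d].append(k)
            ndk[d][k] += 1
            nkw[k][wid[w]] += 1
            nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = wid[w], z[d][i]
                # Remove the token, then resample its topic from the
                # conditional distribution given all other assignments.
                ndk[d][k] -= 1
                nkw[k][v] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][v] + beta) / (nk[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                r = rng.random() * sum(weights)
                k = 0
                while k < n_topics - 1 and r >= weights[k]:
                    r -= weights[k]
                    k += 1
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][v] += 1
                nk[k] += 1
    theta = [
        [(ndk[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(nkw[t][v] + beta) / (nk[t] + n_words * beta) for v in range(n_words)]
        for t in range(n_topics)
    ]
    return theta, phi, vocab


def domain_terms(kg, seed_entity):
    """Seed entity plus the entities directly related to it in the toy graph."""
    return {seed_entity} | set(kg.get(seed_entity, ()))


def extract(docs, kg, seed_entity, n_topics=2, threshold=0.5, **lda_kwargs):
    """Return indices of documents dominated by the most domain-relevant topic."""
    theta, phi, vocab = lda_gibbs(docs, n_topics, **lda_kwargs)
    terms = domain_terms(kg, seed_entity)
    # Assumed relevance rule: probability mass a topic puts on domain terms.
    relevance = [sum(p for p, w in zip(row, vocab) if w in terms) for row in phi]
    best = max(range(n_topics), key=relevance.__getitem__)
    return [d for d, row in enumerate(theta) if row[best] >= threshold]


# Hypothetical toy data: two documents about phones, two about football.
docs = [
    ["phone", "battery", "screen", "phone", "camera"],
    ["camera", "screen", "phone", "battery"],
    ["match", "goal", "team", "coach", "goal"],
    ["team", "coach", "match", "goal"],
]
kg = {"phone": ["battery", "screen", "camera"]}

print(extract(docs, kg, "phone"))
```

The sampler is seeded, so repeated runs give the same assignments, but which documents pass the threshold still depends on the corpus and hyperparameters. In this sketch the knowledge graph contributes only one hop of neighbors; how the paper actually uses entity relationships and feature-word sequences is not stated in the abstract.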
