

2019

Proportional Classification Revisited: Automatic Content Analysis of Political Manifestos Using Active Learning

DOI: 10.1177/0894439318758389

Keywords: content analysis, active learning, proportional classification, text classification, text as data, supervised machine learning, computer-assisted content analysis, computational social science, big data


Abstract:

Supervised machine learning is a promising methodological innovation for content analysis (CA) that addresses the challenge of ever-growing amounts of text in the digital era. Social scientists have pointed to accurate measurement of category proportions and trends in large collections as their primary goal. Proportional classification, for example, allows for time-series analysis of diachronic data sets or correlation of categories with text-external covariates. We evaluate the performance of two common approaches for this goal: a method based on regression analysis with feature profiles from entire collections and a method aggregating classifier decisions for individual documents. For both, we observed a significant negative effect on classification performance due to the uneven distribution of characteristic language structures within the text collection, which poses considerable problems for proportional classification. To address this, we propose an active learning workflow that alternates between machine learning and human coding. Results from experiments with empirical data (political manifestos) demonstrate that active learning enables researchers to create training sets for automatic CA efficiently, reliably, and with high accuracy for the desired goal while retaining control over the automatic process.
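
As a rough illustration of two ideas named in the abstract, the sketch below aggregates per-document classifier decisions into category proportions and then picks the documents to send to human coders in the next active learning round. It is not the authors' implementation. The selection rule (uncertainty sampling on the top class probability), the function names `category_proportions` and `select_for_coding`, the category labels, and the toy probabilities are all assumptions made for the example.

```python
from collections import Counter


def category_proportions(predicted_labels):
    """Aggregate per-document decisions into category shares."""
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    return {label: n / total for label, n in counts.items()}


def select_for_coding(probabilities, batch_size):
    """Return indices of the documents the classifier is least sure about.

    Assumed strategy: uncertainty sampling, ranking documents by the
    probability of their most likely category (lowest first).
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: max(probabilities[i].values()),
    )
    return ranked[:batch_size]


# Hypothetical classifier output for five manifesto sentences.
probs = [
    {"economy": 0.90, "welfare": 0.10},
    {"economy": 0.55, "welfare": 0.45},
    {"economy": 0.20, "welfare": 0.80},
    {"economy": 0.52, "welfare": 0.48},
    {"economy": 0.70, "welfare": 0.30},
]

# Each document gets its most probable category.
labels = [max(p, key=p.get) for p in probs]

shares = category_proportions(labels)
to_code = select_for_coding(probs, batch_size=2)

print(shares)   # estimated category proportions
print(to_code)  # indices to hand to human coders next
```

In a full workflow, the human-coded documents would be added to the training set, the classifier retrained, and the two steps repeated until the estimated proportions stabilize.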
