%0 Journal Article
%T Domain Adaptation for Opinion Classification: A Self-Training Approach
%A Ning Yu
%J Journal of Information Science Theory and Practice
%D 2013
%I Korea Institute of Science and Technology Information
%R 10.1633/jistap.2013.1.1.1
%X Domain transfer is a widely recognized problem for machine learning algorithms because models built upon one data domain generally do not perform well in another data domain. This is especially a challenge for tasks such as opinion classification, which often has to deal with insufficient quantities of labeled data. This study investigates the feasibility of self-training in dealing with the domain transfer problem in opinion classification via leveraging labeled data in non-target data domain(s) and unlabeled data in the target domain. Specifically, self-training is evaluated for effectiveness in sparse data situations and feasibility for domain adaptation in opinion classification. Three types of Web content are tested: edited news articles, semi-structured movie reviews, and the informal and unstructured content of the blogosphere. Findings of this study suggest that, when there are limited labeled data, self-training is a promising approach for opinion classification, although the contributions vary across data domains. Significant improvement was demonstrated for the most challenging data domain, the blogosphere, when a domain transfer-based self-training strategy was implemented.
%K Domain adaptation
%K Opinion classification
%K Self-training
%K Semi-supervised learning
%K Sentiment analysis
%K Machine learning
%U http://dx.doi.org/10.1633/JISTaP.2013.1.1.1