Objectivity, Reliability, and Validity of Search Engine Count Estimates

Keywords: data quality, goodness criteria, Web mining, search engines, search engine counts


Abstract:

Count estimates ("hits") provided by Web search engines have received much attention as a yardstick for measuring phenomena as diverse as language statistics, the popularity of authors, or the similarity between words. Common to these activities is the intention to use Web search engines not only for search but also for ad hoc measurement. Using search engine count estimates (SECEs) in this way means that a phenomenon of interest, e.g., the popularity of an author, is conceived of as a measurand, and SECEs are taken to be its quantitative measures. However, the data quality of SECEs has not yet been studied systematically, and concerns have been raised about the use of this kind of data. This article examines the data quality of SECEs, focusing on the classical goodness criteria: objectivity, reliability, and validity. The results of a series of studies indicate that, with the exception of Boolean queries that use disjunction or negation, the objectivity, test-retest reliability, and parallel-test reliability of SECEs are good for most types of browsers and search engines examined. Estimating validity required model development (all-subsets regression), which yielded satisfactory results using an exploratory approach to feature selection. The findings are discussed in the light of previous objections, and perspectives for using Web search count estimates are delineated.
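
As an illustration of what test-retest reliability means for count data, the following is a minimal sketch that correlates two rounds of counts collected for the same queries at different times. The abstract does not state how reliability was computed, so the Pearson correlation on log-transformed counts, the function names, and the sample numbers are all assumptions made for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def retest_reliability(first_counts, second_counts):
    """Correlate two measurement rounds of the same queries.

    Counts are log-transformed (log10(count + 1)) because hit counts
    span many orders of magnitude; this choice is an assumption.
    """
    log_first = [math.log10(c + 1) for c in first_counts]
    log_second = [math.log10(c + 1) for c in second_counts]
    return pearson(log_first, log_second)


if __name__ == "__main__":
    # Hypothetical counts for five queries, retrieved on two occasions.
    round_1 = [120, 4500, 98000, 310, 2700000]
    round_2 = [130, 4400, 101000, 290, 2650000]
    print(round(retest_reliability(round_1, round_2), 4))
```

A value close to 1 would indicate that repeated retrievals rank and scale the queries consistently; parallel-test reliability could be sketched the same way by passing counts for two equivalent formulations of each query.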
