Journal of Probability and Statistics 2013

A Survey Design for a Sensitive Binary Variable Correlated with Another Nonsensitive Binary Variable

DOI: 10.1155/2013/827048

Jun-Wu Yu, Yang Lu, Guo-Liang Tian


Abstract:

Tian et al. (2007) introduced a so-called hidden sensitivity model for evaluating the association of two sensitive questions with binary outcomes. In practice, however, we sometimes need to assess the association between one sensitive binary variable (e.g., being a drug user or not, the number of sex partners being ≤1 or >1, and so on) and one nonsensitive binary variable (e.g., good or poor health status, with or without cervical cancer, and so on). To address this issue, in this paper we make full use of the information contained in the nonsensitive binary variable and propose a new survey scheme, called the combination questionnaire design/model, which consists of a main questionnaire and a supplemental questionnaire. The supplemental questionnaire, which is in fact a direct-questioning design, can effectively reduce noncompliance behavior, since more respondents will not be faced with the sensitive question. Likelihood-based inferences are derived, including maximum likelihood estimates via the expectation-maximization algorithm, asymptotic confidence intervals, and bootstrap confidence intervals of the parameters of interest. A likelihood ratio test is provided to test the association between the two binary random variables. Bayesian inferences are also discussed. Simulation studies are performed, and a cervical cancer data set in Atlanta is used to illustrate the proposed methods.

1. Introduction

Warner [1] introduced a randomized response technique to obtain truthful answers to questions with sensitive attributes. Using the Warner design, Kraemer [2] derived a bivariate correlation between an attribute with polytomous responses and an attribute with normally distributed responses.
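As background for the Warner design mentioned above, the following is a minimal sketch of the classical estimator for a single sensitive proportion, not the combination questionnaire model proposed in this paper. It assumes the standard formulation in which each respondent is privately shown the statement "I have the trait" with known probability p (and its negation otherwise) and answers truthfully, so that P(yes) = pπ + (1 − p)(1 − π). The function names `warner_estimate` and `simulate_warner` are illustrative, not taken from the paper.

```python
import random


def warner_estimate(yes_count, n, p):
    """Moment estimator of the sensitive proportion under Warner's design.

    p is the (known) probability that the randomization device selects
    the sensitive statement; p must differ from 0.5 for identifiability.
    Returns the point estimate and a plug-in variance estimate.
    """
    if not 0.0 < p < 1.0 or p == 0.5:
        raise ValueError("p must lie in (0, 1) and differ from 0.5")
    lam_hat = yes_count / n  # observed proportion of "yes" answers
    pi_hat = (lam_hat - (1.0 - p)) / (2.0 * p - 1.0)
    var_hat = lam_hat * (1.0 - lam_hat) / (n * (2.0 * p - 1.0) ** 2)
    return pi_hat, var_hat


def simulate_warner(n, pi, p, seed=0):
    """Simulate truthful answers under Warner's design; return the yes count."""
    rng = random.Random(seed)
    yes = 0
    for _ in range(n):
        has_trait = rng.random() < pi
        shown_sensitive_statement = rng.random() < p
        # A truthful respondent says "yes" when the shown statement is true.
        if has_trait == shown_sensitive_statement:
            yes += 1
    return yes


if __name__ == "__main__":
    n, true_pi, p = 20000, 0.2, 0.7
    yes = simulate_warner(n, true_pi, p, seed=1)
    pi_hat, var_hat = warner_estimate(yes, n, p)
    print(f"estimate = {pi_hat:.4f}, standard error = {var_hat ** 0.5:.4f}")
```

The estimate can fall outside [0, 1] in small samples, and its variance grows quickly as p approaches 0.5, which illustrates the efficiency cost of the privacy protection.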
Fox and Tracy [3] derived an estimator of the Pearson product-moment correlation coefficient between two sensitive questions by assuming that randomized response observations can be treated as individual-level scores contaminated by random measurement error. Edgell et al. [4] considered the correlation between two sensitive questions using the unrelated question design or the additive constants design. Christofides [5] presented a randomized response technique with two randomization devices to estimate the proportion of individuals having two sensitive characteristics at the same time. Kim and Warde [6] considered a multinomial randomized response model that can handle untruthful responses. They also derived the Pearson product-moment correlation estimator, which may be used to quantify the linear relationship between two variables when

References

[1]	S. L. Warner, “Randomized response: a survey technique for eliminating evasive answer bias,” Journal of the American Statistical Association, vol. 60, no. 309, pp. 63–69, 1965.
[2]	H. C. Kraemer, “Estimation and testing of bivariate association using data generated by the randomized response technique,” Psychological Bulletin, vol. 87, no. 2, pp. 304–308, 1980.
[3]	J. A. Fox and P. E. Tracy, “Measuring associations with randomized response,” Social Science Research, vol. 13, no. 2, pp. 188–197, 1984.
[4]	S. E. Edgell, S. Himmelfarb, and D. J. Cira, “Statistical efficiency of using two quantitative randomized response techniques to estimate correlation,” Psychological Bulletin, vol. 100, no. 2, pp. 251–256, 1986.
[5]	T. C. Christofides, “Randomized response technique for two sensitive characteristics at the same time,” Metrika, vol. 62, no. 1, pp. 53–63, 2005.
[6]	J. M. Kim and W. D. Warde, “Some new results on the multinomial randomized response model,” Communications in Statistics: Theory and Methods, vol. 34, no. 4, pp. 847–856, 2005.
[7]	G. J. L. M. Lensvelt-Mulders, J. J. Hox, P. G. M. van der Heijden, and C. J. M. Maas, “Meta-analysis of randomized response research: thirty-five years of validation,” Sociological Methods & Research, vol. 33, no. 3, pp. 319–348, 2005.
[8]	G. L. Tian, J. W. Yu, M. L. Tang, and Z. Geng, “A new non-randomized model for analysing sensitive questions with binary outcomes,” Statistics in Medicine, vol. 26, no. 23, pp. 4238–4252, 2007.
[9]	A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society B, vol. 39, no. 1, pp. 1–38, 1977.
[10]	G. L. Tian, K. W. Ng, and Z. Geng, “Bayesian computation for contingency tables with incomplete cell-counts,” Statistica Sinica, vol. 13, no. 1, pp. 189–206, 2003.
[11]	K. W. Ng, M. L. Tang, M. Tan, and G. L. Tian, “Grouped Dirichlet distribution: a new tool for incomplete categorical data analysis,” Journal of Multivariate Analysis, vol. 99, no. 3, pp. 490–509, 2008.
[12]	D. R. Cox and D. Oakes, Analysis of Survival Data, Monographs on Statistics and Applied Probability, Chapman & Hall, London, UK, 1984.
[13]	M. A. Tanner and W. H. Wong, “The calculation of posterior distributions by data augmentation,” Journal of the American Statistical Association, vol. 82, no. 398, pp. 528–550, 1987.
[14]	M. A. Tanner, Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood Functions, Springer Series in Statistics, Springer, New York, NY, USA, 3rd edition, 1996.
[15]	B. Efron and R. J. Tibshirani, An Introduction to the Bootstrap, vol. 57 of Monographs on Statistics and Applied Probability, Chapman & Hall, New York, NY, USA, 1993.
[16]	A. Agresti, Categorical Data Analysis, Wiley Series in Probability and Statistics, Wiley-Interscience, New York, NY, USA, 2nd edition, 2002.
[17]	G. D. Williamson and M. Haber, “Models for three-dimensional contingency tables with completely and partially cross-classified data,” Biometrics, vol. 50, no. 1, pp. 194–203, 1994.
[18]	J. W. Yu, G. L. Tian, and M. L. Tang, “Two new models for survey sampling with sensitive characteristic: design and analysis,” Metrika, vol. 67, no. 3, pp. 251–263, 2008.
[19]	M. L. Tang, G. L. Tian, N. S. Tang, and Z. Q. Liu, “A new non-randomized multi-category response model for surveys with a single sensitive question: design and analysis,” Journal of the Korean Statistical Society, vol. 38, no. 4, pp. 339–349, 2009.
[20]	Y. Liu and G. L. Tian, “Multi-category parallel models in the design of surveys with sensitive questions,” Statistics and Its Interface, vol. 6, no. 1, pp. 137–149, 2013.
[21]	G. L. Tian, “A new non-randomized response model: the parallel model,” Statistica Neerlandica. In press.
[22]	K. W. Ng, G. L. Tian, and M. L. Tang, Dirichlet and Related Distributions: Theory, Methods and Applications, Wiley Series in Probability and Statistics, John Wiley & Sons, Chichester, UK, 2011.
