%0 Journal Article
%T Robust Medical Test Evaluation Using Flexible Bayesian Semiparametric Regression Models
%A Adam J. Branscum
%A Wesley O. Johnson
%A Andre T. Baron
%J Epidemiology Research International
%D 2013
%I Hindawi Publishing Corporation
%R 10.1155/2013/131232
%X The application of Bayesian methods is increasing in modern epidemiology. Although parametric Bayesian analysis has penetrated the population health sciences, flexible nonparametric Bayesian methods have received less attention. A goal in nonparametric Bayesian analysis is to estimate unknown functions (e.g., density or distribution functions) rather than scalar parameters (e.g., means or proportions). For instance, ROC curves are obtained from the distribution functions corresponding to continuous biomarker data taken from healthy and diseased populations. Standard parametric approaches to Bayesian analysis involve distributions with a small number of parameters, where the prior specification is relatively straightforward. In the nonparametric Bayesian case, the prior is placed on an infinite-dimensional space of all distributions, which requires special methods. A popular approach to nonparametric Bayesian analysis that involves Polya tree prior distributions is described. We provide example code to illustrate how models that contain Polya tree priors can be fit using SAS software. The methods are used to evaluate the covariate-specific accuracy of the biomarker, soluble epidermal growth factor receptor, for discerning lung cancer cases from controls using a flexible ROC regression modeling framework. The application highlights the usefulness of flexible models over a standard parametric method for estimating ROC curves.
1. Introduction
Bayesian analysis is often used in support of epidemiologic research [1–7]. A contemporary area of research in the population health sciences involves the development and application of statistical methods for evaluating the accuracy of medical tests. With binary outcome data, statistical methods focus on estimating sensitivity and specificity, while with quantitative data, the standard objects of interest are the receiver operating characteristic (ROC) curve and the area under the curve (AUC).
The ROC curve can be regarded as a graphical portrayal of the degree of separation between the distributions of test outcomes for "diseased" and nondiseased populations. The formula for an ROC curve depends on the sensitivity and specificity of the test at each classification threshold. We proceed under the innocuous assumption that test outcomes tend to be higher for individuals in the diseased population. Consider a continuous test outcome for a disease, where disease status is labeled as either disease absent or disease present. In general, the test outcome can be any continuously measured classifier that varies according to a cumulative
%U http://www.hindawi.com/journals/eri/2013/131232/