%0 Journal Article %T Feature Selection in Predictive Modeling: A Systematic Study on Drug Response Heterogeneity for Type II Diabetic Patients %A Fei Wang %A James Flory %A Jingyuan Chou %J AMIA Summits on Translational Science Proceedings %D 2019 %X With the rapid development of computer hardware and software technologies, more and more electronic health data from insurance claims, clinical trials and hospitals are becoming readily available. These data provide a rich resource for developing various healthcare analytics algorithms, among which predictive modeling is of key importance in many real health problems. One important issue for data-driven predictive modeling is high dimensionality, and feature selection is one effective strategy to reduce the number of independent variables and control the confounding factors. However, most existing studies pick a single feature selection approach without comprehensive investigation. In this paper, we investigate the issue of drug response heterogeneity for type II diabetes mellitus (T2DM) patients using large-scale clinical trial data. Our goal is to identify the important factors that may lead to response heterogeneity for three popular T2DM drugs: Metformin, Rosiglitazone and Glimepiride. We implemented 8 different feature selection approaches and compared their performance using various measures, including prediction error and the consistency of the identified important factors. Finally, we ensembled the factor lists selected by the different algorithms to obtain a final set of factors that contribute to drug response heterogeneity, and verified them against the existing literature. %U https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6568100/