%0 Journal Article
%T Modelling Key Population Attrition in the HIV and AIDS Programme in Kenya Using Random Survival Forests with Synthetic Minority Oversampling Technique-Nominal Continuous
%A Evan Kahacho
%A Charity Wamwea
%A Bonface Malenje
%A Gordon Aomo
%J Journal of Data Analysis and Information Processing
%P 11-36
%@ 2327-7203
%D 2023
%I Scientific Research Publishing
%R 10.4236/jdaip.2023.111002
%X HIV and AIDS has continued to be a major public health concern, and hence one of the epidemics that the world resolved to end by 2030 as highlighted in sustainable development goals (SDGs). A colossal amount of effort has been taken to reduce new HIV infections, but there are still a significant number of new infections reported. HIV prevalence is more skewed towards the key population who include female sex workers (FSW), men who have sex with men (MSM), and people who inject drugs (PWID). The study design was retrospective and focused on key population enrolled in a comprehensive HIV and AIDS programme by the Kenya Red Cross Society from July 2019 to June 2021. Individuals who were either lost to follow up, defaulted (dropped out, transferred out, or relocated) or died were classified as attrition; while those who were active and alive by the end of the study were classified as retention. The study used density analysis to determine the spatial differences of key population attrition in the 19 targeted counties, and used Kilifi county as an example to map attrition cases in smaller administrative areas (sub-county level). The study used synthetic minority oversampling technique-nominal continuous (SMOTE-NC) to balance the datasets since the cases of attrition were much less than retention. The random survival forests model was then fitted to the balanced dataset. The model correctly identified attrition cases using the predicted ensemble mortality and their survival time using the estimated Kaplan-Meier survival function. The predictive performance of the model was strong and way better than random chance with concordance indices greater than 0.75.
%K Random Survival Forests
%K Synthetic Minority Oversampling Technique-Nominal Continuous (SMOTE-NC)
%K Key Population
%K Female Sex Workers (FSW)
%K Men Who Have Sex with Men (MSM)
%K People Who Inject Drugs (PWID)
%U http://www.scirp.org/journal/PaperInformation.aspx?PaperID=122693