Modelling Key Population Attrition in the HIV and AIDS Programme in Kenya Using Random Survival Forests with Synthetic Minority Oversampling Technique-Nominal Continuous
HIV and AIDS remains a major public
health concern and is one of the epidemics that the world resolved to end
by 2030 under the Sustainable Development Goals (SDGs). Considerable
effort has gone into reducing new HIV infections, yet a significant
number of new infections are still
reported. HIV prevalence is skewed towards key populations, which
include female sex workers (FSW), men who have sex with men (MSM), and people
who inject drugs (PWID). This retrospective study focused on key
populations enrolled in a comprehensive HIV and AIDS programme by the Kenya Red
Cross Society from July 2019 to June 2021. Individuals who were lost to
follow-up, defaulted (dropped out, transferred out, or relocated), or died were
classified as attrition, while those who were active and alive at the end of
the study were classified as retention. The
study used density analysis to determine spatial differences in key population attrition across the 19 targeted counties, with Kilifi County as an example for mapping attrition cases in smaller
administrative areas (sub-county level). Because attrition cases were far fewer than retention cases, the synthetic
minority oversampling technique-nominal continuous (SMOTE-NC) was used to balance the dataset. A random survival forests model was
then fitted to the balanced dataset. The model correctly identified attrition
cases from the predicted ensemble mortality and estimated their survival times from the
estimated Kaplan-Meier survival function. The predictive performance of the
model was strong, with concordance indices greater than 0.75, well above the
0.5 expected from random chance.
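
To make the concordance index reported above concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the study's code: the function name `harrell_c_index` and the toy values are hypothetical, the risk score stands in for the predicted ensemble mortality, and pairs with tied observed times are skipped as a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times       : observed follow-up times
    events      : 1 if the event (here, attrition) was observed, 0 if censored
    risk_scores : higher score = higher predicted risk (e.g. ensemble mortality)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk went with the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.7, 0.8, 0.3, 0.1]
print(harrell_c_index(times, events, risk))  # 0.875
```

A value of 1.0 means that every comparable pair is ranked correctly, while identical predictions for all subjects give 0.5, which is the random-chance baseline against which values above 0.75 are judged.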