Advancing Type II Diabetes Predictions with a Hybrid LSTM-XGBoost Approach

DOI: 10.4236/jdaip.2024.122010, PP. 163-188

Keywords: LSTM, XGBoost, Hybrid Models, Machine Learning, Deep Learning

Abstract:

In this paper, we explore the ability of a hybrid model integrating Long Short-Term Memory (LSTM) networks and eXtreme Gradient Boosting (XGBoost) to improve the accuracy of predicting Type II Diabetes Mellitus, which is caused by a combination of genetic, behavioral, and environmental factors. We use comprehensive datasets from the Women in Data Science (WiDS) Datathon for the years 2020 and 2021, which provide the wide range of patient information required for reliable prediction. The research employs a novel approach that combines LSTM's ability to analyze sequential data with XGBoost's strength in handling structured datasets. The methodology comprises preprocessing the data and implementing the hybrid model. The LSTM model, which excels at processing sequential data, detects temporal patterns and trends in patient history, while XGBoost, known for its classification effectiveness, converts these patterns into predictive insights. Our results show that the LSTM-XGBoost model achieves a prediction accuracy of 0.99. This study not only shows the usefulness of the hybrid LSTM-XGBoost model in predicting diabetes but also paves the way for future research. This progress in machine learning applications represents a significant step forward in healthcare, with the potential to change the treatment of chronic diseases such as diabetes and lead to better patient outcomes.
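
The two-stage idea in the abstract (a recurrent encoder summarizes patient history, and a boosted tree model classifies from that summary) can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the LSTM weights here are random and untrained, simple gradient-boosted decision stumps stand in for XGBoost, the data are synthetic, and all names (`lstm_encode`, `boost`, the hidden size of 4, the 80 synthetic patients) are assumptions made for illustration only.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_lstm(n_in, n_hid, rng):
    """Random (untrained) weights for the four LSTM gates: i, f, o, g."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {g: (mat(n_hid, n_in + n_hid), [0.0] * n_hid) for g in "ifog"}

def lstm_encode(seq, params, n_hid):
    """Run one sequence through an LSTM cell; return the final hidden state."""
    h, c = [0.0] * n_hid, [0.0] * n_hid
    for x in seq:
        z = list(x) + h  # concatenate current input with previous hidden state
        def gate(name, act):
            W, b = params[name]
            return [act(sum(w * v for w, v in zip(row, z)) + bj)
                    for row, bj in zip(W, b)]
        i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
        g = gate("g", math.tanh)
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h

def fit_stump(X, r):
    """Least-squares regression stump (one split) fitted to residuals r."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = (sum((ri - ml) ** 2 for ri in left)
                   + sum((ri - mr) ** 2 for ri in right))
            if best is None or err < best[0]:
                best = (err, j, t, ml, mr)
    return best[1:]

def boost(X, y, rounds=20, lr=0.3):
    """Gradient boosting on log loss with stumps (a stand-in for XGBoost)."""
    stumps, F = [], [0.0] * len(X)
    for _ in range(rounds):
        # Negative gradient of the log loss with respect to the raw score.
        resid = [yi - sigmoid(fi) for yi, fi in zip(y, F)]
        j, t, ml, mr = fit_stump(X, resid)
        stumps.append((j, t, lr * ml, lr * mr))
        F = [fi + (lr * ml if row[j] <= t else lr * mr) for fi, row in zip(F, X)]
    return stumps

def predict_proba(stumps, row):
    return sigmoid(sum(vl if row[j] <= t else vr for j, t, vl, vr in stumps))

# Synthetic "patients": a short sequence of readings plus two static features.
rng = random.Random(0)
n_hid = 4
params = make_lstm(1, n_hid, rng)
X, y = [], []
for _ in range(80):
    label = rng.randint(0, 1)
    slope = 0.3 if label else -0.3  # hypothetical rule: rising trend = positive
    seq = [[slope * t + rng.gauss(0, 0.1)] for t in range(6)]
    static = [rng.gauss(0, 1), rng.gauss(0, 1)]
    # Stage 1: LSTM summary of the sequence, concatenated with static features.
    X.append(lstm_encode(seq, params, n_hid) + static)
    y.append(label)

# Stage 2: boosted trees classify from the combined feature vector.
model = boost(X[:60], y[:60])
preds = [int(predict_proba(model, row) > 0.5) for row in X[60:]]
accuracy = sum(p == t for p, t in zip(preds, y[60:])) / len(preds)
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice one would presumably train the LSTM (for example with a deep learning framework) and pass its learned representations to the actual XGBoost library; the sketch only shows how the two stages connect.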
