A Machine Learning-Based Web Application for Heart Disease Prediction

DOI: 10.4236/ica.2024.151002, PP. 9-27

Keywords: Heart Disease, US Center for Disease Control and Prevention, Machine Learning, Imbalanced Data, Web Application

Abstract:

This work applies machine learning (ML) predictive modeling to heart disease prediction, using a dataset sourced from the US Centers for Disease Control and Prevention (CDC). The dataset was preprocessed and used to train five models: random forest, support vector machine, logistic regression, extreme gradient boosting, and light gradient boosting. The goal was to use the best-performing model to build a web application capable of reliably predicting heart disease from user-provided data. The extreme gradient boosting classifier gave the most reliable results, with precision, recall, and F1-score of 97%, 72%, and 83%, respectively, for Class 0 (no heart disease) and 21%, 81%, and 34%, respectively, for Class 1 (heart disease). This model was then deployed as a web application.
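
As a rough illustration of how the per-class figures above relate to one another, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded percentages quoted in the abstract; the function name `f1_score` and the dictionary labels are illustrative choices and are not taken from the paper's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded per-class (precision, recall) values as quoted in the abstract.
reported = {
    "class 0 (no heart disease)": (0.97, 0.72),
    "class 1 (heart disease)": (0.21, 0.81),
}

# Recompute F1 for each class from the rounded inputs.
f1_by_class = {label: f1_score(p, r) for label, (p, r) in reported.items()}

for label, f1 in f1_by_class.items():
    print(f"{label}: F1 = {f1:.3f}")
```

From the rounded inputs this gives roughly 0.827 for Class 0 and 0.334 for Class 1. The first matches the reported 83%; the second is about one point below the reported 34%, a gap that is compatible with the published values having been computed from unrounded precision and recall. The low Class 1 precision alongside high recall is the pattern one would expect from a model tuned to catch positives in imbalanced data.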

