An Application of Machine Learning to Thalassemia Diagnosis

DOI: 10.4236/jcc.2024.122013, pp. 211-230

Keywords: Multicollinearity, Statistical Analysis Models, Data Mining, PCA-LR, PLS

Abstract:

Thalassemia (also known as Mediterranean anemia) is a genetic disease whose diagnosis currently relies heavily on the clinical experience of experts, an approach that depends too much on individual expertise and is not precise enough. This paper proposes two modeling methods for predicting whether a patient has thalassemia. The first applies Principal Component Analysis (PCA) to reduce the dimensionality of the data and then fits a logistic regression model to the reduced dataset (PCA-LR). The second builds a Partial Least Squares (PLS) regression model. In the experiments, the PCA-LR model reaches a prediction accuracy of 87.5% (degree = 2, λ = 4) and the PLS model reaches 92.5% (ncomp = 4), indicating good predictive performance for both models.
