Journal of Data Analysis and Information Processing 2023

Application of Regularized Logistic Regression and Artificial Neural Network Model for Ozone Classification across El Paso County, Texas, United States

DOI: 10.4236/jdaip.2023.113012, pp. 217-239

Callistus Obunadike, Adekunle Adefabi, Somtobe Olisah, David Abimbola, Kunle Oloyede

Keywords: Machine Learning, Ozone Prediction, Pollutants Forecasting, Atmospheric Monitoring, Air Quality, Logistic Regression, Artificial Neural Network

Abstract:

This paper addresses the prediction of atmospheric ozone levels using a machine learning approach. We use air pollutant and meteorological datasets from the El Paso area to classify ozone levels as high or low, training logistic regression (LR) and artificial neural network (ANN) models on these data. In predicting the ozone level on a given day, the ANN model achieves a classification accuracy of 89.3% and the LR model 88.4%. The AUC values of the two models are comparable: 95.4% for the ANN and 95.2% for the LR. A lower cross-entropy loss (log loss) indicates better model performance; the ANN model yields a log loss of 3.74, while the LR model yields 6.03. The reported prediction time is approximately 0.00 seconds for the ANN model and 0.02 seconds for the LR model. Our odds ratio analysis indicates that the features “Solar radiation”, “Std. Dev. Wind Direction”, “outdoor temperature”, “dew point temperature”, and “PM10” contribute to high ozone levels in El Paso, Texas. Based on accuracy, error rate, log loss, and prediction time, the ANN model proves to be faster and more suitable for ozone classification in the El Paso, Texas area.
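
The evaluation metrics named in the abstract (accuracy, AUC, log loss, and odds ratios) can be illustrated with a minimal sketch. This is not the authors' code: the labels, predicted probabilities, feature names, and coefficient values below are made-up toy values, assumed only for illustration, and the functions follow the standard textbook definitions of each metric.

```python
import math

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of days whose thresholded prediction matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def auc(y_true, y_prob):
    """Chance that a random high-ozone day is scored above a random low-ozone day."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def odds_ratios(coefficients):
    """Odds ratio of each feature: exp of its logistic regression coefficient."""
    return {name: math.exp(beta) for name, beta in coefficients.items()}

# Toy data (hypothetical): 1 = high-ozone day, 0 = low-ozone day.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7]  # predicted P(high ozone)

acc = accuracy(y_true, y_prob)
loss = log_loss(y_true, y_prob)
area = auc(y_true, y_prob)

# Hypothetical coefficients, not values reported in the paper.
ratios = odds_ratios({"solar_radiation": 0.5, "wind_speed": -0.3})

print(f"accuracy={acc:.3f}  log_loss={loss:.3f}  auc={area:.3f}")
for name, ratio in ratios.items():
    print(f"{name}: odds ratio {ratio:.2f}")
```

Under these definitions, an odds ratio above 1 means that an increase in the feature raises the odds of a high-ozone classification, and a ratio below 1 means it lowers them. This is the sense in which the abstract's listed features "contribute to" high ozone levels.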
