%0 Journal Article
%T Unveiling the Predictive Capabilities of Machine Learning in Air Quality Data Analysis: A Comparative Evaluation of Different Regression Models
%A Mosammat Mustari Khanaum
%A Md Saidul Borhan
%A Farzana Ferdoush
%A Mohammed Ali Nause Russel
%A Mustafa Murshed
%J Open Journal of Air Pollution
%P 142-159
%@ 2169-2661
%D 2023
%I Scientific Research Publishing
%R 10.4236/ojap.2023.124009
%X Air quality is a critical concern for public health and environmental regulation. The Air Quality Index (AQI), a widely adopted index by the US Environmental Protection Agency (EPA), serves as a crucial metric for reporting site-specific air pollution levels. Accurately predicting air quality, as measured by the AQI, is essential for effective air pollution management. In this study, we aim to identify the most reliable regression model among linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), logistic regression, and K-nearest neighbors (KNN). We conducted four different regression analyses using a machine learning approach to determine the model with the best performance. By employing the confusion matrix and error percentages, we selected the best-performing model, which yielded prediction error rates of 22%, 23%, 20%, and 27%, respectively, for LDA, QDA, logistic regression, and KNN models. The logistic regression model outperformed the other three statistical models in predicting AQI. Understanding these models' performance can help address an existing gap in air quality research and contribute to the integration of regression techniques in AQI studies, ultimately benefiting stakeholders like environmental regulators, healthcare professionals, urban planners, and researchers.
%K Regression Analysis
%K Air Quality Index
%K Linear Discriminant Analysis
%K Quadratic Discriminant Analysis
%K Logistic Regression
%K K-Nearest Neighbors
%K Machine Learning
%K Big Data Analysis
%U http://www.scirp.org/journal/PaperInformation.aspx?PaperID=129707