A comparison of machine learning-based approaches in estimating surface PM2.5 concentrations focusing on Artificial Neural Networks and high pollution events

Surface PM2.5 concentrations have significant implications for human health, so accurate estimates are needed. This study compares several machine learning models, including linear models, tree-based algorithms, and artificial neural networks (ANNs), for estimating PM2.5 concentrations using the MERRA-2 dataset from 2012 to 2023. Mutual information and Spearman cross-feature correlation scores are used for feature selection. Model performance is evaluated with metrics including the normalized Nash–Sutcliffe efficiency (NNSE), root mean standard deviation ratio (RSR), and mean percentage error (MPE). Our results show that ANNs outperform linear and tree models, particularly in estimating daily PM2.5 concentrations of 35–1000 µg/m³. Relative to linear and tree models, ANNs improve NNSE by 119% and 46%, RSR by 40% and 24%, and MPE by 44% and 30%, respectively, indicating the ANNs' superior estimation performance on high-pollution days. The sensitivity analysis of features used to interpret the models suggests that the total extinction AOD at 550 nm and the surface CO concentration are the most important features in the Western and Eastern U.S., respectively. The findings suggest that even the simplest NNs provide better air quality estimates, especially during high pollution events, which is beneficial for long-term exposure analysis. Future research should explore more sophisticated NN architectures that incorporate spatial and temporal variations in PM2.5 to improve model performance.
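
The three evaluation metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration using common textbook definitions, not the authors' code: NNSE is taken as 1 / (2 - NSE), RSR as RMSE divided by the standard deviation of the observations, and MPE as the mean of (estimate - observation) / observation in percent. The abstract does not state the sign convention for MPE or the exact formulas, so these are assumptions, and the function names are hypothetical.

```python
import math


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def nnse(obs, pred):
    """Normalized NSE, mapped to (0, 1]; 1 is a perfect fit, 0.5 matches the observed mean."""
    return 1.0 / (2.0 - nse(obs, pred))


def rsr(obs, pred):
    """RMSE divided by the standard deviation of the observations; 0 is a perfect fit."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(ss_res) / math.sqrt(ss_tot)


def mpe(obs, pred):
    """Mean percentage error; positive means overestimation (assumed sign convention)."""
    return 100.0 * sum((p - o) / o for o, p in zip(obs, pred)) / len(obs)


if __name__ == "__main__":
    # Made-up daily PM2.5 values in µg/m³, for illustration only.
    observed = [10.0, 20.0, 30.0]
    estimated = [12.0, 18.0, 33.0]
    print(nnse(observed, estimated), rsr(observed, estimated), mpe(observed, estimated))
```

Under these definitions RSR² = 1 - NSE, so the two metrics rank models identically on the same data; MPE adds information about the direction of the bias. MPE is undefined when an observation is zero, so such samples would need to be masked first.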

Resource Type	publication
Temporal Range Begin	N/A
Temporal Range End	N/A
Temporal Resolution	N/A
Bounding Box North Lat	N/A
Bounding Box South Lat	N/A
Bounding Box West Long	N/A
Bounding Box East Long	N/A
Spatial Representation	N/A
Spatial Resolution	N/A
Related Links	Related Dataset #1 : A Comparison of Machine Learning-based Approaches in Estimating Surface PM2.5 Concentrations focusing on Artificial Neural Networks and High Pollution Events
Additional Information	N/A
Resource Format	PDF
Standardized Resource Format	PDF
Asset Size	N/A
Legal Constraints	Copyright author(s). This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Access Constraints	None
Software Implementation Language	N/A

Resource Support Name	N/A
Resource Support Email	opensky@ucar.edu
Resource Support Organization	UCAR/NCAR - Library
Distributor	N/A
Metadata Contact Name	N/A
Metadata Contact Email	opensky@ucar.edu
Metadata Contact Organization	UCAR/NCAR - Library

Author	Wei, S., Shores, Kyle, Xu, Y.
Publisher	UCAR/NCAR - Library
Publication Date	2025-01-05T00:00:00
Digital Object Identifier (DOI)	Not Assigned
Alternate Identifier	N/A
Resource Version	N/A
Topic Category	geoscientificInformation
Progress	N/A
Metadata Date	2025-07-10T19:55:07.271122
Metadata Record Identifier	edu.ucar.opensky::articles:42865
Metadata Language	eng; USA
Suggested Citation	Wei, S., Shores, Kyle, Xu, Y. (2025). A comparison of machine learning-based approaches in estimating surface PM2.5 concentrations focusing on Artificial Neural Networks and high pollution events. UCAR/NCAR - Library. https://n2t.net/ark:/85065/d70v8j5n. Accessed 05 August 2025.

Harvest Source	ISO-19139 Metadata