{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T16:23:36Z","timestamp":1772036616213,"version":"3.50.1"},"reference-count":59,"publisher":"MDPI AG","issue":"15","license":[{"start":{"date-parts":[[2023,7,29]],"date-time":"2023-07-29T00:00:00Z","timestamp":1690588800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation","doi-asserted-by":"publisher","award":["42030606"],"award-info":[{"award-number":["42030606"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>PM2.5 refers to the total mass concentration of tiny particulates in the atmosphere near the surface, obtained by means of in situ observations and satellite remote sensing. Given the highly limited number of ground observation stations of inhomogeneous distribution and an ill-posed remote sensing approach, increasing efforts have been devoted to the application of machine-learning (ML) models to both ground and satellite data. A key satellite-derived parameter, aerosol optical thickness (AOD), has been most commonly used as a proxy of PM2.5, although their correlation is fraught with large uncertainties. A critical question that has been overlooked concerns how much AOD helps to improve the retrieval of PM2.5 relative to its uncertainty incurred concurrently. The question is addressed here by taking advantage of high-density PM2.5 stations in eastern China to evaluate the contributions of AOD, determined as the difference in the accuracy of PM2.5 retrievals with and without AOD for varying densities of PM2.5 stations, using four popular ML models (i.e., Random Forest, Extra-trees, XGBoost, and LightGBM). 
Our results reveal that as the density of monitoring stations decreases, both the feature importance and permutation importance of satellite AOD demonstrate a consistent upward trend (p &lt; 0.05). Furthermore, the ML models without AOD exhibit faster declines in overall accuracy and predictive ability compared with the models with AOD assessed using the sample-based and station-based (spatial) independent cross-validation approaches. Overall, a 10% reduction in the number of stations results in an increase of 0.7\u20131.2% and 0.6\u20131.2% in uncertainty in estimated and predicted accuracies, respectively. These findings attest to the indispensable role of satellite AOD in the PM2.5 retrieval process through ML because it can significantly mitigate the negative impact of the sparse distribution of monitoring sites. This role becomes more important as the number of PM2.5 stations decreases.<\/jats:p>","DOI":"10.3390\/rs15153780","type":"journal-article","created":{"date-parts":[[2023,7,31]],"date-time":"2023-07-31T01:48:50Z","timestamp":1690768130000},"page":"3780","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["How Important Is Satellite-Retrieved Aerosol Optical Depth in Deriving Surface PM2.5 Using Machine Learning?"],"prefix":"10.3390","volume":"15","author":[{"given":"Zhongyan","family":"Tian","sequence":"first","affiliation":[{"name":"Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8803-7056","authenticated-orcid":false,"given":"Jing","family":"Wei","sequence":"additional","affiliation":[{"name":"Department of Atmospheric and Oceanic Science, Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD 20740, USA"}]},{"given":"Zhanqing","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Atmospheric and Oceanic Science, Earth System Science 
Interdisciplinary Center, University of Maryland, College Park, MD 20740, USA"}]}],"member":"1968","published-online":{"date-parts":[[2023,7,29]]},"reference":[{"key":"ref_1","unstructured":"IPCC 2021 (2021). Climate Change, 2021: The Physical Science Basis, Cambridge University Press. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change IPCC Working Group I Contribution to AR5Rep."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1016\/j.envpol.2016.11.043","article-title":"Impact of diurnal variability and meteorological factors on the PM2.5-AOD relationship: Implications for PM2.5 remote sensing","volume":"221","author":"Guo","year":"2017","journal-title":"Environ. Poll."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"13026","DOI":"10.1029\/2019JD030758","article-title":"East Asian Study of Tropospheric Aerosols and their Impact on Regional Clouds, Precipitation, and Climate (EAST-AIRCPC)","volume":"124","author":"Li","year":"2019","journal-title":"J. Geophys. Res. Atmos."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1016\/j.atmosenv.2014.12.067","article-title":"Representativeness of air quality monitoring networks","volume":"104","author":"Duyzer","year":"2015","journal-title":"Atmos. Environ."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"875","DOI":"10.1016\/j.uclim.2017.11.001","article-title":"Allocating optimum sites for air quality monitoring stations using GIS suitability analysis","volume":"24","author":"Alsahli","year":"2018","journal-title":"Urban Clim."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Chen, N., Yang, M., Du, W., and Min, H. (2021). PM2.5 estimation and spatial-temporal pattern analysis based on the modified support vector regression model and the 1 km resolution MAIAC AOD in Hubei, China. ISPRS Int. J. 
Geo-Inf., 10.","DOI":"10.3390\/ijgi10010031"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"D21201","DOI":"10.1029\/2005JD006996","article-title":"Estimating ground-level PM2.5 using aerosol optical depth determined from satellite remote sensing","volume":"111","author":"Martin","year":"2006","journal-title":"J. Geophys. Res. Atmos."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"250","DOI":"10.1016\/j.atmosres.2016.06.018","article-title":"Can MODIS AOD be employed to derive PM2.5 in Beijing-Tianjin-Hebei over China?","volume":"181","author":"Ma","year":"2016","journal-title":"Atmos. Res."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1016\/j.rse.2016.05.025","article-title":"Remote sensing of ground-level PM2.5 combining AOD and backscattering profile","volume":"183","author":"Li","year":"2016","journal-title":"Remote Sens. Environ."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1016\/j.atmosres.2004.06.001","article-title":"Aerosol polarized phase function and single-scattering albedo retrieved from ground-based measurements","volume":"71","author":"Li","year":"2004","journal-title":"Atmos. Res."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"5880","DOI":"10.1016\/j.atmosenv.2006.03.016","article-title":"Satellite remote sensing of particulate matter and air quality assessment over global cities","volume":"40","author":"Gupta","year":"2006","journal-title":"Atmos. Environ."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"A109","DOI":"10.1289\/ehp.0901732","article-title":"What can affect AOD\u2013PM2.5 association?","volume":"118","author":"Kumar","year":"2010","journal-title":"Environ. 
Health Perspect."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2095","DOI":"10.1029\/2003GL018174","article-title":"Intercomparison between satellite-derived aerosol optical thickness and PM2.5 mass: Implications for air quality studies","volume":"30","author":"Wang","year":"2003","journal-title":"Geophys. Res. Lett."},{"key":"ref_14","first-page":"544","article-title":"A multi-year comparison of PM2.5 and AOD for the Helsinki region","volume":"15","author":"Natunen","year":"2010","journal-title":"Boreal Environ. Res."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"11913","DOI":"10.1021\/es302673e","article-title":"Incorporating local land use regression and satellite aerosol optical depth in a hybrid model of spatiotemporal PM2.5 exposures in the Mid-Atlantic states","volume":"46","author":"Kloog","year":"2012","journal-title":"Environ. Sci. Technol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1289\/ehp.1409481","article-title":"Satellite-based spatiotemporal trends in PM2.5 concentrations: China 2004-2013","volume":"124","author":"Ma","year":"2016","journal-title":"Environ. Health Perspect."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1016\/j.atmosenv.2015.11.061","article-title":"Opposite seasonality of the aerosol optical depth and the surface particulate matter concentration over the North China Plain","volume":"127","author":"Qu","year":"2016","journal-title":"Atmos. Environ."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"15921","DOI":"10.5194\/acp-18-15921-2018","article-title":"Relationships between the planetary boundary layer height and surface pollutants derived from lidar observations over China: Regional pattern and influencing factors","volume":"18","author":"Su","year":"2018","journal-title":"Atmos. Chem. 
Phys."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"5304","DOI":"10.1016\/j.atmosenv.2006.04.044","article-title":"Comparison of spatial and temporal variations of aerosol optical thickness and particulate matter over Europe","volume":"40","author":"Koelemeijer","year":"2006","journal-title":"Atmos. Environ."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"652","DOI":"10.1021\/es2025752","article-title":"Exposure assessment for estimation of the global burden of disease attributable to outdoor air pollution","volume":"46","author":"Brauer","year":"2012","journal-title":"Environ. Sci. Technol."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"104934","DOI":"10.1016\/j.envint.2019.104934","article-title":"A comparison of linear regression, regularization, and machine learning algorithms to develop Europe-wide spatial models of fine particles and nitrogen dioxide","volume":"130","author":"Chen","year":"2019","journal-title":"Environ. Int."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"D14205","DOI":"10.1029\/2008JD011496","article-title":"Particulate matter air quality assessment using integrated surface, satellite, and meteorological products: Multiple regression approach","volume":"114","author":"Gupta","year":"2009","journal-title":"J. Geophys. Res. Atmos."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"7436","DOI":"10.1021\/es5009399","article-title":"Estimating ground-level PM2.5 in China using satellite remote sensing","volume":"48","author":"Ma","year":"2014","journal-title":"Environ. Sci. Tech."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"8327","DOI":"10.1007\/s11356-015-6027-9","article-title":"Estimating national-scale ground-level PM2.5 concentration in China using geographically weighted regression based on MODIS and MISR AOD","volume":"23","author":"You","year":"2016","journal-title":"Environ. Sci. Pollut. 
Res."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1016\/j.rse.2017.12.018","article-title":"Satellite-based mapping of daily high-resolution ground PM2.5 in China via space-time regression modeling","volume":"206","author":"He","year":"2018","journal-title":"Remote Sens. Environ."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1016\/j.rse.2017.07.023","article-title":"Full-coverage high-resolution daily PM2.5 estimation using MAIAC AOD in the Yangtze River Delta of China","volume":"199","author":"Xiao","year":"2017","journal-title":"Remote Sens. Environ."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"3269","DOI":"10.1021\/es049352m","article-title":"Estimating ground-level PM2.5 in the eastern United States using satellite remote sensing","volume":"39","author":"Liu","year":"2005","journal-title":"Environ. Sci. Technol."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"7991","DOI":"10.5194\/acp-11-7991-2011","article-title":"A novel calibration approach of MODIS AOD data to predict PM2.5 concentrations","volume":"11","author":"Lee","year":"2011","journal-title":"Atmos. Chem. Phys."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"141093","DOI":"10.1016\/j.scitotenv.2020.141093","article-title":"Estimating PM2.5 with high-resolution 1-km AOD data and an improved machine learning model over Shenzhen, China","volume":"746","author":"Chen","year":"2020","journal-title":"Sci. Total Environ."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"111221","DOI":"10.1016\/j.rse.2019.111221","article-title":"Estimating 1-km-resolution PM2.5 concentrations across China using the space-time random forest approach","volume":"231","author":"Wei","year":"2019","journal-title":"Remote Sens. 
Environ."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"3273","DOI":"10.5194\/acp-20-3273-2020","article-title":"Improved 1-km-resolution PM2.5 estimates across China using enhanced space-time extremely randomized trees","volume":"20","author":"Wei","year":"2020","journal-title":"Atmos. Chem. Phys."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"112136","DOI":"10.1016\/j.rse.2020.112136","article-title":"Reconstructing 1-km-resolution high-quality PM2.5 data records from 2000 to 2018 in China: Spatiotemporal variations and policy implications","volume":"252","author":"Wei","year":"2021","journal-title":"Remote Sens. Environ."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"012127","DOI":"10.1088\/1755-1315\/113\/1\/012127","article-title":"Application of XGBoost algorithm in hourly PM2.5 concentration prediction","volume":"113","author":"Pan","year":"2018","journal-title":"IOP Conf. Ser. Earth Environ. Sci."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"7863","DOI":"10.5194\/acp-21-7863-2021","article-title":"Himawari-8-derived diurnal variations of ground-level PM2.5 pollution across China using the fast space-time Light Gradient Boosting Machine (LightGBM)","volume":"21","author":"Wei","year":"2021","journal-title":"Atmos. Chem. Phys."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1016\/j.rse.2016.08.027","article-title":"Satellite-based ground PM2.5 estimation using timely structure adaptive modeling","volume":"186","author":"Fang","year":"2016","journal-title":"Remote Sens. Environ."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1016\/j.envpol.2015.09.042","article-title":"Estimating ground-level PM10 in a Chinese city by combining satellite data, meteorological information and a land use regression model","volume":"208","author":"Meng","year":"2016","journal-title":"Environ. 
Pollut."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1016\/j.envres.2017.07.044","article-title":"Development of a model for particulate matter pollution in Australia with implications for other satellite-based models","volume":"159","author":"Pereira","year":"2017","journal-title":"Environ. Res."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"110735","DOI":"10.1016\/j.envres.2021.110735","article-title":"The comparison of AOD-based and non-AOD prediction models for daily PM2.5 estimation in Guangdong province, China with poor AOD coverage","volume":"195","author":"Chen","year":"2021","journal-title":"Environ. Res."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"037004","DOI":"10.1289\/EHP9752","article-title":"Deep ensemble machine learning framework for the estimation of PM2.5 concentrations","volume":"130","author":"Yu","year":"2022","journal-title":"Environ. Health Perspect."},{"key":"ref_40","first-page":"D03210","article-title":"Multi-Angle Implementation of Atmospheric Correction (MAIAC): 1. Radiative transfer basis and look-up tables","volume":"116","author":"Lyapustin","year":"2011","journal-title":"J. Geophys. Res. Atmos."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"5741","DOI":"10.5194\/amt-11-5741-2018","article-title":"MODIS Collection 6 MAIAC algorithm","volume":"11","author":"Lyapustin","year":"2018","journal-title":"Atmos. Meas. Tech."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1999","DOI":"10.1002\/qj.3803","article-title":"The ERA5 global reanalysis","volume":"146","author":"Hersbach","year":"2020","journal-title":"Q. J. R. Meteorol. Soc."},{"key":"ref_43","unstructured":"Peuch, V.H., Engelen, R., Ades, M., Barre, J., and Suttie, M. (2018). 
IGARSS 2018\u20132018 IEEE International Geoscience and Remote Sensing Symposium, IEEE."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1511","DOI":"10.5194\/acp-23-1511-2023","article-title":"Ground-level gaseous pollutants (NO2, SO2, and CO) in China: Daily seamless mapping and spatiotemporal variations","volume":"23","author":"Wei","year":"2023","journal-title":"Atmos. Chem. Phys."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Malakar, N.K., Lary, D.J., Moore, A., Gencaga, D., Roscoe, B., Albayrak, A., Petrenko, M., and Wei, J. (2012, January 24\u201326). Estimation and bias correction of aerosol abundance using data-driven machine learning and remote sensing. Proceedings of the 2012 Conference on Intelligent Data Understanding (CIDU 2012), Boulder, CO, USA.","DOI":"10.1109\/CIDU.2012.6382197"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"S611","DOI":"10.4081\/gh.2014.292","article-title":"Estimating the global abundance of ground level presence of particulate matter (PM2.5)","volume":"8","author":"Lary","year":"2014","journal-title":"Geospat. Health"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"3887","DOI":"10.1021\/es505846r","article-title":"Spatiotemporal prediction of fine particulate matter during the 2008 northern California wildfires using machine learning","volume":"49","author":"Reid","year":"2015","journal-title":"Environ. Sci. Technol."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2002","journal-title":"Mach. Learn."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1016\/j.scitotenv.2018.04.251","article-title":"A machine learning method to estimate PM2.5 concentrations across China with remote sensing, meteorological and land use information","volume":"636","author":"Chen","year":"2018","journal-title":"Sci. 
Total Environ."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"6936","DOI":"10.1021\/acs.est.7b01210","article-title":"Estimating PM2.5 concentrations in the conterminous United States using the random forest approach","volume":"51","author":"Hu","year":"2017","journal-title":"Environ. Sci. Tech."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","article-title":"Extremely randomized trees","volume":"63","author":"Geurts","year":"2006","journal-title":"Mach. Learn."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_53","unstructured":"Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., and Liu, T. (2017). Advances in Neural Information Processing Systems, ACM. Available online: https:\/\/dl.acm.org\/doi\/10.5555\/3294996.3295074."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"1413","DOI":"10.1080\/03610926.2020.1764042","article-title":"Unbiased variable importance for random forests","volume":"51","author":"Loecher","year":"2022","journal-title":"Commun. Stat. Theory Methods"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1198\/016214501753168271","article-title":"Classification trees with unbiased multiway splits","volume":"96","author":"Kim","year":"2001","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"569","DOI":"10.1109\/TPAMI.2009.187","article-title":"Sensitivity analysis of k-fold cross validation in prediction error estimation","volume":"32","author":"Rodriguez","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Wei, J., Li, Z., Chen, X., Li, C., Sun, Y., Wang, J., Lyapustin, A., Brasseur, G., Jiang, M., and Sun, L. (2023). Separating daily 1-km PM2.5 inorganic chemical composition in China since 2000 via deep learning integrating ground, satellite, and model data. Environ. Sci. Tech.","DOI":"10.1021\/acs.est.3c00272"},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"106290","DOI":"10.1016\/j.envint.2020.106290","article-title":"The ChinaHighPM10 dataset: Generation, validation, and spatiotemporal variations from 2015 to 2019 across China","volume":"146","author":"Wei","year":"2021","journal-title":"Environ. Int."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"9988","DOI":"10.1021\/acs.est.2c03834","article-title":"Ground-level NO2 surveillance from space across China for high resolution using interpretable spatiotemporally weighted artificial intelligence","volume":"56","author":"Wei","year":"2022","journal-title":"Environ. Sci. Tech."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/15\/3780\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:22:18Z","timestamp":1760127738000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/15\/3780"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,29]]},"references-count":59,"journal-issue":{"issue":"15","published-online":{"date-parts":[[2023,8]]}},"alternative-id":["rs15153780"],"URL":"https:\/\/doi.org\/10.3390\/rs15153780","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,29]]}}}