{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,28]],"date-time":"2026-02-28T03:44:42Z","timestamp":1772250282171,"version":"3.50.1"},"reference-count":59,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2021,1,1]],"date-time":"2021-01-01T00:00:00Z","timestamp":1609459200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000104","name":"National Aeronautics and Space Administration","doi-asserted-by":"publisher","award":["80NSSC19K0191"],"award-info":[{"award-number":["80NSSC19K0191"]}],"id":[{"id":"10.13039\/100000104","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006196","name":"Jet Propulsion Laboratory","doi-asserted-by":"publisher","award":["1588347"],"award-info":[{"award-number":["1588347"]}],"id":[{"id":"10.13039\/100006196","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000066","name":"National Institute of Environmental Health Sciences","doi-asserted-by":"publisher","award":["R01-ES027892"],"award-info":[{"award-number":["R01-ES027892"]}],"id":[{"id":"10.13039\/100000066","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>A task for environmental health research is to produce complete pollution exposure maps despite limited monitoring data. Satellite-derived aerosol optical depth (AOD) is frequently used as a predictor in various models to improve PM2.5 estimation, despite significant gaps in coverage. We analyze PM2.5 and AOD from July 2011 in the contiguous United States. We examine two methods to aid in gap-filling AOD: (1) lattice kriging, a spatial statistical method adapted to handle large amounts of data, and (2) random forest, a tree-based machine learning method. 
First, we evaluate each model\u2019s performance in the spatial prediction of AOD, and we additionally consider ensemble methods for combining the predictors. In order to accurately assess the predictive performance of these methods, we construct spatially clustered holdouts to mimic the observed patterns of missing data. Finally, we assess whether gap-filling AOD through one of the proposed ensemble methods can improve prediction of PM2.5 in a random forest model. Our results suggest that ensemble methods of combining lattice kriging and random forest can improve AOD gap-filling. Based on summary metrics of performance, PM2.5 predictions based on random forest models were largely similar regardless of the inclusion of gap-filled AOD, but there was some variability in daily model predictions.<\/jats:p>","DOI":"10.3390\/rs13010126","type":"journal-article","created":{"date-parts":[[2021,1,1]],"date-time":"2021-01-01T22:35:48Z","timestamp":1609540548000},"page":"126","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":40,"title":["Imputing Satellite-Derived Aerosol Optical Depth Using a Multi-Resolution Spatial Model and Random Forest for PM2.5 Prediction"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5106-4157","authenticated-orcid":false,"given":"Behzad","family":"Kianian","sequence":"first","affiliation":[{"name":"Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5477-2186","authenticated-orcid":false,"given":"Yang","family":"Liu","sequence":"additional","affiliation":[{"name":"Gangarosa Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6316-1640","authenticated-orcid":false,"given":"Howard 
H.","family":"Chang","sequence":"additional","affiliation":[{"name":"Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA"}]}],"member":"1968","published-online":{"date-parts":[[2021,1,1]]},"reference":[{"key":"ref_1","unstructured":"WHO (2020, August 24). Ambient (Outdoor) Air Pollution. Available online: https:\/\/web.archive.org\/web\/20200824220508\/https%3A%2F%2Fwww.who.int%2Fnews-room%2Ffact-sheets%2Fdetail%2Fambient-%2528outdoor%2529-air-quality-and-health."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2224","DOI":"10.1016\/S0140-6736(12)61766-8","article-title":"A comparative risk assessment of burden of disease and injury attributable to 67 risk factors and risk factor clusters in 21 regions, 1990\u20132010: A systematic analysis for the Global Burden of Disease Study 2010","volume":"380","author":"Lim","year":"2012","journal-title":"Lancet"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1038\/nature15371","article-title":"The contribution of outdoor air pollution sources to premature mortality on a global scale","volume":"525","author":"Lelieveld","year":"2015","journal-title":"Nature"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2989","DOI":"10.5194\/amt-6-2989-2013","article-title":"The Collection 6 MODIS aerosol products over land and ocean","volume":"6","author":"Levy","year":"2013","journal-title":"Atmos. Meas. Tech."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1097\/MOP.0000000000000326","article-title":"Satellite remote sensing in epidemiological studies","volume":"28","author":"Just","year":"2016","journal-title":"Curr. Opin. Pediatr."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Chu, Y., Liu, Y., Li, X., Liu, Z., Lu, H., Lu, Y., Mao, Z., Chen, X., Li, N., and Ren, M. (2016). A review on predicting ground PM2.5 concentration using satellite aerosol optical depth. 
Atmosphere, 7.","DOI":"10.3390\/atmos7100129"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1080\/15481603.2019.1703288","article-title":"Estimating ground-level particulate matter concentrations using satellite-based data: A review","volume":"57","author":"Shin","year":"2020","journal-title":"GIScience Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Belle, J.H., and Liu, Y. (2016). Evaluation of Aqua MODIS Collection 6 AOD Parameters for Air Quality Research over the Continental United States. Remote Sens., 8.","DOI":"10.3390\/rs8100815"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Belle, J.H., Chang, H.H., Wang, Y., Hu, X., Lyapustin, A., and Liu, Y. (2017). The potential impact of satellite-retrieved cloud parameters on ground-level PM2.5 mass and composition. Int. J. Environ. Res. Public Health, 14.","DOI":"10.3390\/ijerph14101244"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1016\/j.rse.2018.12.002","article-title":"Impacts of snow and cloud covers on satellite-derived PM2.5 levels","volume":"221","author":"Bi","year":"2019","journal-title":"Remote Sens. Environ."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"596","DOI":"10.3155\/1047-3289.60.5.596","article-title":"Satellite remote sensing of particulate matter air quality: The cloud-cover problem","volume":"60","author":"Christopher","year":"2010","journal-title":"J. Air Waste Manag. Assoc."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"25601","DOI":"10.1073\/pnas.1919641117","article-title":"The 17-y spatiotemporal trend of PM2.5 and its mortality burden in China","volume":"117","author":"Liang","year":"2020","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"8159","DOI":"10.1029\/2018JD028573","article-title":"Satellite-Based Daily PM2.5 Estimates During Fire Seasons in Colorado","volume":"123","author":"Geng","year":"2018","journal-title":"J. Geophys. Res. Atmos."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"6267","DOI":"10.1016\/j.atmosenv.2011.08.066","article-title":"Assessing temporally and spatially resolved PM2.5 exposures for epidemiological studies using satellite aerosol optical depth measurements","volume":"45","author":"Kloog","year":"2011","journal-title":"Atmos. Environ."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"11913","DOI":"10.1021\/es302673e","article-title":"Incorporating local land use regression and satellite aerosol optical depth in a hybrid model of spatiotemporal PM2.5 exposures in the Mid-Atlantic states","volume":"46","author":"Kloog","year":"2012","journal-title":"Environ. Sci. Technol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1038\/jes.2015.41","article-title":"Spatiotemporal prediction of fine particulate matter using high-resolution satellite images in the Southeastern US 2003\u20132011","volume":"26","author":"Lee","year":"2016","journal-title":"J. Expo. Sci. Environ. Epidemiol."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"6936","DOI":"10.1021\/acs.est.7b01210","article-title":"Estimating PM2.5 concentrations in the conterminous United States using the random forest approach","volume":"51","author":"Hu","year":"2017","journal-title":"Environ. Sci. Technol."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1016\/j.rse.2017.07.023","article-title":"Full-coverage high-resolution daily PM2.5 estimation using MAIAC AOD in the Yangtze River Delta of China","volume":"199","author":"Xiao","year":"2017","journal-title":"Remote Sens. 
Environ."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1016\/j.envpol.2018.07.016","article-title":"Predicting monthly high-resolution PM2.5 concentrations with random forest model in the North China Plain","volume":"242","author":"Huang","year":"2018","journal-title":"Environ. Pollut."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"4752","DOI":"10.1021\/acs.est.5b05940","article-title":"Improving the accuracy of daily PM2.5 distributions derived from the fusion of ground-level measurements with aerosol optical depth observations, a case study in North China","volume":"50","author":"Lv","year":"2016","journal-title":"Environ. Sci. Technol."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"391","DOI":"10.1080\/01621459.1994.10476759","article-title":"Kriging and splines: An empirical comparison of their predictive performance in some applications","volume":"89","author":"Laslett","year":"1994","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1016\/j.atmosenv.2019.01.027","article-title":"Extreme gradient boosting model to estimate PM2.5 concentrations with missing-filled satellite data in China","volume":"202","author":"Chen","year":"2019","journal-title":"Atmos. Environ."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/j.envint.2019.01.016","article-title":"Estimation of daily PM10 and PM2.5 concentrations in Italy, 2013\u20132015, using a spatiotemporal land-use random-forest model","volume":"124","author":"Stafoggia","year":"2019","journal-title":"Environ. 
Int."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"134094","DOI":"10.1016\/j.scitotenv.2019.134094","article-title":"Estimating daily PM2.5 concentrations in New York City at the neighborhood-scale: Implications for integrating non-regulatory measurements","volume":"697","author":"Huang","year":"2019","journal-title":"Sci. Total Environ."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"998","DOI":"10.1016\/j.envpol.2018.09.052","article-title":"A nonparametric approach to filling gaps in satellite-retrieved aerosol optical depth for estimating ambient PM2.5 levels","volume":"243","author":"Zhang","year":"2018","journal-title":"Environ. Pollut."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"105146","DOI":"10.1016\/j.atmosres.2020.105146","article-title":"Estimation of hourly full-coverage PM2.5 concentrations at 1-km resolution in China using a two-stage random forest model","volume":"248","author":"Jiang","year":"2021","journal-title":"Atmos. Res."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/S0034-4257(98)00031-5","article-title":"AERONET\u2014A federated instrument network and data archive for aerosol characterization","volume":"66","author":"Holben","year":"1998","journal-title":"Remote Sens. Environ."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1002\/2014JD022453","article-title":"MODIS Collection 6 aerosol products: Comparison between Aqua\u2019s e-Deep Blue, Dark Target, and \u201cmerged\u201d data sets, and usage recommendations","volume":"119","author":"Sayer","year":"2014","journal-title":"J. Geophys. Res. Atmos."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"398","DOI":"10.1007\/s13253-018-00348-w","article-title":"A case study competition among methods for analyzing large spatial data","volume":"24","author":"Heaton","year":"2019","journal-title":"J. Agric. Biol. Environ. 
Stat."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1214\/16-SS115","article-title":"A comparison of spatial predictors when datasets could be very large","volume":"10","author":"Bradley","year":"2016","journal-title":"Stat. Surv."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"139761","DOI":"10.1016\/j.scitotenv.2020.139761","article-title":"Estimating daily ground-level PM2.5 in China with random-forest-based spatiotemporal kriging","volume":"740","author":"Shao","year":"2020","journal-title":"Sci. Total Environ."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"13260","DOI":"10.1021\/acs.est.8b02917","article-title":"An ensemble machine-learning model to predict historical PM2.5 concentrations in China from satellite data","volume":"52","author":"Xiao","year":"2018","journal-title":"Environ. Sci. Technol."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"104909","DOI":"10.1016\/j.envint.2019.104909","article-title":"An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution","volume":"130","author":"Di","year":"2019","journal-title":"Environ. Int."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"108601","DOI":"10.1016\/j.envres.2019.108601","article-title":"A Bayesian ensemble approach to combine PM2.5 estimates from statistical models using satellite imagery and numerical model simulation","volume":"178","author":"Murray","year":"2019","journal-title":"Environ. Res."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"579","DOI":"10.1080\/10618600.2014.914946","article-title":"A multiresolution Gaussian process model for the analysis of large spatial datasets","volume":"24","author":"Nychka","year":"2015","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"van der Laan, M.J., Polley, E.C., and Hubbard, A.E. (2007). Super learner. Stat. Appl. Genet. Mol. 
Biol., 6.","DOI":"10.2202\/1544-6115.1309"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1007\/s10654-018-0390-z","article-title":"Stacked generalization: An introduction to super learning","volume":"33","author":"Naimi","year":"2018","journal-title":"Eur. J. Epidemiol."},{"key":"ref_39","unstructured":"Levy, R., and Hsu, C. (2015). MODIS Atmosphere L2 Aerosol Product. NASA MODIS Adaptive Processing System."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"23073","DOI":"10.1029\/2001JD000807","article-title":"Global modeling of tropospheric chemistry with assimilated meteorology: Model description and evaluation","volume":"106","author":"Bey","year":"2001","journal-title":"J. Geophys. Res. Atmos."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1002\/jgrd.50867","article-title":"Comparison of GEOS-Chem aerosol optical depth with AERONET and MISR data over the contiguous United States","volume":"118","author":"Li","year":"2013","journal-title":"J. Geophys. Res. Atmos."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Cosgrove, B.A., Lohmann, D., Mitchell, K.E., Houser, P.R., Wood, E.F., Schaake, J.C., Robock, A., Marshall, C., Sheffield, J., and Duan, Q. (2003). Real-time and retrospective forcing in the North American Land Data Assimilation System (NLDAS) project. J. Geophys. Res. Atmos., 108.","DOI":"10.1029\/2002JD003118"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Mitchell, K.E., Lohmann, D., Houser, P.R., Wood, E.F., Schaake, J.C., Robock, A., Cosgrove, B.A., Sheffield, J., Duan, Q., and Luo, L. (2004). The multi-institution North American Land Data Assimilation System (NLDAS): Utilizing multiple GCIP products and partners in a continental distributed hydrological modeling system. J. Geophys. Res. Atmos., 109.","DOI":"10.1029\/2003JD003823"},{"key":"ref_44","unstructured":"Nychka, D., Hammerling, D., Sain, S., and Lenssen, N. (2020, December 30). 
LatticeKrig: Multiresolution Kriging Based on Markov Random Fields. R Package Version 8.4, 2016. Available online: https:\/\/cran.r-project.org\/web\/packages\/LatticeKrig\/index.html."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Wright, M.N., and Ziegler, A. (2017). ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R. J. Stat. Softw., 77.","DOI":"10.18637\/jss.v077.i01"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1007\/BF00117832","article-title":"Stacked regressions","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach. Learn."},{"key":"ref_47","unstructured":"Polley, E.C., and van der Laan, M.J. (2010). Super learner in prediction. U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 266, U.C. Berkeley."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1515\/ijb-2014-0060","article-title":"Optimal spatial prediction using ensemble machine learning","volume":"12","author":"Davies","year":"2016","journal-title":"Int. J. Biostat."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1111\/2041-210X.13107","article-title":"blockCV: An R package for generating spatially or environmentally separated folds for k-fold cross-validation of species distribution models","volume":"10","author":"Valavi","year":"2019","journal-title":"Methods Ecol. Evol."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1016\/j.atmosenv.2019.02.025","article-title":"Gaussian Markov Random Fields versus Linear Mixed Models for satellite-based PM2.5 assessment: Evidence from the Northeastern USA","volume":"205","author":"Sarafian","year":"2019","journal-title":"Atmos. 
Environ."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"3686","DOI":"10.1021\/acs.est.5b05099","article-title":"Satellite-based NO2 and model validation in a national prediction model based on universal kriging and land-use regression","volume":"50","author":"Young","year":"2016","journal-title":"Environ. Sci. Technol."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"724","DOI":"10.1198\/jcgs.2010.09051","article-title":"Fixed rank filtering for spatio-temporal data","volume":"19","author":"Cressie","year":"2010","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1111\/j.1467-9868.2011.00777.x","article-title":"An explicit link between Gaussian fields and Gaussian Markov random fields: The stochastic partial differential equation approach","volume":"73","author":"Lindgren","year":"2011","journal-title":"J. R. Stat. Soc. Ser. B Stat. Methodol."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"800","DOI":"10.1080\/01621459.2015.1044091","article-title":"Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets","volume":"111","author":"Datta","year":"2016","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"1286","DOI":"10.1214\/16-AOAS931","article-title":"Nonseparable dynamic nearest neighbor Gaussian process models for large spatio-temporal data with an application to particulate matter analysis","volume":"10","author":"Datta","year":"2016","journal-title":"Ann. Appl. Stat."},{"key":"ref_56","unstructured":"Bradley, J.R. (2019). What is the best predictor that you can compute in five minutes using a given Bayesian hierarchical model?. 
arXiv."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1080\/01621459.2015.1123632","article-title":"A multi-resolution approximation for massive spatial datasets","volume":"112","author":"Katzfuss","year":"2017","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"100465","DOI":"10.1016\/j.spasta.2020.100465","article-title":"Spatiotemporal multi-resolution approximations for analyzing global environmental data","volume":"38","author":"Appel","year":"2020","journal-title":"Spat. Stat."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/j.atmosenv.2018.11.049","article-title":"Using gap-filled MAIAC AOD and WRF-Chem to estimate daily PM2.5 concentrations at 1 km resolution in the Eastern United States","volume":"199","author":"Goldberg","year":"2019","journal-title":"Atmos. Environ."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/1\/126\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:06:09Z","timestamp":1760159169000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/1\/126"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,1]]},"references-count":59,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2021,1]]}},"alternative-id":["rs13010126"],"URL":"https:\/\/doi.org\/10.3390\/rs13010126","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,1,1]]}}}