{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T18:17:00Z","timestamp":1775672220921,"version":"3.50.1"},"reference-count":57,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2016,11,11]],"date-time":"2016-11-11T00:00:00Z","timestamp":1478822400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Regression tree models have been widely used for remote sensing-based ecosystem mapping. Improper use of the sample data (model training and testing data) may cause overfitting and underfitting effects in the model. The goal of this study is to develop an optimal sampling data usage strategy for any dataset and identify an appropriate number of rules in the regression tree model that will improve its accuracy and robustness. Landsat 8 data and Moderate-Resolution Imaging Spectroradiometer-scaled Normalized Difference Vegetation Index (NDVI) were used to develop regression tree models. A Python procedure was designed to generate random replications of model parameter options across a range of model development data sizes and rule number constraints. The mean absolute difference (MAD) between the predicted and actual NDVI (scaled NDVI, value from 0\u2013200) and its variability across the different randomized replications were calculated to assess the accuracy and stability of the models. In our case study, a six-rule regression tree model developed from 80% of the sample data had the lowest MAD (MADtraining = 2.5 and MADtesting = 2.4), which was suggested as the optimal model. 
This study demonstrates how the training data and rule number selections impact model accuracy and provides important guidance for future remote-sensing-based ecosystem modeling.<\/jats:p>","DOI":"10.3390\/rs8110943","type":"journal-article","created":{"date-parts":[[2016,11,11]],"date-time":"2016-11-11T10:05:56Z","timestamp":1478858756000},"page":"943","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":55,"title":["An Optimal Sample Data Usage Strategy to Minimize Overfitting and Underfitting Effects in Regression Tree Models Based on Remotely-Sensed Data"],"prefix":"10.3390","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3544-1856","authenticated-orcid":false,"given":"Yingxin","family":"Gu","sequence":"first","affiliation":[{"name":"ASRC InuTeq, Contractor to US Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center, 47914 252nd Street, Sioux Falls, SD 57198, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7374-1083","authenticated-orcid":false,"given":"Bruce","family":"Wylie","sequence":"additional","affiliation":[{"name":"US Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center, 47914 252nd Street, Sioux Falls, SD 57198, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5462-3225","authenticated-orcid":false,"given":"Stephen","family":"Boyte","sequence":"additional","affiliation":[{"name":"Stinger Ghaffarian Technologies (SGT), Contractor to US Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center, 47914 252nd Street, Sioux Falls, SD 57198, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4021-4623","authenticated-orcid":false,"given":"Joshua","family":"Picotte","sequence":"additional","affiliation":[{"name":"ASRC InuTeq, Contractor to US Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center, 47914 252nd Street, Sioux Falls, SD 57198, 
USA"}]},{"given":"Daniel","family":"Howard","sequence":"additional","affiliation":[{"name":"Stinger Ghaffarian Technologies (SGT), Contractor to US Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center, 47914 252nd Street, Sioux Falls, SD 57198, USA"}]},{"given":"Kelcy","family":"Smith","sequence":"additional","affiliation":[{"name":"Stinger Ghaffarian Technologies (SGT), Contractor to US Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center, 47914 252nd Street, Sioux Falls, SD 57198, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4911-4511","authenticated-orcid":false,"given":"Kurtis","family":"Nelson","sequence":"additional","affiliation":[{"name":"US Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center, 47914 252nd Street, Sioux Falls, SD 57198, USA"}]}],"member":"1968","published-online":{"date-parts":[[2016,11,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Anderson, J.R., Hardy, E.E., Roach, J.T., and Witmer, R.E. (1976). A Land Use and Land Cover Classification System for Use with Remote Sensor Data.","DOI":"10.3133\/pp964"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"526","DOI":"10.3390\/rs2020526","article-title":"Phenological classification of the United States: A geographic framework for extending multi-sensor time-series data","volume":"2","author":"Gu","year":"2010","journal-title":"Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"196","DOI":"10.1080\/17538940802038366","article-title":"Integrating modelling and remote sensing to identify ecosystem performance anomalies in the boreal forest, Yukon River Basin, Alaska","volume":"1","author":"Wylie","year":"2008","journal-title":"Int. J. Digit. 
Earth"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1880","DOI":"10.3390\/rs2081880","article-title":"Detecting ecosystem performance anomalies for land management in the upper colorado river basin using satellite observations, climate data, and ecosystem models","volume":"2","author":"Gu","year":"2010","journal-title":"Remote Sens."},{"key":"ref_5","first-page":"345","article-title":"Completion of the 2011 national land cover database for the conterminous United States\u2013representing a decade of land cover change information","volume":"81","author":"Homer","year":"2015","journal-title":"Photogramm. Eng. Remote Sens."},{"key":"ref_6","first-page":"233","article-title":"Multi-scale remote sensing sagebrush characterization with regression trees over wyoming, USA: Laying a foundation for monitoring","volume":"14","author":"Homer","year":"2012","journal-title":"Int. J. Appl. Earth Obs. Geoinf."},{"key":"ref_7","first-page":"71","article-title":"Drought monitoring with ndvi-based standardized vegetation index","volume":"68","author":"Peters","year":"2002","journal-title":"Photogramm. Eng. Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"811","DOI":"10.1029\/93GB02725","article-title":"Terrestrial ecosystem production: A process model based on global satellite and surface data","volume":"7","author":"Potter","year":"1993","journal-title":"Glob. Biogeochem. Cycles"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1016\/0034-4257(85)90097-5","article-title":"Satellite remote sensing of total herbaceous biomass production in the senegalese sahel: 1980\u20131984","volume":"17","author":"Tucker","year":"1985","journal-title":"Remote Sens. Environ."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"703","DOI":"10.2307\/3235884","article-title":"Measuring phenological variability from satellite imagery","volume":"5","author":"Reed","year":"1994","journal-title":"J. Veg. 
Sci."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1303","DOI":"10.1080\/014311600210191","article-title":"Development of a global land cover characteristics database and IGBP DISCover from 1 km AVHRR data","volume":"21","author":"Loveland","year":"2000","journal-title":"Int. J. Remote Sens."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"19","DOI":"10.2111\/04-116R2.1","article-title":"A protocol for retrospective remote sensing-based ecological monitoring of rangelands","volume":"59","author":"West","year":"2006","journal-title":"Rangel. Ecol. Manag."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1071\/WF9980159","article-title":"Fuel models and fire potential from satellite and surface observations","volume":"8","author":"Burgan","year":"1998","journal-title":"Int. J. Wildland Fire"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1016\/j.rse.2014.01.011","article-title":"Continuous change detection and classification of land cover using all available Landsat data","volume":"144","author":"Zhu","year":"2014","journal-title":"Remote Sens. Environ."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1016\/j.isprsjprs.2014.09.002","article-title":"Global land cover mapping at 30 m resolution: A pok-based operational approach","volume":"103","author":"Chen","year":"2015","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_16","first-page":"30","article-title":"Next generation of global land cover characterization, mapping, and monitoring","volume":"25","author":"Giri","year":"2013","journal-title":"Int. J. Appl. Earth Obs. Geoinform."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Schwartz, M.D. (2003). 
Phenology: An Integrative Environmental Science, Kluwer Academic Publ.","DOI":"10.1007\/978-94-007-0632-3"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2187","DOI":"10.1080\/01431161.2012.742215","article-title":"MODIS-informed greenness responses to daytime land surface temperature fluctuations and wildfire disturbances in the Alaskan Yukon River Basin","volume":"34","author":"Tan","year":"2012","journal-title":"Int. J. Remote Sens."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2335","DOI":"10.1111\/j.1365-2486.2009.01910.x","article-title":"Intercomparison, interpretation, and assessment of spring phenology in North America estimated from remote sensing for 1982\u20132006","volume":"15","author":"White","year":"2009","journal-title":"Glob. Chang. Biol."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1312","DOI":"10.1016\/j.rse.2010.01.010","article-title":"A generalized regression-based model for forecasting winter wheat yields in Kansas and Ukraine using MODIS data","volume":"114","author":"Vermote","year":"2010","journal-title":"Remote Sens. Environ."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Howard, D.M., Wylie, B.K., and Tieszen, L.L. (2012). Crop classification modelling using remote sensing and environmental data in the greater Platte River Basin, USA. Int. J. Remote Sens., 33.","DOI":"10.1080\/01431161.2012.680617"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Wylie, B.K., Boyte, S.P., and Major, D.J. (2012). Ecosystem performance monitoring of rangelands by integrating modeling and remote sensing. Rangel. Ecol. 
Manag., 65.","DOI":"10.2111\/REM-D-11-00058.1"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1016\/j.agrformet.2015.10.011","article-title":"Drought assessment and monitoring through blending of multi-sensor indices using machine learning approaches for different climate regions","volume":"216","author":"Park","year":"2016","journal-title":"Agric. Forest Meteorol."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1016\/j.rse.2007.02.016","article-title":"Developing a continental-scale measure of gross primary production by combining MODIS and ameriflux data through support vector machine approach","volume":"110","author":"Yang","year":"2007","journal-title":"Remote Sens. Environ."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1016\/j.rse.2009.10.013","article-title":"A continuous measure of gross primary production for the conterminous United States derived from MODIS and ameriflux data","volume":"114","author":"Xiao","year":"2010","journal-title":"Remote Sens. Environ."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Zhang, L., Wylie, B.K., Ji, L., Gilmanov, T.G., Tieszen, L.L., and Howard, D.M. (2011). Upscaling carbon fluxes over the great plains grasslands: Sinks and sources. J. Geophys. Res. Biogeosci., 116.","DOI":"10.1029\/2010JG001504"},{"key":"ref_27","unstructured":"RuleQuest Research. Available online: http:\/\/www.rulequest.com\/."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"40","DOI":"10.2111\/08-232.1","article-title":"Climate-driven interannual variability in net ecosystem exchange in the Northern Great Plains Grasslands","volume":"63","author":"Zhang","year":"2010","journal-title":"Rangel. Ecol. 
Manag."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"3489","DOI":"10.3390\/rs70403489","article-title":"Downscaling 250-m MODIS growing season NDVI based on multiple-date Landsat images and data mining approaches","volume":"7","author":"Gu","year":"2015","journal-title":"Remote Sens."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Boyte, S.P., Wylie, B.K., Major, D.J., and Brown, J.F. (2013). The integration of geophysical and enhanced moderate resolution imaging spectroradiometer normalized difference vegetation index data into a rule-based, piecewise regression-tree model to estimate cheatgrass beginning of spring growth. Int. J. Digit. Earth, 8.","DOI":"10.1080\/17538947.2013.860196"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"16","DOI":"10.2747\/1548-1603.45.1.16","article-title":"The vegetation drought response index (vegdri): A new integrated approach for monitoring drought stress in vegetation","volume":"45","author":"Brown","year":"2008","journal-title":"GISci. Remote Sens."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer. [2nd ed.].","DOI":"10.1007\/978-0-387-84858-7"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1007\/s102080010030","article-title":"Best choices for regularization parameters in learning theory: On the bias\u2014Variance problem","volume":"2","author":"Smale","year":"2002","journal-title":"Found. Comput. Math."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Gavrilova, M., Gervasi, O., Kumar, V., Tan, C.J.K., Taniar, D., Lagan\u00e1, A., Mun, Y., and Choo, H. (2006). Computational Science and Its Applications\u2014ICCSA 2006: International Conference, Glasgow, Uk, 8\u201311 May 2006. 
Proceedings, Part I, Springer.","DOI":"10.1007\/11751595"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Quinlan, J.R. (1993, January 27\u201329). Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning, Amherst, MA, USA.","DOI":"10.1016\/B978-1-55860-307-3.50037-X"},{"key":"ref_36","unstructured":"Rouse, J.W., Haas, H.R., Deering, D.W., Schell, J.A., and Harlan, J.C. (1974). Monitoring the Vernal Advancement and Retrogradation (Green Wave Effect) of Natural Vegetation, NTRS."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1016\/0034-4257(79)90013-0","article-title":"Red and photographic infrared linear combinations for monitoring vegetation","volume":"8","author":"Tucker","year":"1979","journal-title":"Remote Sens. Environ."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1225","DOI":"10.1175\/1520-0469(1998)055<1225:SSDASP>2.0.CO;2","article-title":"Satellite-sensed distribution and spatial patterns of vegetation parameters over a Tallgrass Prairie","volume":"55","author":"Chen","year":"1998","journal-title":"J. Atmos. Sci."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1016\/j.rse.2008.08.015","article-title":"Phenologically-tuned MODIS NDVI-based production anomaly estimates for Zimbabwe","volume":"113","author":"Funk","year":"2009","journal-title":"Remote Sens. Environ."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1016\/j.ecolind.2012.05.024","article-title":"Mapping grassland productivity with 250-m emodis NDVI and ssurgo database over the greater Platte River Basin, USA","volume":"24","author":"Gu","year":"2013","journal-title":"Ecol. 
Indic."},{"key":"ref_41","unstructured":"MODIS Products Table, Available online: https:\/\/lpdaac.usgs.gov\/dataset_discovery\/modis\/modis_products_table."},{"key":"ref_42","first-page":"59","article-title":"NDVI, C3 and C4 production, and distributions in Great Plains grassland land cover classes","volume":"7","author":"Tieszen","year":"1997","journal-title":"Ecol. Appl."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"159","DOI":"10.2307\/4002804","article-title":"Satellite-based herbaceous biomass estimates in the pastoral zone of Niger","volume":"48","author":"Wylie","year":"1995","journal-title":"J. Range Manag."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1016\/j.rse.2015.10.018","article-title":"Developing a 30-m grassland productivity estimation map for Central Nebraska using 250-m MODIS and 30-m Landsat-8 observations","volume":"171","author":"Gu","year":"2015","journal-title":"Remote Sens. Environ."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"573","DOI":"10.14358\/PERS.81.7.573","article-title":"A Landsat data tiling and compositing approach optimized for change detection in the conterminous United States","volume":"81","author":"Nelson","year":"2015","journal-title":"Photogramm. Eng. Remote Sens."},{"key":"ref_46","unstructured":"USGS eMODIS Data, Available online: https:\/\/lta.cr.usgs.gov\/emodis."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Jenkerson, C.B., Maiersperger, T.K., and Schmidt, G.L. (2010). Emodis\u2014A User-Friendly Data Source.","DOI":"10.3133\/ofr20101055"},{"key":"ref_48","unstructured":"Swets, D.L., Reed, B.C., Rowland, J.R., and Marko, S.E. (1999, January 17\u201321). A weighted least-squares approach to temporal smoothing of NDVI. 
Proceedings of the ASPRS Annual Conference, From Image to Information, Portland, Oregon."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"16226","DOI":"10.3390\/rs71215825","article-title":"Application-ready expedited MODIS data for operational land surface monitoring of vegetation condition","volume":"7","author":"Brown","year":"2015","journal-title":"Remote Sens."},{"key":"ref_50","unstructured":"National Land Cover Database 2011, Available online: http:\/\/www.mrlc.gov\/nlcd2011.php."},{"key":"ref_51","unstructured":"Python Software Foundation. Available online: https:\/\/www.python.org\/."},{"key":"ref_52","unstructured":"Gu, Y., Wylie, B.K., and Boyte, S.P. Landsat 8 Six Spectral Band Data and MODIS NDVI Data for Assessing the Optimal Regression Tree Models. Available online: https:\/\/dx.doi.org\/10.5066\/F7319T1P."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"1467","DOI":"10.1016\/j.neunet.2004.07.002","article-title":"Fast exact leave-one-out cross-validation of sparse least-squares support vector machines","volume":"17","author":"Cawley","year":"2004","journal-title":"Neural Netw."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1016\/S0034-4257(03)00004-X","article-title":"Calibration of remotely sensed, coarse resolution NDVI to CO2 fluxes in a sagebrush-steppe ecosystem","volume":"85","author":"Wylie","year":"2003","journal-title":"Remote Sens. Environ."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1016\/j.rse.2006.09.017","article-title":"Adaptive data-driven models for estimating carbon fluxes in the Northern Great Plains","volume":"106","author":"Wylie","year":"2007","journal-title":"Remote Sens. Environ."},{"key":"ref_56","first-page":"451","article-title":"Estimating aboveground biomass in interior alaska with Landsat data and field measurements","volume":"18","author":"Ji","year":"2012","journal-title":"Int. J. Appl. Earth Obs. 
Geoinf."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1016\/j.agrformet.2014.06.013","article-title":"Data-driven diagnostics of terrestrial carbon dynamics over North America","volume":"197","author":"Xiao","year":"2014","journal-title":"Agric. For. Meteorol."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/8\/11\/943\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T19:35:21Z","timestamp":1760211321000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/8\/11\/943"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,11,11]]},"references-count":57,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2016,11]]}},"alternative-id":["rs8110943"],"URL":"https:\/\/doi.org\/10.3390\/rs8110943","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,11,11]]}}}