{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,20]],"date-time":"2025-12-20T22:20:45Z","timestamp":1766269245668,"version":"build-2065373602"},"reference-count":57,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2023,7,21]],"date-time":"2023-07-21T00:00:00Z","timestamp":1689897600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["42001343","42071384"],"award-info":[{"award-number":["42001343","42071384"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJGI"],"abstract":"<jats:p>Geographically weighted regression (GWR) is a classical method for estimating nonstationary relationships. Notwithstanding the great potential of the model for processing geographic data, its large-scale application still faces the challenge of high computational costs. To solve this problem, we proposed a computationally efficient GWR method, called K-Nearest Neighbors Geographically weighted regression (KNN-GWR). First, it utilizes a k-dimensional tree (KD tree) strategy to improve the speed of finding observations around the regression points, and, to optimize the memory complexity, the submatrices of neighbors are extracted from the matrix of the sample dataset. Next, the optimal bandwidth is found by referring to the spatial clustering relationship explained by K-means. Finally, the performance and accuracy of the proposed KNN-GWR method was evaluated using a simulated dataset and a Chinese house price dataset. The results demonstrated that the KNN-GWR method achieved computational efficiency thousands of times faster than existing GWR algorithms, while ensuring accuracy and significantly improving memory optimization. To the best of our knowledge, this method was able to run hundreds of thousands or millions of data on a standard computer, which can inform improvement in the efficiency of local regression models.<\/jats:p>","DOI":"10.3390\/ijgi12070295","type":"journal-article","created":{"date-parts":[[2023,7,24]],"date-time":"2023-07-24T01:12:28Z","timestamp":1690161148000},"page":"295","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["A New Algorithm for Large-Scale Geographically Weighted Regression with K-Nearest Neighbors"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0483-6249","authenticated-orcid":false,"given":"Xiaoyue","family":"Yang","sequence":"first","affiliation":[{"name":"School of Marine Technology and Geomatics, Jiangsu Ocean University, Lianyungang 222005, China"}]},{"given":"Yi","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Marine Technology and Geomatics, Jiangsu Ocean University, Lianyungang 222005, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9275-3250","authenticated-orcid":false,"given":"Shenghua","family":"Xu","sequence":"additional","affiliation":[{"name":"Research Center of Geospatial Big Data Application, Chinese Academy of Surveying and Mapping, Beijing 100036, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5055-0623","authenticated-orcid":false,"given":"Jiakuan","family":"Han","sequence":"additional","affiliation":[{"name":"School of Marine Technology and Geomatics, Jiangsu Ocean University, Lianyungang 222005, China"}]},{"given":"Zhengyuan","family":"Chai","sequence":"additional","affiliation":[{"name":"School of Marine Technology and Geomatics, Jiangsu Ocean University, Lianyungang 222005, China"}]},{"given":"Gang","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Marine Technology and Geomatics, Jiangsu Ocean University, Lianyungang 222005, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,7,21]]},"reference":[{"key":"ref_1","unstructured":"Fotheringham, A.S., Brunsdon, C., and Charlton, M.E. (2002). Geographically Weighted Regression, John Wiley & Sons."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1111\/j.1538-4632.1996.tb00936.x","article-title":"Geographically Weighted Regression: A Method for Exploring Spatial Nonstationarity","volume":"28","author":"Brunsdon","year":"1996","journal-title":"Geogr. Anal."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"148455","DOI":"10.1016\/j.scitotenv.2021.148455","article-title":"Digital mapping of zinc in urban topsoil using multisource geospatial data and random forest","volume":"792","author":"Shi","year":"2021","journal-title":"Sci. Total Environ."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1007\/s11442-017-1386-4","article-title":"Comparative evaluation of geological disaster susceptibility using multi-regression methods and spatial accuracy validation","volume":"27","author":"Jiang","year":"2017","journal-title":"J. Geogr. Sci."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"627","DOI":"10.1016\/j.geoderma.2012.05.022","article-title":"A geographically weighted regression kriging approach for mapping soil organic carbon stock","volume":"189\u2013190","author":"Kumar","year":"2012","journal-title":"Geoderma"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1111\/geb.12841","article-title":"Phylogenetically weighted regression: A method for modelling non-stationarity on evolutionary trees","volume":"28","author":"Davies","year":"2018","journal-title":"Glob. Ecol. Biogeogr."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1314","DOI":"10.1111\/geb.12203","article-title":"Generalizing the use of geographical weights in biodiversity modelling","volume":"23","author":"Mellin","year":"2014","journal-title":"Glob. Ecol. Biogeogr."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"102387","DOI":"10.1016\/j.trd.2020.102387","article-title":"Accessibility to transit, by transit, and property prices: Spatially varying relationships","volume":"85","author":"Yang","year":"2020","journal-title":"Transp. Res. Part D Transp. Environ."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"489","DOI":"10.1080\/13658816.2018.1545158","article-title":"Multiscale geographically and temporally weighted regression: Exploring the spatiotemporal determinants of housing prices","volume":"33","author":"Wu","year":"2018","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1111\/gean.12071","article-title":"Geographical and Temporal Weighted Regression (GTWR)","volume":"47","author":"Fotheringham","year":"2015","journal-title":"Geogr. Anal."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"383","DOI":"10.1080\/13658810802672469","article-title":"Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices","volume":"24","author":"Huang","year":"2010","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1611","DOI":"10.1080\/13658816.2021.1882681","article-title":"Spatiotemporal effects of climate factors on childhood hand, foot, and mouth disease: A case study using mixed geographically and temporally weighted regression models","volume":"35","author":"Hong","year":"2021","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"17707","DOI":"10.1038\/s41598-018-35721-9","article-title":"Exploration of potential risks of Hand, Foot, and Mouth Disease in Inner Mongolia Autonomous Region, China Using Geographically Weighted Regression Model","volume":"8","author":"Hong","year":"2018","journal-title":"Sci. Rep."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1080\/13658816.2011.585612","article-title":"Modelling spatial heterogeneity and anisotropy: Child anaemia, sanitation and basic infrastructure in sub-Saharan Africa","volume":"26","author":"Mainardi","year":"2012","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"22282","DOI":"10.1038\/s41598-021-01757-7","article-title":"Assessing the impact of land surface temperature on urban net primary productivity increment based on geographically weighted regression model","volume":"11","author":"Lu","year":"2021","journal-title":"Sci. Rep."},{"key":"ref_16","unstructured":"Bivand, R., Yu, D., Nakaya, T., and Garcia-Lopez, M.-A. (2022). Package SPGWR, R Foundation for Statistical Computing. R Software Package."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Oshan, T., Li, Z., Kang, W., Wolf, L., and Fotheringham, A. (2019). mgwr: A Python Implementation of Multiscale Geographically Weighted Regression for Investigating Process Spatial Heterogeneity and Scale. ISPRS Int. J. Geo-Inf., 8.","DOI":"10.3390\/ijgi8060269"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v063.i17","article-title":"GWmodel: An R Package for Exploring Spatial Heterogeneity Using Geographically Weighted Models","volume":"63","author":"Gollini","year":"2015","journal-title":"J. Stat. Softw."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1080\/13658816.2018.1521523","article-title":"Fast Geographically Weighted Regression (FastGWR): A scalable algorithm to investigate spatial process heterogeneity in millions of observations","volume":"33","author":"Li","year":"2019","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"832","DOI":"10.1080\/17538947.2019.1585976","article-title":"Big Earth data: Disruptive changes in Earth observation data management and analysis?","volume":"13","author":"Sudmanns","year":"2019","journal-title":"Int. J. Digit. Earth"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1016\/j.future.2014.10.029","article-title":"Remote sensing big data computing: Challenges and opportunities","volume":"51","author":"Ma","year":"2015","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1186\/s40537-019-0197-0","article-title":"A survey on Image Data Augmentation for Deep Learning","volume":"6","author":"Shorten","year":"2019","journal-title":"J. Big Data"},{"key":"ref_23","first-page":"346","article-title":"Reflections and speculations on the progress in Geographic Information Systems (GIS): A geographic perspective","volume":"33","author":"Batty","year":"2018","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"6999","DOI":"10.1021\/acs.est.7b00891","article-title":"High-Resolution Air Pollution Mapping with Google Street View Cars: Exploiting Big Data","volume":"51","author":"Apte","year":"2017","journal-title":"Environ. Sci. Technol."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1016\/j.bdr.2015.01.003","article-title":"Geospatial Big Data: Challenges and Opportunities","volume":"2","author":"Lee","year":"2015","journal-title":"Big Data Res."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"701","DOI":"10.14358\/PERS.86.11.701","article-title":"A New Approach to Land Registry System in Turkey: Blockchain-Based System Proposal","volume":"86","author":"Mendi","year":"2020","journal-title":"Photogramm. Eng. Remote Sens."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1111\/j.2041-210X.2010.00060.x","article-title":"Comparing spatially-varying coefficients models for analysis of ecological data with non-stationary and anisotropic residual dependence","volume":"2","author":"Finley","year":"2011","journal-title":"Methods Ecol. Evol."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1111\/j.1467-9671.2009.01181.x","article-title":"Grid-enabling Geographically Weighted Regression: A Case Study of Participation in Higher Education in England","volume":"14","author":"Harris","year":"2010","journal-title":"Trans. GIS"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"267","DOI":"10.2747\/1548-1603.44.3.267","article-title":"Modeling Owner-Occupied Single-Family House Values in the City of Milwaukee: A Geographically Weighted Regression Approach","volume":"44","author":"Yu","year":"2007","journal-title":"GISci. Remote Sens."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1016\/j.jtrangeo.2018.03.002","article-title":"A massive geographically weighted regression model of walking-environment relationships","volume":"68","author":"Feuillet","year":"2018","journal-title":"J. Transp. Geogr."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Wang, D., Yang, Y., Qiu, A., Kang, X., Han, J., and Chai, Z. (2020). A CUDA-Based Parallel Geographically Weighted Regression for Large-Scale Geographic Data. ISPRS Int. J. Geo-Inf., 9.","DOI":"10.3390\/ijgi9110653"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1016\/j.neucom.2020.02.058","article-title":"RNN-GWR: A geographically weighted regression approach for frequently updated data","volume":"399","author":"Tasyurek","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_33","unstructured":"and Gill, S. (2018, January 17\u201318). k-dLst Tree: K-d Tree with Linked List to Handle Duplicate Keys. Proceedings of the Emerging Trends in Expert Applications and Security, Singapore."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"107156","DOI":"10.1016\/j.asoc.2021.107156","article-title":"KDT-SPSO: A multimodal particle swarm optimisation algorithm based on k-d trees for palm tree detection","volume":"103","author":"Chen","year":"2021","journal-title":"Appl. Soft Comput."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"W572","DOI":"10.1093\/nar\/gkh436","article-title":"ProteinDBS: A real-time retrieval system for protein structure comparison","volume":"32","author":"Shyu","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"728","DOI":"10.1007\/s10115-003-0122-9","article-title":"The k-Nearest Neighbour Join: Turbo Charging the KDD Process","volume":"6","author":"Krebs","year":"2004","journal-title":"Knowl. Inf. Syst."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2227","DOI":"10.1109\/TPAMI.2014.2321376","article-title":"Scalable Nearest Neighbor Algorithms for High Dimensional Data","volume":"36","author":"Muja","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_38","first-page":"331","article-title":"Fast approximate nearest neighbors with automatic algorithm configuration","volume":"1","author":"Muja","year":"2009","journal-title":"Proc. Viss."},{"key":"ref_39","first-page":"1","article-title":"Outlier detection: Methods, models, and classification","volume":"53","author":"Boukerche","year":"2020","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"651","DOI":"10.1016\/j.patrec.2009.09.011","article-title":"Data clustering: 50 years beyond K-means","volume":"31","author":"Jain","year":"2010","journal-title":"Pattern Recognit. Lett."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1109\/TETC.2014.2330519","article-title":"A Survey of Clustering Algorithms for Big Data: Taxonomy and Empirical Analysis","volume":"2","author":"Fahad","year":"2014","journal-title":"IEEE Trans. Emerg. Top. Comput."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/j.neucom.2018.02.072","article-title":"k-means: A revisit","volume":"291","author":"Zhao","year":"2018","journal-title":"Neurocomputing"},{"key":"ref_43","unstructured":"Macqueen, J. (July, January 21). Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1109\/TPAMI.1984.4767478","article-title":"K-Means-Type Algorithms: A Generalized Convergence Theorem and Characterization of Local Optimality","volume":"PAMI-6","author":"Selim","year":"1984","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"102631","DOI":"10.1016\/j.jtrangeo.2019.102631","article-title":"Spatially varying impacts of built environment factors on rail transit ridership at station level: A case study in Guangzhou, China","volume":"82","author":"Li","year":"2020","journal-title":"J. Transp. Geogr."},{"key":"ref_46","first-page":"100622","article-title":"Using accommodation price determinants to segment tourist areas","volume":"21","year":"2021","journal-title":"J. Destin. Mark. Manag."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"135768","DOI":"10.1016\/j.jclepro.2022.135768","article-title":"Unraveling the association between the built environment and air pollution from a geospatial perspective","volume":"386","author":"Deng","year":"2023","journal-title":"J. Clean. Prod."},{"key":"ref_48","first-page":"459","article-title":"Scalable GWR: A Linear-Time Algorithm for Large-Scale Geographically Weighted Regression with Polynomial Kernels","volume":"111","author":"Murakami","year":"2020","journal-title":"Ann. Am. Assoc. Geogr."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1016\/j.spasta.2019.02.003","article-title":"Spatially varying coefficient modeling for large datasets: Eliminating N from spatial regressions","volume":"30","author":"Murakami","year":"2019","journal-title":"Spat. Stat."},{"key":"ref_50","unstructured":"Mardia, K.V., Kent, J.T., and Bibby, J.M. (1979). Multivariate Analysis, Academic Press."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1016\/j.elerap.2011.12.006","article-title":"Rsqrt: An Heuristic for Estimating the Number of Clusters to Report","volume":"11","author":"Carlis","year":"2012","journal-title":"Electron. Commer. Res. Appl."},{"key":"ref_52","first-page":"33","article-title":"Solving the Problem of the K Parameter in the KNN Classifier Using an Ensemble Learning Approach","volume":"12","author":"Hassanat","year":"2014","journal-title":"Comput. Sci."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"750","DOI":"10.1198\/016214503000000666","article-title":"Finding the Number of Clusters in a Dataset","volume":"98","author":"Sugar","year":"2003","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1111\/1467-9868.00293","article-title":"Estimating the number of clusters in a data set via the gap statistic","volume":"63","author":"Tibshirani","year":"2001","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"ref_55","unstructured":"Press, W.H., Teukolsky, S.A., Vetterling, W.T., and Flannery, B.P. (2007). Numerical Recipes: The Art of Scientific Computing, Cambridge University Press. [3rd ed.]."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"657","DOI":"10.1007\/s11004-010-9284-7","article-title":"The Use of Geographically Weighted Regression for Spatial Prediction: An Evaluation of Models Using Simulated Data Sets","volume":"42","author":"Harris","year":"2010","journal-title":"Math. Geosci."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"737","DOI":"10.1016\/j.econmod.2020.02.015","article-title":"Scale-adaptive estimation of mixed geographically weighted regression models","volume":"94","author":"Chen","year":"2021","journal-title":"Econ. Model."}],"container-title":["ISPRS International Journal of Geo-Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2220-9964\/12\/7\/295\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:16:49Z","timestamp":1760127409000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2220-9964\/12\/7\/295"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,21]]},"references-count":57,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2023,7]]}},"alternative-id":["ijgi12070295"],"URL":"https:\/\/doi.org\/10.3390\/ijgi12070295","relation":{},"ISSN":["2220-9964"],"issn-type":[{"type":"electronic","value":"2220-9964"}],"subject":[],"published":{"date-parts":[[2023,7,21]]}}}