{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:25:00Z","timestamp":1750220700624,"version":"3.41.0"},"reference-count":63,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2019,11,25]],"date-time":"2019-11-25T00:00:00Z","timestamp":1574640000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation, USA","doi-asserted-by":"publisher","award":["IIS-1907855, IIS-1525953, and CNS-1512877"],"award-info":[{"award-number":["IIS-1907855, IIS-1525953, and CNS-1512877"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Spatial Algorithms Syst."],"published-print":{"date-parts":[[2019,12,31]]},"abstract":"<jats:p>Autologistic regression is one of the most popular statistical tools to predict spatial phenomena in several applications, including epidemic diseases detection, species occurrence prediction, earth observation, and business management. In general, autologistic regression divides the space into a two-dimensional grid, where the prediction is performed at each cell in the grid. The prediction at any location is based on a set of predictors (i.e., features) at this location and predictions from neighboring locations. In this article, we address the problem of building efficient autologistic models with multinomial (i.e., categorical) prediction and predictor variables, where the categories represented by these variables are unordered. 
Unfortunately, existing methods to build autologistic models are designed for binary variables in addition to being computationally expensive (i.e., do not scale up for large-scale grid data such as fine-grained satellite images). Therefore, we introduce <jats:italic>RegRocket<\/jats:italic>: a scalable framework to build multinomial autologistic models for predicting large-scale spatial phenomena. <jats:italic>RegRocket<\/jats:italic> considers both the accuracy and efficiency aspects when learning the regression model parameters. To this end, <jats:italic>RegRocket<\/jats:italic> is built on top of Markov Logic Network (MLN), a scalable statistical learning framework, where its internals and data structures are optimized to process spatial data. <jats:italic>RegRocket<\/jats:italic> provides an equivalent representation of the multinomial prediction and predictor variables using MLN where the dependencies between these variables are transformed into first-order logic predicates. Then, <jats:italic>RegRocket<\/jats:italic> employs an efficient framework that learns the model parameters from the MLN representation in a distributed manner. 
Extensive experimental results based on two large real datasets show that <jats:italic>RegRocket<\/jats:italic> can build multinomial autologistic models, in minutes, for 1 million grid cells with 0.85 average F1-score.<\/jats:p>","DOI":"10.1145\/3366459","type":"journal-article","created":{"date-parts":[[2019,11,25]],"date-time":"2019-11-25T13:19:46Z","timestamp":1574687986000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["RegRocket"],"prefix":"10.1145","volume":"5","author":[{"given":"Ibrahim","family":"Sabek","sequence":"first","affiliation":[{"name":"University of Minnesota, USA"}]},{"given":"Mashaal","family":"Musleh","sequence":"additional","affiliation":[{"name":"University of Minnesota, USA"}]},{"given":"Mohamed F.","family":"Mokbel","sequence":"additional","affiliation":[{"name":"Qatar Computing Research Institute and University of Minnesota, USA"}]}],"member":"320","published-online":{"date-parts":[[2019,11,25]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.2307\/2404755"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1461-0248.2009.01422.x"},{"key":"e_1_2_1_3_1","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1111\/j.2517-6161.1974.tb00999.x","article-title":"Spatial interaction and the statistical analysis of lattice systems","volume":"36","author":"Besag Julian","year":"1974","journal-title":"J. Roy. Stat. Soc."},{"key":"e_1_2_1_4_1","first-page":"179","article-title":"Statistical analysis of non-lattice data","volume":"24","author":"Besag Julian","year":"1975","journal-title":"J. Roy. Stat. Soc."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1080\/02664763.2017.1357684"},{"key":"e_1_2_1_6_1","first-page":"281","article-title":"Autologistic models with interpretable parameters. J. Agricult., Biol","volume":"14","author":"Caragea Petruta C.","year":"2009","journal-title":"Environ. 
Stat."},{"volume-title":"Proceedings of the ICML Workshop on Structured Learning: Inferring Graphs from Structured and Unstructured Inputs.","year":"2013","author":"Chen Yang","key":"e_1_2_1_7_1"},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1111\/j.2517-6161.1958.tb00292.x","article-title":"The regression analysis of binary sequences (with discussion)","volume":"20","author":"Cox David R.","year":"1958","journal-title":"J. Roy. Stat. Soc."},{"volume-title":"Proceedings of the MLG Workshop at ACM SIGKDD.","author":"Crane Robert","key":"e_1_2_1_9_1"},{"key":"e_1_2_1_10_1","unstructured":"Digital Elevation Model (DEM) of Minnesota [n.d.]. Retrieved from http:\/\/www.mngeo.state.mn.us\/chouse\/metadata\/dem24ras.html."},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-031-01549-6","volume-title":"Markov Logic: An Interface Layer for Artificial Intelligence","author":"Domingos Pedro","year":"2009"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-9574.1988.tb01238.x"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.2307\/2532953"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00288933"},{"key":"e_1_2_1_15_1","unstructured":"Anne G\u00e9gout-Petit and Shuxian Li. 2016. Two-step centered spatio-temporal auto-logistic regression model. 
In Applied Statistics for Development in Africa SADA."},{"volume-title":"Logical Foundations of Artificial Intelligence","author":"Genesereth Michael","key":"e_1_2_1_16_1"},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1111\/j.2517-6161.1994.tb01976.x","article-title":"On the convergence of Monte Carlo maximum likelihood calculations","volume":"56","author":"Geyer Charles J.","year":"1994","journal-title":"J. Roy. Stat. Soc."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.2307\/1400401"},{"key":"e_1_2_1_19_1","first-page":"131","article-title":"Autologistic model of spatial pattern of phytophthora epidemic in bell pepper: Effects of soil variables on disease presence. J. Agricult., Biol","volume":"2","author":"Gumpertz Marcia L.","year":"1997","journal-title":"Environ. Stat."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/971697.602266"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1111\/1467-9469.00113"},{"volume-title":"Gaussian Random Field Models for Spatial Data","author":"Haran Murali","key":"e_1_2_1_22_1"},{"key":"e_1_2_1_23_1","first-page":"205","article-title":"Autologistic regression model for the distribution of vegetation. J. Agricult., Biol","volume":"8","author":"He Fangliang","year":"2003","journal-title":"Environ. Stat."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.2307\/1400634"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.32614\/RJ-2014-026"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1002\/env.1102"},{"volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201908)","author":"Tuyen","key":"e_1_2_1_27_1"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1080\/0143116031000082073"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2011.2162649"},{"edition":"3","volume-title":"Markov Random Field Modeling in Image Analysis","author":"Stan Z. 
Li.","key":"e_1_2_1_30_1"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1186\/1476-072X-11-7"},{"volume-title":"India. In Proceedings of the IEEE International Conference on Big Data (BigData\u201914)","author":"Lopez Daphne","key":"e_1_2_1_32_1"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1093\/biomet\/93.2.451"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.1070.0760"},{"key":"e_1_2_1_35_1","unstructured":"Multi-Resolution Land Cover Characteristics (MRLC) Consortium [n.d.]. Retrieved from https:\/\/www.mrlc.gov\/data."},{"key":"e_1_2_1_36_1","unstructured":"NASA EarthData [n.d.]. Retrieved from https:\/\/earthdata.nasa.gov\/earth-observation-data."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2841051"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/348.318586"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.14778\/1978665.1978669"},{"key":"e_1_2_1_40_1","first-page":"11","article-title":"HoloClean: Holistic data repairs with probabilistic inference","volume":"10","author":"Rekatsinas Theodoros","year":"2017","journal-title":"VLDB Journal"},{"volume-title":"Gaussian Markov Random Fields: Theory And Applications (Monographs on Statistics and Applied Probability)","author":"Rue Havard","key":"e_1_2_1_41_1"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3274895.3274987"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1089\/cmb.2010.0044"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.0906-7590.2005.04093.x"},{"key":"e_1_2_1_45_1","first-page":"319","article-title":"Predicting species occurrences: Issues of accuracy and scale","volume":"84","author":"Scott J. 
Michael","year":"2002","journal-title":"J. Mammalogy"},{"volume-title":"Proceedings of the International Conference on Very Large Data Bases (VLDB\u201907)","year":"2007","author":"Shen Warren","key":"e_1_2_1_46_1"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1080\/00949650412331320873"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.14778\/2809974.2809991"},{"volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence (AAAI\u201905)","year":"2005","author":"Singla Parag","key":"e_1_2_1_49_1"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.biocon.2009.05.006"},{"volume-title":"Proceedings of the International Conference on Knowledge and Smart Technology(KST\u201917)","author":"Tangthaikwan K.","key":"e_1_2_1_51_1"},{"volume-title":"Mohammad Javad Yazdanpanah, Bryan Christopher Pijanowski, Sara Saeedi, and Amir Hossein Tayyebi.","year":"2010","author":"Tayyebi Amin","key":"e_1_2_1_52_1"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-009-9394-5_18"},{"key":"e_1_2_1_54_1","unstructured":"USGS National Land Cover Dataset (NLCD) [n.d.]. Retrieved from https:\/\/catalog.data.gov\/dataset\/usgs-national-land-cover-dataset-nlcd-downloadable-data-collection."},{"key":"e_1_2_1_55_1","unstructured":"USGS National Transportation Dataset (NTD) [n.d.]. 
Retrieved from https:\/\/catalog.data.gov\/dataset\/usgs-national-transportation-dataset-ntd-downloadable-data-collectionde7d2."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10651-012-0206-3"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920942"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.3389\/fams.2017.00024"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12561-016-9185-5"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2463702"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1198\/106186008X289641"},{"key":"e_1_2_1_62_1","first-page":"212","article-title":"Modeling spatial-temporal binary data using Markov random fields. J. Agricult., Biol","volume":"10","author":"Zhu Jun","year":"2005","journal-title":"Environ. Stat."},{"volume-title":"Proceedings of the Conference on Neural Information Processing Systems (NIPS\u201910)","author":"Zinkevich Martin","key":"e_1_2_1_63_1"}],"container-title":["ACM Transactions on Spatial Algorithms and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3366459","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3366459","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3366459","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:32:54Z","timestamp":1750199574000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3366459"}},"subtitle":["Scalable Multinomial Autologistic Regression with Unordered Categorical Variables Using Markov Logic 
Networks"],"short-title":[],"issued":{"date-parts":[[2019,11,25]]},"references-count":63,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2019,12,31]]}},"alternative-id":["10.1145\/3366459"],"URL":"https:\/\/doi.org\/10.1145\/3366459","relation":{},"ISSN":["2374-0353","2374-0361"],"issn-type":[{"type":"print","value":"2374-0353"},{"type":"electronic","value":"2374-0361"}],"subject":[],"published":{"date-parts":[[2019,11,25]]},"assertion":[{"value":"2018-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-11-25","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}