{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T20:57:59Z","timestamp":1760043479492,"version":"3.41.0"},"reference-count":26,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2015,12,8]],"date-time":"2015-12-08T00:00:00Z","timestamp":1449532800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2016,1,7]]},"abstract":"<jats:p>Runtime code optimization and speculative execution are becoming increasingly prominent to leverage performance in the current multi- and many-core era. However, a wider and more efficient use of such techniques is mainly hampered by the prohibitive time overhead induced by centralized data race detection, dynamic code behavior modeling, and code generation. Most of the existing Thread Level Speculation (TLS) systems rely on naively slicing the target loops into chunks and trying to execute the chunks in parallel with the help of a centralized performance-penalizing verification module that takes care of data races. Due to the lack of a data dependence model, these speculative systems are not capable of doing advanced transformations, and, more importantly, the chances of rollback are high. The polyhedral model is a well-known mathematical model to analyze and optimize loop nests. The current state-of-art tools limit the application of the polyhedral model to static control codes. Thus, none of these tools can generally handle codes with while loops, indirect memory accesses, or pointers. Apollo (Automatic POLyhedral Loop Optimizer) is a framework that goes one step beyond and applies the polyhedral model dynamically by using TLS. Apollo can predict, at runtime, whether the codes are behaving linearly or not, and it applies polyhedral transformations on-the-fly. This article presents a novel system that enables Apollo to handle codes whose memory accesses and loop bounds are not necessarily linear. More generally, this approach expands the applicability of the polyhedral model at runtime to a wider class of codes. Plugging together both linear and nonlinear accesses to the dependence prediction model enables the application of polyhedral loop optimizing transformations even for nonlinear code kernels while also allowing a low-cost speculation verification.<\/jats:p>","DOI":"10.1145\/2838734","type":"journal-article","created":{"date-parts":[[2015,12,10]],"date-time":"2015-12-10T14:22:10Z","timestamp":1449757330000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["The Polyhedral Model of Nonlinear Loops"],"prefix":"10.1145","volume":"12","author":[{"given":"Aravind","family":"Sukumaran-Rajam","sequence":"first","affiliation":[{"name":"INRIA, Team CAMUS, ICube Lab, CNRS, University of Strasbourg, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Philippe","family":"Clauss","sequence":"additional","affiliation":[{"name":"INRIA, Team CAMUS, ICube Lab, CNRS, University of Strasbourg, France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2015,12,8]]},"reference":[{"volume-title":"Loop Transformations for Restructuring Compilers - The Foundations","author":"Banerjee U.","key":"e_1_2_2_1_1","unstructured":"U. Banerjee . 1993. Loop Transformations for Restructuring Compilers - The Foundations . Kluwer Academic Publishers . U. Banerjee. 1993. Loop Transformations for Restructuring Compilers - The Foundations. Kluwer Academic Publishers."},{"key":"e_1_2_2_2_1","unstructured":"Kevin Barker Thomas Benson Dan Campbell David Ediger Roberto Gioiosa Adolfy Hoisie Darren Kerbyson Joseph Manzano Andres Marquez Leon Song Nathan Tallent and Antonino Tumeo. 2013. PERFECT (Power Efficiency Revolution for Embedded Computing Technologies) Benchmark Suite Manual. Pacific Northwest National Laboratory and Georgia Tech Research Institute. http:\/\/hpc.pnnl.gov\/projects\/PERFECT\/.  Kevin Barker Thomas Benson Dan Campbell David Ediger Roberto Gioiosa Adolfy Hoisie Darren Kerbyson Joseph Manzano Andres Marquez Leon Song Nathan Tallent and Antonino Tumeo. 2013. PERFECT (Power Efficiency Revolution for Embedded Computing Technologies) Benchmark Suite Manual. Pacific Northwest National Laboratory and Georgia Tech Research Institute. http:\/\/hpc.pnnl.gov\/projects\/PERFECT\/."},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1133981.1134000"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1375581.1375595"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"e_1_2_2_6_1","volume-title":"Aiken","author":"Cohen Jacob","year":"2002","unstructured":"Jacob Cohen , Patricia Cohen , Stephen G. West , and Leona S . Aiken . 2002 . Applied Multiple Regression\/Correlation Analysis for the Behavioral Sciences (3rd ed.). Routledge . Jacob Cohen, Patricia Cohen, Stephen G. West, and Leona S. Aiken. 2002. Applied Multiple Regression\/Correlation Analysis for the Behavioral Sciences (3rd ed.). Routledge."},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF02577789"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/209937.209947"},{"volume-title":"Encyclopedia of Parallel Computing","author":"Feautrier Paul","key":"e_1_2_2_9_1","unstructured":"Paul Feautrier and Christian Lengauer . 2011. Polyhedron model . In Encyclopedia of Parallel Computing , David Padua (Ed.). Springer US , 1581--1592. DOI:http:\/\/dx.doi.org\/10.1007\/978-0-387-09766-4_502 10.1007\/978-0-387-09766-4_502 Paul Feautrier and Christian Lengauer. 2011. Polyhedron model. In Encyclopedia of Parallel Computing, David Padua (Ed.). Springer US, 1581--1592. DOI:http:\/\/dx.doi.org\/10.1007\/978-0-387-09766-4_502"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1880043.1880047"},{"key":"e_1_2_2_11_1","volume-title":"Automation & Test in Europe Conference & Exhibition, DATE","author":"Geuns Stefan J.","year":"2011","unstructured":"Stefan J. Geuns , Marco J. G. Bekooij , Tjerk Bijlsma , and Henk Corp oraal. 2011 . Parallelization of while loops in nested loop programs for shared-memory multiprocessor systems. In Design , Automation & Test in Europe Conference & Exhibition, DATE 2011. IEEE Computer Society, 1--6. http:\/\/doc.utwente.nl\/78154\/ Stefan J. Geuns, Marco J. G. Bekooij, Tjerk Bijlsma, and Henk Corporaal. 2011. Parallelization of while loops in nested loop programs for shared-memory multiprocessor systems. In Design, Automation & Test in Europe Conference & Exhibition, DATE 2011. IEEE Computer Society, 1--6. http:\/\/doc.utwente.nl\/78154\/"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/646661.699068"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10766-013-0259-4"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1229428.1229474"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1122971.1122997"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1866307.1866371"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1583991.1584050"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1735970.1736030"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1356058.1356082"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/207110.207148"},{"volume-title":"Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC\u201912)","author":"Ravishankar Mahesh","key":"e_1_2_2_22_1","unstructured":"Mahesh Ravishankar , John Eisenlohr , Louis-No\u00ebl Pouchet , J. Ramanujam , Atanas Rountev , and P. Sadayappan . 2012. Code generation for parallel execution of a class of irregular loops on distributed memory systems . In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC\u201912) . IEEE Computer Society Press, Los Alamitos, CA, USA. Mahesh Ravishankar, John Eisenlohr, Louis-No\u00ebl Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan. 2012. Code generation for parallel execution of a class of irregular loops on distributed memory systems. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC\u201912). IEEE Computer Society Press, Los Alamitos, CA, USA."},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2484904.2484911"},{"key":"e_1_2_2_24_1","volume-title":"Willy Wolff, Alexandra Jimborean, and Philippe Clauss.","author":"Sukumaran-Rajam Aravind","year":"2014","unstructured":"Aravind Sukumaran-Rajam , Juan Manuel Martinez , Willy Wolff, Alexandra Jimborean, and Philippe Clauss. 2014 . Speculative program parallelization with scalable and decentralized runtime verification. In Runtime Verification, Borzoo Bonakdarpour and Scott A. Smolka (Eds.), Vol. 8734 . Springer , Toronto, Canada, 124--139. DOI:http:\/\/dx.doi.org\/10.1007\/978-3-319-11164-3_11 10.1007\/978-3-319-11164-3_11 Aravind Sukumaran-Rajam, Juan Manuel Martinez, Willy Wolff, Alexandra Jimborean, and Philippe Clauss. 2014. Speculative program parallelization with scalable and decentralized runtime verification. In Runtime Verification, Borzoo Bonakdarpour and Scott A. Smolka (Eds.), Vol. 8734. Springer, Toronto, Canada, 124--139. DOI:http:\/\/dx.doi.org\/10.1007\/978-3-319-11164-3_11"},{"key":"e_1_2_2_25_1","volume-title":"Wijshoff","author":"van der Spek Harmen L. A.","year":"2008","unstructured":"Harmen L. A. van der Spek , Erwin M. Bakker , and Harry A. G . Wijshoff . 2008 . SPARK00: A benchmark package for the compiler evaluation of irregular\/sparse codes. CoRR abs\/0805.3897 (2008). Harmen L. A. van der Spek, Erwin M. Bakker, and Harry A. G. Wijshoff. 2008. SPARK00: A benchmark package for the compiler evaluation of irregular\/sparse codes. CoRR abs\/0805.3897 (2008)."},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2581122.2544141"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2400682.2400713"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2838734","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2838734","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T05:43:31Z","timestamp":1750225411000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2838734"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,12,8]]},"references-count":26,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2016,1,7]]}},"alternative-id":["10.1145\/2838734"],"URL":"https:\/\/doi.org\/10.1145\/2838734","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2015,12,8]]},"assertion":[{"value":"2015-05-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-12-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}