{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,29]],"date-time":"2025-09-29T08:11:54Z","timestamp":1759133514142,"version":"3.41.0"},"reference-count":75,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2015,12,22]],"date-time":"2015-12-22T00:00:00Z","timestamp":1450742400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/G000662\/1"],"award-info":[{"award-number":["EP\/G000662\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000288","name":"Royal Society","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000288","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Program. Lang. Syst."],"published-print":{"date-parts":[[2016,1,4]]},"abstract":"<jats:p>Current parallelizing compilers can tackle applications exercising regular access patterns on arrays or affine indices, where data dependencies can be expressed in a linear form. Unfortunately, there are cases that independence between statements of code cannot be guaranteed and thus the compiler conservatively produces sequential code. Programs that involve extensive pointer use, irregular access patterns, and loops with unknown number of iterations are examples of such cases. This limits the extraction of parallelism in cases where dependencies are rarely or never triggered at runtime. Speculative parallelism refers to methods employed during program execution that aim to produce a valid parallel execution schedule for programs immune to static parallelization. 
The motivation for this article is to review recent developments in the area of compiler-driven software speculation for thread-level parallelism and how they came about. The article is divided into two parts. In the first part the fundamentals of speculative parallelization for thread-level parallelism are explained along with a design choice categorization for implementing such systems. Design choices include the ways speculative data is handled, how data dependence violations are detected and resolved, how the correct data are made visible to other threads, or how speculative threads are scheduled. The second part is structured around those design choices providing the advances and trends in the literature with reference to key developments in the area. Although the focus of the article is in software speculative parallelization, a section is dedicated for providing the interested reader with pointers and references for exploring similar topics such as hardware thread-level speculation, transactional memory, and automatic parallelization.<\/jats:p>","DOI":"10.1145\/2821505","type":"journal-article","created":{"date-parts":[[2015,12,23]],"date-time":"2015-12-23T15:19:49Z","timestamp":1450883989000},"page":"1-45","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Compiler-Driven Software Speculation for Thread-Level Parallelism"],"prefix":"10.1145","volume":"38","author":[{"given":"Paraskevas","family":"Yiapanis","sequence":"first","affiliation":[{"name":"University of Manchester, Oxford Road"}]},{"given":"Gavin","family":"Brown","sequence":"additional","affiliation":[{"name":"University of Manchester, Oxford Road"}]},{"given":"Mikel","family":"Luj\u00e1n","sequence":"additional","affiliation":[{"name":"University of Manchester, Oxford 
Road"}]}],"member":"320","published-online":{"date-parts":[[2015,12,22]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/279358.279391"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/197405.197406"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/231379.231394"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2007.35"},{"key":"e_1_2_1_6_1","volume-title":"Workshop on Feedback-Directed and Dynamic Optimization (FDDO).","author":"Bruening Derek","year":"2000","unstructured":"Derek Bruening, Srikrishna Devabhaktuni, and Saman Amarasinghe. 2000. Softspec: Software-based speculative parallelism. In Workshop on Feedback-Directed and Dynamic Optimization (FDDO)."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2006.13"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/602770.602857"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859668"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2005.69"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/781498.781501"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.363382"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the International Conference on Parallel Processing (ICPP), 836--844","author":"Cytron Ron","year":"1986","unstructured":"Ron Cytron. 1986. DOACROSS: Beyond vectorization for multiprocessors. 
In Proceedings of the International Conference on Parallel Processing (ICPP), 836--844."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/876874.878706"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/1251254.1251264"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250734.1250760"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1089008.1089010"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/822079.822729"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.5555\/509058.509070"},{"key":"e_1_2_1_20_1","volume-title":"http:\/\/hadoop.apache.org\/. (2005). Accessed","author":"Hadoop Apache","year":"2015","unstructured":"Apache Hadoop. 2005. Apache Hadoop. http:\/\/hadoop.apache.org\/. (2005). Accessed February 2, 2015."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/291069.291020"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/1855056"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1133981.1133984"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/165123.165164"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2010.5434077"},{"key":"e_1_2_1_26_1","first-page":"1","article-title":"The role of return value prediction in exploiting speculative method-level parallelism","volume":"5","author":"Hu Shiwen","year":"2003","unstructured":"Shiwen Hu, Ravi Bhargava, and Lizy Kurian John. 2003. The role of return value prediction in exploiting speculative method-level parallelism. 
Journal of Instruction-Level Parallelism 5 (2003), 1--21.","journal-title":"Journal of Instruction-Level Parallelism"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2010.5649169"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2254064.2254107"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/996841.996851"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/502981"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.19"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1122971.1122997"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/2401945.2401998"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1038\/455028a"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.5555\/645563.660312"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.5555\/874076.876478"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","unstructured":"Jan Kasper Martinsen and Hakan Grahn. 2011. A methodology for evaluating JavaScript execution behavior in interactive web applications. In Computer Systems and Applications (AICCSA). 241--248. 10.1109\/AICCSA.2011.6126611","DOI":"10.1109\/AICCSA.2011.6126611"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/MIC.2012.146"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1542476.1542495"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.5555\/2371235"},{"volume-title":"Proceedings of 11th Static Analysis Symposium (SAS), 165--180","author":"Nystrom Erik M.","key":"e_1_2_1_41_1","unstructured":"Erik M. Nystrom, Hong-Seok Kim, and Wen-Mei W. Hwu. 2004. Bottom-up and top-down context-sensitive summary-based pointer analysis. 
In Proceedings of 11th Static Analysis Symposium (SAS), 165--180."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1583991.1584050"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2005.13"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2005.13"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-69330-7_21"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065944.1065964"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/1736020.1736030"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1356058.1356074"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/1400112.1400113"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.5555\/1025127.1026007"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.5555\/1863166.1863169"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8191(98)00024-6"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/181181.181254"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/207110.207148"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/1088149.1088178"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/1088149.1088173"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/1806596.1806598"},{"key":"e_1_2_1_60_1","first-page":"1","article-title":"An all-software thread-level data dependence speculation system for multiprocessors","volume":"3","author":"Rundberg Peter","year":"2001","unstructured":"Peter Rundberg and Per Stenstr\u00f6m. 2001. An all-software thread-level data dependence speculation system for multiprocessors. 
Journal of Instruction-Level Parallelism 3 (2001), 1--28.","journal-title":"Journal of Instruction-Level Parallelism"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/1122971.1123001"},{"key":"e_1_2_1_62_1","volume-title":"Salz and Ravi Mirchandaney","author":"Joel","year":"1991","unstructured":"Joel H. Salz and Ravi Mirchandaney. 1991. The preprocessed doacross loop. In Proceedings of ICPP, 174--178."},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1145\/318789.318794"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.88484"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/1810479.1810530"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1007\/11864219_13"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/1082469.1082471"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339650"},{"key":"e_1_2_1_70_1","volume-title":"Proceedings of the International Conference of Parallel Processing, 528--535","author":"Tang Peiyi","year":"1986","unstructured":"Peiyi Tang and Pen-Chung Yew. 1986. Processor self-scheduling for multiple nested parallel loops. In Proceedings of the International Conference of Parallel Processing, 528--535."},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2007.7"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/1806596.1806604"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2008.4771802"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.5555\/1299042.1299110"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370836"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/2400682.2400698"},{"key":"e_1_2_1_78_1","volume-title":"Proceedings of the International Symposium on Cluster, Cloud, and Grid Computing (CCGRID). 
120--127","author":"Zhang Chenggang","year":"2013","unstructured":"Chenggang Zhang, Guodong Han, and Cho-Li Wang. 2013. GPU-TLS: An efficient runtime for speculative loop parallelization on GPUs. In Proceedings of the International Symposium on Cluster, Cloud, and Grid Computing (CCGRID). 120--127."},{"volume-title":"Proceedings of the International Conference on High-Performance Computer Architecture (HPCA), 290--301","author":"Zhong Hongtao","key":"e_1_2_1_79_1","unstructured":"Hongtao Zhong, Mojtaba Mehrara, Steven A. Lieberman, and Scott A. Mahlke. 2008. Uncovering hidden loop level parallelism in sequential applications. In Proceedings of the International Conference on High-Performance Computer Architecture (HPCA), 290--301."},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.1987.233477"}],"container-title":["ACM Transactions on Programming Languages and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2821505","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2821505","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T05:48:30Z","timestamp":1750225710000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2821505"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,12,22]]},"references-count":75,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2016,1,4]]}},"alternative-id":["10.1145\/2821505"],"URL":"https:\/\/doi.org\/10.1145\/2821505","relation":{},"ISSN":["0164-0925","1558-4593"],"issn-type":[{"type":"print","value":"0164-0925"},{"type":"electronic","value":"1558-4593"}],"subject":[],"published":{"date-parts":[[2015,12,22]]},"assertion":[{"value":"2014-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_his
tory","label":"Publication History"}},{"value":"2015-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-12-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}