{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T21:53:51Z","timestamp":1775598831764,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":33,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,6,9]],"date-time":"2022-06-09T00:00:00Z","timestamp":1654732800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,6,9]]},"DOI":"10.1145\/3519939.3523701","type":"proceedings-article","created":{"date-parts":[[2022,6,2]],"date-time":"2022-06-02T21:05:05Z","timestamp":1654203905000},"page":"301-315","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":17,"title":["All you need is superword-level parallelism: systematic control-flow vectorization with SLP"],"prefix":"10.1145","author":[{"given":"Yishen","family":"Chen","sequence":"first","affiliation":[{"name":"Massachusetts Institute of Technology, USA"}]},{"given":"Charith","family":"Mendis","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, USA"}]},{"given":"Saman","family":"Amarasinghe","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, USA"}]}],"member":"320","published-online":{"date-parts":[[2022,6,9]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"2022. Auto-Vectorization in GCC. https:\/\/gcc.gnu.org\/projects\/tree-ssa\/vectorization.html"},{"key":"e_1_3_2_1_2_1","unstructured":"2022. Auto-Vectorization in LLVM. https:\/\/llvm.org\/docs\/Vectorizers.html"},{"key":"e_1_3_2_1_3_1","unstructured":"2022. 
llvm::TargetTransformInfo Class Reference. https:\/\/llvm.org\/doxygen\/classllvm_1_1TargetTransformInfo.html"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Randy Allen and Ken Kennedy. 1987. Automatic Translation of FORTRAN Programs to Vector Form. ACM Transactions on Programming Languages and Systems.","DOI":"10.1145\/29873.29875"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/567067.567085"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Sara S. Baghsorkhi, Nalini Vasudevan, and Youfeng Wu. 2016. FlexVec: Auto-vectorization for Irregular Loops. In Programming Language Design and Implementation.","DOI":"10.1145\/2908080.2908111"},{"key":"e_1_3_2_1_7_1","volume-title":"International Workshop on Languages and Compilers for Parallel Computing.","author":"Blainey Bob","year":"2002","unstructured":"Bob Blainey, Christopher Barton, and Jos\u00e9 Nelson Amaral. 2002. Removing impediments to loop fusion through code transformations. In International Workshop on Languages and Compilers for Parallel Computing."},{"key":"e_1_3_2_1_8_1","volume-title":"Vectorizing Compilers: A Test Suite and Results. In ACM\/IEEE Conference on Supercomputing.","author":"Callahan David","year":"1988","unstructured":"David Callahan, Jack J Dongarra, and David Levine. 1988. 
Vectorizing Compilers: A Test Suite and Results. In ACM\/IEEE Conference on Supercomputing."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. 1991. Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. ACM Transactions on Programming Languages and Systems.","DOI":"10.1145\/115372.115320"},{"key":"e_1_3_2_1_10_1","volume-title":"Armin Gr\u00f6\u00df linger, and Christian Lengauer","author":"Grosser Tobias","year":"2012","unstructured":"Tobias Grosser, Armin Gr\u00f6\u00dflinger, and Christian Lengauer. 2012. Polly \u2013 Performing polyhedral optimizations on a low-level intermediate representation. Parallel Processing Letters."},{"key":"e_1_3_2_1_11_1","unstructured":"Khronos Group. 2009. OpenCL 1.0 Specification. http:\/\/khronos.org\/registry\/cl\/specs\/opencl-1.0.pdf"},{"key":"e_1_3_2_1_12_1","volume-title":"Whole Function Vectorization. In International Symposium on Code Generation and Optimization.","author":"Karrenberg Ralf","year":"2011","unstructured":"Ralf Karrenberg and Sebastian Hack. 2011. 
Whole Function Vectorization. In International Symposium on Code Generation and Optimization."},{"key":"e_1_3_2_1_13_1","volume-title":"International Workshop on Languages and Compilers for Parallel Computing. 301\u2013320","author":"Kennedy Ken","year":"1993","unstructured":"Ken Kennedy and Kathryn S McKinley. 1993. Maximizing loop parallelism and improving data locality via loop fusion and distribution. In International Workshop on Languages and Compilers for Parallel Computing. 301\u2013320."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Samuel Larsen and Saman Amarasinghe. 2000. Exploiting Superword Level Parallelism with Multimedia Instruction Sets. In Programming Language Design and Implementation.","DOI":"10.1145\/349299.349320"},{"key":"e_1_3_2_1_15_1","volume-title":"LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization.","author":"Lattner Chris","year":"2004","unstructured":"Chris Lattner and Vikram Adve. 2004. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. 
In International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"crossref","unstructured":"Jun Liu, Yuanrui Zhang, Ohyoung Jang, Wei Ding, and Mahmut Kandemir. 2012. A Compiler Framework for Extracting Superword Level Parallelism. In Programming Language Design and Implementation.","DOI":"10.1145\/2254064.2254106"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3276480"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Simon Moll and Sebastian Hack. 2018. Partial Control-Flow Linearization. In Programming Language Design and Implementation.","DOI":"10.1145\/3192366.3192413"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"crossref","unstructured":"Dorit Nuzman, Ira Rosen, and Ayal Zaks. 2006. Auto-vectorization of Interleaved Data for SIMD. In Programming Language Design and Implementation.","DOI":"10.1145\/1133981.1133997"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1454115.1454119"},{"key":"e_1_3_2_1_21_1","volume-title":"MacCabe","author":"Ottenstein Karl J.","year":"1990","unstructured":"Karl J. Ottenstein, Robert A. Ballance, and Arthur B. MacCabe. 1990. 
The Program Dependence Web: A Representation Supporting Control-, Data-, and Demand-Driven Interpretation of Imperative Languages. In Programming Language Design and Implementation."},{"key":"e_1_3_2_1_22_1","unstructured":"Joseph CH Park and Mike Schlansker. 1991. On predicated execution."},{"key":"e_1_3_2_1_23_1","volume-title":"Mark","author":"Pharr Matt","year":"2012","unstructured":"Matt Pharr and William R. Mark. 2012. ispc: A SPMD Compiler for High-Performance CPU Programming. In Innovative Parallel Computing."},{"key":"e_1_3_2_1_24_1","volume-title":"Conference on Parallel Architecture and Compilation.","author":"Porpodas Vasileios","unstructured":"Vasileios Porpodas and Timothy M. Jones. 2015. Throttling Automatic Vectorization: When Less is More. In Conference on Parallel Architecture and Compilation."},{"key":"e_1_3_2_1_25_1","volume-title":"PSLP: Padded SLP Automatic Vectorization. In International Symposium on Code Generation and Optimization.","author":"Porpodas Vasileios","unstructured":"Vasileios Porpodas, Alberto Magni, and Timothy M. Jones. 2015. PSLP: Padded SLP Automatic Vectorization. In International Symposium on Code Generation and Optimization."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243176.3243189"},{"key":"e_1_3_2_1_27_1","volume-title":"Super-Node SLP: Optimized Vectorization for Code Sequences Containing Operators and Their Inverse Elements. 
In International Symposium on Code Generation and Optimization.","author":"Porpodas Vasileios","year":"2019","unstructured":"Vasileios Porpodas, Rodrigo C. O. Rocha, Evgueni Brevnov, Lu\u00eds F. W. G\u00f3es, and Timothy Mattson. 2019. Super-Node SLP: Optimized Vectorization for Code Sequences Containing Operators and Their Inverse Elements. In International Symposium on Code Generation and Optimization."},{"key":"e_1_3_2_1_28_1","unstructured":"Louis-No\u00ebl Pouchet. 2021. PolyBench\/C: the polyhedral benchmark suite. https:\/\/web.cse.ohio-state.edu\/~pouchet.2\/software\/polybench\/"},{"key":"e_1_3_2_1_29_1","volume-title":"Vectorization-Aware Loop Unrolling with Seed Forwarding. In International Conference on Compiler Construction.","author":"Rocha Rodrigo C. O.","year":"2020","unstructured":"Rodrigo C. O. Rocha, Vasileios Porpodas, Pavlos Petoumenos, Lu\u00eds F. W. G\u00f3es, Zheng Wang, Murray Cole, and Hugh Leather. 2020. Vectorization-Aware Loop Unrolling with Seed Forwarding. In International Conference on Compiler Construction."},{"key":"e_1_3_2_1_30_1","unstructured":"Ira Rosen, Dorit Nuzman, and Ayal Zaks. 2007. Loop-aware SLP in GCC. 
In GCC Developers Summit."},{"key":"e_1_3_2_1_31_1","volume-title":"Superword-Level Parallelism in the Presence of Control Flow. In International Symposium on Code Generation and Optimization.","author":"Shin Jaewook","year":"2005","unstructured":"Jaewook Shin, Mary Hall, and Jacqueline Chame. 2005. Superword-Level Parallelism in the Presence of Control Flow. In International Symposium on Code Generation and Optimization."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"crossref","unstructured":"Jean-Baptiste Tristan, Paul Govereau, and Greg Morrisett. 2011. Evaluating Value-Graph Translation Validation for LLVM. In Programming Language Design and Implementation.","DOI":"10.1145\/1993498.1993533"},{"key":"e_1_3_2_1_33_1","unstructured":"Peng Tu and David Padua. 1995. Efficient Building and Placing of Gating Functions. 
In Programming Language Design and Implementation."}],"event":{"name":"PLDI '22: 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation","location":"San Diego CA USA","acronym":"PLDI '22","sponsor":["SIGPLAN ACM Special Interest Group on Programming Languages"]},"container-title":["Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3519939.3523701","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3519939.3523701","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:10:30Z","timestamp":1750183830000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3519939.3523701"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,9]]},"references-count":33,"alternative-id":["10.1145\/3519939.3523701","10.1145\/3519939"],"URL":"https:\/\/doi.org\/10.1145\/3519939.3523701","relation":{},"subject":[],"published":{"date-parts":[[2022,6,9]]},"assertion":[{"value":"2022-06-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}