{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T07:38:55Z","timestamp":1740123535144,"version":"3.37.3"},"reference-count":28,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2019,11,11]],"date-time":"2019-11-11T00:00:00Z","timestamp":1573430400000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"},{"start":{"date-parts":[[2019,11,11]],"date-time":"2019-11-11T00:00:00Z","timestamp":1573430400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"funder":[{"DOI":"10.13039\/501100004543","name":"Chinese Scholarship Council","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100004543","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"published-print":{"date-parts":[[2020,6]]},"DOI":"10.1007\/s11227-019-03057-4","type":"journal-article","created":{"date-parts":[[2019,11,11]],"date-time":"2019-11-11T09:03:14Z","timestamp":1573462994000},"page":"4731-4746","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Vectorizing programs with IF-statements for processors with SIMD 
extensions"],"prefix":"10.1007","volume":"76","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5027-9749","authenticated-orcid":false,"given":"Huihui","family":"Sun","sequence":"first","affiliation":[]},{"given":"Sergei","family":"Gorlatch","sequence":"additional","affiliation":[]},{"given":"Rongcai","family":"Zhao","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2019,11,11]]},"reference":[{"key":"3057_CR1","doi-asserted-by":"publisher","unstructured":"Allen JR, Kennedy K, Porterfield C et\u00a0al (1983) Conversion of control dependence to data dependence. In: Proceedings of the symposium on principles of programming languages (POPL), Austin, Texas, USA, pp 177\u2013189. https:\/\/doi.org\/10.1145\/567067.567085","DOI":"10.1145\/567067.567085"},{"key":"3057_CR2","unstructured":"AMD (2012) Using the x86 Open64 compiler suite. For x86 Open64 version 4.5.2"},{"key":"3057_CR3","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1007\/978-3-540-31985-6_8","volume-title":"Compiler construction","author":"C Barton","year":"2005","unstructured":"Barton C, Tal A, Blainey B, Amaral JN (2005) Generalized index-set splitting. In: Bodik R (ed) Compiler construction. Springer, Berlin, pp 106\u2013120"},{"issue":"2","key":"3057_CR4","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1023\/A:1014230429447","volume":"30","author":"AJC Bik","year":"2002","unstructured":"Bik AJC, Girkar M, Grey PM, Tian X (2002) Automatic intra-register vectorization for the Intel\u00ae architecture. Int J Parallel Program 30(2):65\u201398. https:\/\/doi.org\/10.1023\/A:1014230429447","journal-title":"Int J Parallel Program"},{"key":"3057_CR5","doi-asserted-by":"publisher","unstructured":"Che S, Boyer M, Meng J, Tarjan D, Sheaffer JW, Lee S, Skadron K (2009) Rodinia: A benchmark suite for heterogeneous computing. In: Proceedings of the 2009 IEEE International Symposium on Workload Characterization (IISWC), Austin, TX, USA, pp 44\u201354. 
https:\/\/doi.org\/10.1109\/IISWC.2009.5306797","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"3057_CR6","volume-title":"Engineering a compiler","author":"K Cooper","year":"2011","unstructured":"Cooper K, Torczon L (2011) Engineering a compiler. Elsevier, Amsterdam"},{"key":"3057_CR7","doi-asserted-by":"publisher","unstructured":"Danalis A, Marin G, McCurdy C, Meredith JS, Roth PC, Spafford K, Tipparaju V, Vetter JS (2010) The scalable heterogeneous computing (shoc) benchmark suite. In: Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, ACM, pp 63\u201374. https:\/\/doi.org\/10.1145\/1735688.1735702","DOI":"10.1145\/1735688.1735702"},{"key":"3057_CR8","unstructured":"Free Software Foundation (2019) Using the GNU Compiler Collection (GCC). https:\/\/gcc.gnu.org\/onlinedocs\/gcc\/. Accessed 24 May 2019"},{"key":"3057_CR9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s11432-016-5588-7","volume":"59","author":"H Fu","year":"2016","unstructured":"Fu H, Liao J, Yang J et al (2016) The Sunway TaihuLight supercomputer: system and applications. Sci China Inf Sci 59:1\u201316. https:\/\/doi.org\/10.1007\/s11432-016-5588-7","journal-title":"Sci China Inf Sci"},{"issue":"4","key":"3057_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/1186736.1186737","volume":"34","author":"JL Henning","year":"2006","unstructured":"Henning JL (2006) SPEC CPU2006 benchmark descriptions. ACM SIGARCH Comput Archit News 34(4):1\u201317. https:\/\/doi.org\/10.1145\/1186736.1186737","journal-title":"ACM SIGARCH Comput Archit News"},{"key":"3057_CR11","unstructured":"Intel (2019) Intel 64 and IA-32 Architectures Optimization Reference Manual. Accessed May 2019"},{"key":"3057_CR12","unstructured":"Intel (2017) Intel C++ Compiler Developer Guide and Reference. Version 18.0"},{"key":"3057_CR13","doi-asserted-by":"publisher","unstructured":"Karrenberg R, Hack S (2011) Whole-function vectorization. 
In: Proceedings of the international symposium on code generation and optimization (CGO), Chamonix, France, pp 141\u2013150. https:\/\/doi.org\/10.1109\/CGO.2011.5764682","DOI":"10.1109\/CGO.2011.5764682"},{"issue":"5","key":"3057_CR14","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1145\/358438.349320","volume":"35","author":"Samuel Larsen","year":"2000","unstructured":"Larsen S, Amarasinghe SP (2000) Exploiting superword level parallelism with multimedia instruction sets. In: Proceedings of the Conference on Programming Language Design and Implementation (PLDI), Vancouver, BC, Canada, pp 145\u2013156. https:\/\/doi.org\/10.1145\/358438.349320","journal-title":"ACM SIGPLAN Notices"},{"key":"3057_CR15","doi-asserted-by":"publisher","unstructured":"Lattner C, Adve VS (2004) LLVM: a compilation framework for lifelong program analysis & transformation. In: Proceedings of the international symposium on code generation and optimization (CGO), San Jose, CA, USA, pp 75\u201388. https:\/\/doi.org\/10.1109\/CGO.2004.1281665","DOI":"10.1109\/CGO.2004.1281665"},{"key":"3057_CR16","doi-asserted-by":"crossref","unstructured":"Lokuciejewski P, Gedikli F, Marwedel P (2009) Accelerating WCET-driven optimizations by the invariant path paradigm: a case study of loop unswitching. In: Proceedings of the 12th international workshop on software and compilers for embedded systems, SCOPES \u201909. ACM, New York, NY, USA, pp 11\u201320. http:\/\/dl.acm.org\/citation.cfm?id=1543820.1543823","DOI":"10.1145\/1543820.1543823"},{"key":"3057_CR17","unstructured":"Moll S (2019) The Region Vectorizer (RV). https:\/\/github.com\/cdl-saarland\/rv. Accessed May 2019"},{"key":"3057_CR18","doi-asserted-by":"publisher","unstructured":"Moll S, Hack S (2018) Partial control-flow linearization. In: Proceedings of the Conference on Programming Language Design and Implementation (PLDI), New York, NY, USA. 
https:\/\/doi.org\/10.1145\/3192366.3192413","DOI":"10.1145\/3192366.3192413"},{"key":"3057_CR19","doi-asserted-by":"publisher","unstructured":"Pharr M, Mark WR (2012) ispc: a SPMD compiler for high-performance CPU programming. In: Innovative parallel computing (InPar). IEEE, pp 1\u201313. https:\/\/doi.org\/10.1109\/InPar.2012.6339601","DOI":"10.1109\/InPar.2012.6339601"},{"key":"3057_CR20","doi-asserted-by":"publisher","unstructured":"Pohl A, Cosenza B, Juurlink BHH (2018) Control flow vectorization for ARM NEON. In: Proceedings of the 21st international workshop on software and compilers for embedded systems (SCOPES), May 28\u201330, 2018, Sankt Goar, Germany, pp 66\u201375. https:\/\/doi.org\/10.1145\/3207719.3207721","DOI":"10.1145\/3207719.3207721"},{"key":"3057_CR21","doi-asserted-by":"publisher","unstructured":"Shin J, Hall MW, Chame J (2005) Superword-level parallelism in the presence of control flow. In: Proceedings of the international symposium on code generation and optimization (CGO), San Jose, CA, USA, pp 165\u2013175. https:\/\/doi.org\/10.1109\/cgo.2005.33","DOI":"10.1109\/cgo.2005.33"},{"issue":"4","key":"3057_CR22","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1016\/j.micpro.2009.02.002","volume":"33","author":"J Shin","year":"2009","unstructured":"Shin J, Hall MW, Chame J (2009) Evaluating compiler technology for control-flow optimizations for multimedia extension architectures. Microprocess Microsyst Embed Hardw Des 33(4):235\u2013243. https:\/\/doi.org\/10.1016\/j.micpro.2009.02.002","journal-title":"Microprocess Microsyst Embed Hardw Des"},{"key":"3057_CR23","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1023\/A:1007559022013","volume":"28","author":"N Sreraman","year":"2000","unstructured":"Sreraman N, Govindarajan R (2000) A vectorizing compiler for multimedia extensions. Int J Parallel Program 28:363\u2013400. 
https:\/\/doi.org\/10.1023\/A:1007559022013","journal-title":"Int J Parallel Program"},{"key":"3057_CR24","unstructured":"Sujon MH, Whaley RC, Yi Q (2013) Vectorization past dependent branches through speculation. In: Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, PACT \u201913. IEEE Press, Piscataway, NJ, USA, pp 353\u2013362. http:\/\/dl.acm.org\/citation.cfm?id=2523721.2523769"},{"key":"3057_CR25","doi-asserted-by":"publisher","unstructured":"Sun H, Fey F, Zhao J, Gorlatch S (2019) WCCV: Improving the vectorization of IF-statements with warp-coherent conditions. In: Proceedings of the 2018 International Conference on Supercomputing, ICS \u201919. ACM, New York, NY, USA, pp 319\u2013329. https:\/\/doi.org\/10.1145\/3330345.3331059","DOI":"10.1145\/3330345.3331059"},{"key":"3057_CR26","doi-asserted-by":"publisher","unstructured":"Tanaka H, Ota Y, Matsumoto N, Hieda T, Takeuchi Y, Imai M (2010) A new compilation technique for SIMD code generation across basic block boundaries. In: 2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC), pp 101\u2013106. https:\/\/doi.org\/10.1109\/ASPDAC.2010.5419911","DOI":"10.1109\/ASPDAC.2010.5419911"},{"key":"3057_CR27","volume-title":"A catalogue of optimizing transformations","author":"J Thomas","year":"1971","unstructured":"Thomas J, Allen F, Cocke J (1971) A catalogue of optimizing transformations. Prentice-Hall, Englewood Cliffs"},{"key":"3057_CR28","unstructured":"TOP500: https:\/\/www.top500.org\/lists\/2018\/11\/. 
Accessed 24 May 2019"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-019-03057-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s11227-019-03057-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-019-03057-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,10]],"date-time":"2020-11-10T00:19:20Z","timestamp":1604967560000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s11227-019-03057-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,11,11]]},"references-count":28,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2020,6]]}},"alternative-id":["3057"],"URL":"https:\/\/doi.org\/10.1007\/s11227-019-03057-4","relation":{},"ISSN":["0920-8542","1573-0484"],"issn-type":[{"type":"print","value":"0920-8542"},{"type":"electronic","value":"1573-0484"}],"subject":[],"published":{"date-parts":[[2019,11,11]]},"assertion":[{"value":"11 November 2019","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}