{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T08:53:08Z","timestamp":1770281588477,"version":"3.49.0"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2021,6,28]],"date-time":"2021-06-28T00:00:00Z","timestamp":1624838400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001807","name":"FAPESP","doi-asserted-by":"crossref","award":["2013\/08293-7, 2016\/15337-9, 2019\/04536-9"],"award-info":[{"award-number":["2013\/08293-7, 2016\/15337-9, 2019\/04536-9"]}],"id":[{"id":"10.13039\/501100001807","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2021,9,30]]},"abstract":"<jats:p>\n            Well-crafted libraries deliver much higher performance than code generated by sophisticated application programmers using advanced optimizing compilers. When a code pattern for which a well-tuned library implementation exists is found in the source code of an application, the highest performing solution is to replace the pattern with a call to the library. Idiom-recognition solutions in the past either required pattern matching machinery that was outside of the compilation framework or provided a very brittle solution that would fail even for minor variants in the pattern source code. This article introduces Kernel Find &amp; Replacer (\n            <jats:sc>KernelFaRer<\/jats:sc>\n            ), an idiom recognizer implemented entirely in the existing LLVM compiler framework. The versatility of\n            <jats:sc>KernelFaRer<\/jats:sc>\n            is demonstrated by matching and replacing two linear algebra idioms, general matrix-matrix multiplication (GEMM), and symmetric rank-2k update (SYR2K). Both GEMM and SYR2K are used extensively in scientific computation, and GEMM is also a central building block for deep learning and computer graphics algorithms. The idiom recognition in\n            <jats:sc>KernelFaRer<\/jats:sc>\n            is much more robust than alternative solutions, has a much lower compilation overhead, and is fully integrated in the broadly used LLVM compilation tools.\n            <jats:sc>KernelFaRer<\/jats:sc>\n            replaces existing GEMM and SYR2K idioms with computations performed by BLAS, Eigen, MKL (Intel\u2019s x86), ESSL (IBM\u2019s PowerPC), and BLIS (AMD). Gains in performance that reach 2000\u00d7 over hand-crafted source code compiled at the highest optimization level demonstrate that replacing application code with library call is a performant solution.\n          <\/jats:p>","DOI":"10.1145\/3459010","type":"journal-article","created":{"date-parts":[[2021,6,28]],"date-time":"2021-06-28T16:13:38Z","timestamp":1624896818000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":19,"title":["KernelFaRer"],"prefix":"10.1145","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3476-184X","authenticated-orcid":false,"given":"Jo\u00e3o P. L.","family":"De Carvalho","sequence":"first","affiliation":[{"name":"University of Campinas (UNICAMP), Brazil"}]},{"given":"Braedy","family":"Kuzma","sequence":"additional","affiliation":[{"name":"University of Alberta, Canada"}]},{"given":"Ivan","family":"Korostelev","sequence":"additional","affiliation":[{"name":"University of Alberta, Canada"}]},{"given":"Jos\u00e9 Nelson","family":"Amaral","sequence":"additional","affiliation":[{"name":"University of Alberta, Canada"}]},{"given":"Christopher","family":"Barton","sequence":"additional","affiliation":[{"name":"IBM Corporation, Canada"}]},{"given":"Jos\u00e9","family":"Moreira","sequence":"additional","affiliation":[{"name":"IBM Corporation, USA"}]},{"given":"Guido","family":"Araujo","sequence":"additional","affiliation":[{"name":"University of Campinas (UNICAMP), Brazil"}]}],"member":"320","published-online":{"date-parts":[[2021,6,28]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1356052.1356053"},{"key":"e_1_2_1_2_1","first-page":"4","article-title":"Programming with idioms in APL","volume":"9","author":"Rugaber Alan","year":"1979","unstructured":"Perlis, Alan J. and Rugaber , Spencer. 1979 . Programming with idioms in APL . SIGAPL APL Quote Quad 9 , 4 - P1 (May 1979), 232\u2013235. DOI:http:\/\/dx.doi.org\/10.1145\/390009.804466 10.1145\/390009.804466 Perlis, Alan J. and Rugaber, Spencer. 1979. Programming with idioms in APL. SIGAPL APL Quote Quad 9, 4-P1 (May 1979), 232\u2013235. DOI:http:\/\/dx.doi.org\/10.1145\/390009.804466","journal-title":"SIGAPL APL Quote Quad"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the ACM SIGPLAN International Conference on Programming Language Design and Implementation","author":"Bondhugula Uday","unstructured":"Uday Bondhugula , Albert Hartono , J. Ramanujam , and P. Sadayappan . 2008. A practical automatic polyhedral parallelizer and locality optimizer . In Proceedings of the ACM SIGPLAN International Conference on Programming Language Design and Implementation . Tucson, AZ, USA, 101\u2013113. Uday Bondhugula, Albert Hartono, J. Ramanujam, and P. Sadayappan. 2008. A practical automatic polyhedral parallelizer and locality optimizer. In Proceedings of the ACM SIGPLAN International Conference on Programming Language Design and Implementation. Tucson, AZ, USA, 101\u2013113."},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the 23rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS \u201918)","author":"Remmelg Philip","unstructured":"Ginsbach, Philip and Remmelg , Toomas and Steuwer , Michel and Bodin , Bruno and Dubach , Christophe and O\u2019Boyle , Michael F. P.2018. Automatic matching of legacy code to heterogeneous APIs: An idiomatic approach . In Proceedings of the 23rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS \u201918) . Association for Computing Machinery, New York, NY, 139\u2013153. DOI:http:\/\/dx.doi.org\/10.1145\/3173162.3173182 10.1145\/3173162.3173182 Ginsbach, Philip and Remmelg, Toomas and Steuwer, Michel and Bodin, Bruno and Dubach, Christophe and O\u2019Boyle, Michael F. P.2018. Automatic matching of legacy code to heterogeneous APIs: An idiomatic approach. In Proceedings of the 23rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS \u201918). Association for Computing Machinery, New York, NY, 139\u2013153. DOI:http:\/\/dx.doi.org\/10.1145\/3173162.3173182"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/567806.567807"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPADS.2012.97"},{"key":"e_1_2_1_8_1","volume-title":"Retrieved","author":"Developer S\u2013NVIDIA","year":"2020","unstructured":"cuBLA S\u2013NVIDIA Developer . Retrieved January 2020 from https:\/\/developer.nvidia.com\/cublas. cuBLAS\u2013NVIDIA Developer. Retrieved January 2020 from https:\/\/developer.nvidia.com\/cublas."},{"key":"e_1_2_1_9_1","article-title":"Declarative loop tactics for domain-specific optimization","volume":"16","author":"Zinenko Lorenzo","year":"2019","unstructured":"Chelini, Lorenzo and Zinenko , Oleksandr and Grosser , Tobias and Corp oraal, Henk. 2019 . Declarative loop tactics for domain-specific optimization . ACM Trans. Archit. Code Optim. 16 , 4, Article 55 (Dec. 2019), 25 pages. DOI:http:\/\/dx.doi.org\/10.1145\/3372266 10.1145\/3372266 Chelini, Lorenzo and Zinenko, Oleksandr and Grosser, Tobias and Corporaal, Henk. 2019. Declarative loop tactics for domain-specific optimization. ACM Trans. Archit. Code Optim. 16, 4, Article 55 (Dec. 2019), 25 pages. DOI:http:\/\/dx.doi.org\/10.1145\/3372266","journal-title":"ACM Trans. Archit. Code Optim."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/977395.977673"},{"key":"e_1_2_1_11_1","volume-title":"Languages and Compilers for Parallel Computing","unstructured":"Callahan, David. 1992. Recognizing and parallelizing bounded recurrences . In Languages and Compilers for Parallel Computing . Springer , Berlin , 169\u2013185. Callahan, David. 1992. Recognizing and parallelizing bounded recurrences. In Languages and Compilers for Parallel Computing. Springer, Berlin, 169\u2013185."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/177492.177494"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 13th International Conference on Supercomputing (ICS \u201999)","author":"Pingali Vijay","year":"1999","unstructured":"Menon, Vijay and Pingali , Keshav. 1999 . High-level semantic optimization of numerical codes . In Proceedings of the 13th International Conference on Supercomputing (ICS \u201999) . Association for Computing Machinery, New York, NY, 434\u2013443. DOI:http:\/\/dx.doi.org\/10.1145\/305138.305230 10.1145\/305138.305230 Menon, Vijay and Pingali, Keshav. 1999. High-level semantic optimization of numerical codes. In Proceedings of the 13th International Conference on Supercomputing (ICS \u201999). Association for Computing Machinery, New York, NY, 434\u2013443. DOI:http:\/\/dx.doi.org\/10.1145\/305138.305230"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3375555.3383586"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the","author":"Iverson Kenneth E.","year":"1962","unstructured":"Kenneth E. Iverson . 1962 . A programming language . In Proceedings of the May 1-3, 1962, Spring Joint Computer Conference. ACM, 345\u2013351. Kenneth E. Iverson. 1962. A programming language. In Proceedings of the May 1-3, 1962, Spring Joint Computer Conference. ACM, 345\u2013351."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00264357"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/570406.570418"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies. IEEE, 272\u2013279","year":"2009","unstructured":"Hiroyuki, Sato. 2009 . Idiom recognition and program scheme recognition based program transformations for performance tuning\u2013beyond compiler optimizations . In Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies. IEEE, 272\u2013279 . Hiroyuki, Sato. 2009. Idiom recognition and program scheme recognition based program transformations for performance tuning\u2013beyond compiler optimizations. In Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies. IEEE, 272\u2013279."},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the IEEE International Parallel & Distributed Processing Symposium. IEEE, 118\u2013127","author":"He Jiahua","unstructured":"Jiahua He , Allan E. Snavely , Rob F. Van der Wijngaart, and Michael A. Frumkin. 2011. Automatic recognition of performance idioms in scientific applications . In Proceedings of the IEEE International Parallel & Distributed Processing Symposium. IEEE, 118\u2013127 . Jiahua He, Allan E. Snavely, Rob F. Van der Wijngaart, and Michael A. Frumkin. 2011. Automatic recognition of performance idioms in scientific applications. In Proceedings of the IEEE International Parallel & Distributed Processing Symposium. IEEE, 118\u2013127."},{"key":"e_1_2_1_20_1","article-title":"Idiom recognition framework using topological embedding","volume":"10","author":"Komatsu Motohiro","year":"2013","unstructured":"Kawahito, Motohiro and Komatsu , Hideaki and Moriyama , Takao and Inoue , Hiroshi and Nakatani , Toshio. 2013 . Idiom recognition framework using topological embedding . ACM Trans. Archit. Code Optim. 10 , 3, Article 13 (Sept. 2013), 34 pages. DOI:http:\/\/dx.doi.org\/10.1145\/2512431 10.1145\/2512431 Kawahito, Motohiro and Komatsu, Hideaki and Moriyama, Takao and Inoue, Hiroshi and Nakatani, Toshio. 2013. Idiom recognition framework using topological embedding. ACM Trans. Archit. Code Optim. 10, 3, Article 13 (Sept. 2013), 34 pages. DOI:http:\/\/dx.doi.org\/10.1145\/2512431","journal-title":"ACM Trans. Archit. Code Optim."},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 22nd Annual International Computer Software and Applications Conference (Compsac\u201998)","author":"Palsberg Jens","unstructured":"Jens Palsberg and C. Barry Jay . 1998. The essence of the visitor pattern . In Proceedings of the 22nd Annual International Computer Software and Applications Conference (Compsac\u201998) . IEEE, 9\u201315. Jens Palsberg and C. Barry Jay. 1998. The essence of the visitor pattern. In Proceedings of the 22nd Annual International Computer Software and Applications Conference (Compsac\u201998). IEEE, 9\u201315."},{"key":"e_1_2_1_22_1","volume-title":"Ullman","author":"Aho Alfred V.","year":"1986","unstructured":"Alfred V. Aho , Ravi Sethi , and Jeffrey D . Ullman . 1986 . Compilers, Principles, Techniques ( 2 nd Ed.). Addison wesley. Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. 1986. Compilers, Principles, Techniques (2nd Ed.). Addison wesley.","edition":"2"},{"key":"e_1_2_1_23_1","volume-title":"Johnson","author":"Horn Roger A.","year":"2012","unstructured":"Roger A. Horn and Charles R . Johnson . 2012 . Matrix Analysis. Cambridge University Press . Roger A. Horn and Charles R. Johnson. 2012. Matrix Analysis. Cambridge University Press."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 10th International Workshop on Frontiers in Handwriting Recognition. Universit\u00e9 de Rennes.","author":"Chellapilla Kumar","year":"2006","unstructured":"Kumar Chellapilla , Sidd Puri , and Patrice Simard . 2006 . High performance convolutional neural networks for document processing . In Proceedings of the 10th International Workshop on Frontiers in Handwriting Recognition. Universit\u00e9 de Rennes. Kumar Chellapilla, Sidd Puri, and Patrice Simard. 2006. High performance convolutional neural networks for document processing. In Proceedings of the 10th International Workshop on Frontiers in Handwriting Recognition. Universit\u00e9 de Rennes."},{"key":"e_1_2_1_25_1","unstructured":"IBM. 2020. ESSL Guide and Reference (Version 5 Release 5).  IBM. 2020. ESSL Guide and Reference (Version 5 Release 5)."},{"key":"e_1_2_1_26_1","series-title":"Revision 26","volume-title":"Intel Math Kernel Library: Developer Reference Manual","unstructured":"Intel. 2020. Intel Math Kernel Library: Developer Reference Manual ( Revision 26 ) . Intel. 2020. Intel Math Kernel Library: Developer Reference Manual (Revision 26)."},{"key":"e_1_2_1_27_1","volume-title":"An Optimized BLAS Library. Retrieved","year":"2020","unstructured":"OpenBLAS : An Optimized BLAS Library. Retrieved January 2020 from https:\/\/www.openblas.net\/. OpenBLAS: An Optimized BLAS Library. Retrieved January 2020 from https:\/\/www.openblas.net\/."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2764454"},{"key":"e_1_2_1_29_1","volume-title":"et\u00a0al","author":"Guennebaud Ga\u00ebl","year":"2010","unstructured":"Ga\u00ebl Guennebaud , Beno\u00eet Jacob , et\u00a0al . 2010 . Eigen v3. Retrieved from http:\/\/eigen.tuxfamily.org. Ga\u00ebl Guennebaud, Beno\u00eet Jacob, et\u00a0al. 2010. Eigen v3. Retrieved from http:\/\/eigen.tuxfamily.org."},{"key":"e_1_2_1_30_1","volume-title":"A C language family frontend for LLVM. Retrieved","author":"Clang","year":"2020","unstructured":"Clang : A C language family frontend for LLVM. Retrieved January 2020 from https:\/\/clang.llvm.org. Clang: A C language family frontend for LLVM. Retrieved January 2020 from https:\/\/clang.llvm.org."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the International Symposium on Code Generation and Optimization (CGO\u201907)","author":"Birkbeck N.","unstructured":"N. Birkbeck , J. Levesque , and J. N. Amaral . 2007. A dimension abstraction approach to vectorization in Matlab . In Proceedings of the International Symposium on Code Generation and Optimization (CGO\u201907) . 115\u2013130. N. Birkbeck, J. Levesque, and J. N. Amaral. 2007. A dimension abstraction approach to vectorization in Matlab. In Proceedings of the International Symposium on Code Generation and Optimization (CGO\u201907). 115\u2013130."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1006\/jagm.1996.0818"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3168812"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183895.3183900"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 29th International Conference on Compiler Construction (CC\u201920)","author":"Ginsbach Philip","unstructured":"Philip Ginsbach , Bruce Collie , and Michael F. P . O\u2019Boyle. 2020. Automatically harnessing sparse acceleration . In Proceedings of the 29th International Conference on Compiler Construction (CC\u201920) . Association for Computing Machinery, New York, NY, 179\u2013190. DOI:http:\/\/dx.doi.org\/10.1145\/3377555.3377893 10.1145\/3377555.3377893 Philip Ginsbach, Bruce Collie, and Michael F. P. O\u2019Boyle. 2020. Automatically harnessing sparse acceleration. In Proceedings of the 29th International Conference on Compiler Construction (CC\u201920). Association for Computing Machinery, New York, NY, 179\u2013190. DOI:http:\/\/dx.doi.org\/10.1145\/3377555.3377893"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the International Workshop on Polyhedral Compilation Techniques.","author":"Guelton Sven","year":"2014","unstructured":"Verdoolaege, Sven and Guelton , Serge and Grosser , Tobias and Cohen , Albert. 2014 . Schedule trees . In Proceedings of the International Workshop on Polyhedral Compilation Techniques. Verdoolaege, Sven and Guelton, Serge and Grosser, Tobias and Cohen, Albert. 2014. Schedule trees. In Proceedings of the International Workshop on Polyhedral Compilation Techniques."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/7902.7904"},{"key":"e_1_2_1_38_1","unstructured":"Louis-No\u00ebl Pouchet and Tomofumi Yuki. 2019. PolyBench\/C 4.2.1: The Polyhedral Benchmark Suite. Retrieved from http:\/\/polybench.sf.net.  Louis-No\u00ebl Pouchet and Tomofumi Yuki. 2019. PolyBench\/C 4.2.1: The Polyhedral Benchmark Suite. Retrieved from http:\/\/polybench.sf.net."},{"key":"e_1_2_1_39_1","unstructured":"IBM. 2018. Power9 Processor User\u2019s Manual (Version 2.0).  IBM. 2018. Power9 Processor User\u2019s Manual (Version 2.0)."},{"key":"e_1_2_1_40_1","volume-title":"Software Optimization Guide for AMD Family 17th Models 30h and Greater Processors (Revision 3.01)","author":"AMD.","unstructured":"AMD. 2020. Software Optimization Guide for AMD Family 17th Models 30h and Greater Processors (Revision 3.01) . AMD. 2020. Software Optimization Guide for AMD Family 17th Models 30h and Greater Processors (Revision 3.01)."},{"key":"e_1_2_1_41_1","volume-title":"Instruction Tables: Lists of Instruction Latencies, Throughputs and Micro-operation Breakdowns for Intel, AMD and VIA CPUs","author":"Fog Agner","year":"2019","unstructured":"Agner Fog . 2019 . Instruction Tables: Lists of Instruction Latencies, Throughputs and Micro-operation Breakdowns for Intel, AMD and VIA CPUs . Technical University of Denmark ( 08 2019), 383. Retrieved March 2020 from https:\/\/www.agner.org\/optimize\/instruction_tables.pdf. Agner Fog. 2019. Instruction Tables: Lists of Instruction Latencies, Throughputs and Micro-operation Breakdowns for Intel, AMD and VIA CPUs. Technical University of Denmark (08 2019), 383. Retrieved March 2020 from https:\/\/www.agner.org\/optimize\/instruction_tables.pdf."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2491894.2464160"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium. IEEE, 1049\u20131059","author":"Smith Tyler M.","unstructured":"Tyler M. Smith , Robert Van De Geijn , Mikhail Smelyanskiy , Jeff R. Hammond , and Field G . Van Zee. 2014. Anatomy of high-performance many-threaded matrix multiplication . In Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium. IEEE, 1049\u20131059 . Tyler M. Smith, Robert Van De Geijn, Mikhail Smelyanskiy, Jeff R. Hammond, and Field G. Van Zee. 2014. Anatomy of high-performance many-threaded matrix multiplication. In Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium. IEEE, 1049\u20131059."},{"key":"e_1_2_1_44_1","volume-title":"Van de Geijn","author":"Huang Jianyu","year":"2016","unstructured":"Jianyu Huang and Robert A . Van de Geijn . 2016 . BLISlab: A sandbox for optimizing GEMM. arXiv:1609.00076. Retrieved from https:\/\/arxiv.org\/abs\/1609.00076. Jianyu Huang and Robert A. Van de Geijn. 2016. BLISlab: A sandbox for optimizing GEMM. arXiv:1609.00076. Retrieved from https:\/\/arxiv.org\/abs\/1609.00076."},{"key":"e_1_2_1_45_1","volume-title":"Retrieved","author":"The Science of High-Performance Computing Group","year":"2020","unstructured":"The Science of High-Performance Computing Group . Retrieved March 2020 https:\/\/shpc.oden.utexas.edu\/. The Science of High-Performance Computing Group. Retrieved March 2020 https:\/\/shpc.oden.utexas.edu\/."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/1186736.1186737"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3185768.3185771"},{"key":"e_1_2_1_49_1","volume-title":"NEKBONE: Thermal Hydraulics Mini-Application. Quick Starter Guide, Release 2.1","author":"Fischer P.","year":"2013","unstructured":"P. Fischer and K. Heisey . 2013 . NEKBONE: Thermal Hydraulics Mini-Application. Quick Starter Guide, Release 2.1 , 1 st ed., 2013. P. Fischer and K. Heisey. 2013. NEKBONE: Thermal Hydraulics Mini-Application. Quick Starter Guide, Release 2.1, 1st ed., 2013.","edition":"1"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3459010","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3459010","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:24:55Z","timestamp":1750195495000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3459010"}},"subtitle":["Replacing Native-Code Idioms with High-Performance Library Calls"],"short-title":[],"issued":{"date-parts":[[2021,6,28]]},"references-count":48,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2021,9,30]]}},"alternative-id":["10.1145\/3459010"],"URL":"https:\/\/doi.org\/10.1145\/3459010","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,28]]},"assertion":[{"value":"2020-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-06-28","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}