{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,27]],"date-time":"2026-06-27T00:18:35Z","timestamp":1782519515405,"version":"3.54.5"},"publisher-location":"New York, NY, USA","reference-count":73,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,6,18]],"date-time":"2021-06-18T00:00:00Z","timestamp":1623974400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Natural Science Foundation of China","award":["U20A20226,61702546"],"award-info":[{"award-number":["U20A20226,61702546"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,6,19]]},"DOI":"10.1145\/3453483.3454106","type":"proceedings-article","created":{"date-parts":[[2021,6,18]],"date-time":"2021-06-18T13:51:32Z","timestamp":1624024292000},"page":"1233-1248","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":71,"title":["AKG: automatic kernel generation for neural processing units using polyhedral transformations"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2303-9736","authenticated-orcid":false,"given":"Jie","family":"Zhao","sequence":"first","affiliation":[{"name":"State Key Laboratory of Mathematical Engineering and Advanced Computing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bojie","family":"Li","sequence":"additional","affiliation":[{"name":"Huawei Technologies, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Wang","family":"Nie","sequence":"additional","affiliation":[{"name":"Huawei Technologies, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zhen","family":"Geng","sequence":"additional","affiliation":[{"name":"Huawei Technologies, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Renwei","family":"Zhang","sequence":"additional","affiliation":[{"name":"Huawei Technologies, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiong","family":"Gao","sequence":"additional","affiliation":[{"name":"Huawei Technologies, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bin","family":"Cheng","sequence":"additional","affiliation":[{"name":"Huawei Technologies, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Chen","family":"Wu","sequence":"additional","affiliation":[{"name":"Huawei, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yun","family":"Cheng","sequence":"additional","affiliation":[{"name":"Huawei Technologies, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zheng","family":"Li","sequence":"additional","affiliation":[{"name":"Huawei Technologies, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Peng","family":"Di","sequence":"additional","affiliation":[{"name":"Huawei Technologies, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kun","family":"Zhang","sequence":"additional","affiliation":[{"name":"Huawei Technologies, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xuefeng","family":"Jin","sequence":"additional","affiliation":[{"name":"Huawei Technologies, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2021,6,18]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916)","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga,
Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-scale Machine Learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916). USENIX Association, Berkeley, CA, USA. 265\u2013283. isbn:978-1-931971-33-1 http:\/\/dl.acm.org\/citation.cfm?id=3026877.3026899"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3306346.3322967"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2628071.2628092"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2019.8661197"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1654059.1654119"},{"key":"e_1_3_2_1_6_1","unstructured":"Somashekaracharya G. Bhaskaracharya, Julien Demouth, and Vinod Grover. 2020. Automatic Kernel Generation for Volta Tensor Cores. arxiv:2006.12645."},{"key":"e_1_3_2_1_7_1","unstructured":"Uday Bondhugula. 2020.
High Performance Code Generation in MLIR: An Early Case Study with GEMM. arxiv:2003.00532."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1854273.1854317"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1375581.1375595"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3410463.3414635"},{"key":"e_1_3_2_1_11_1","unstructured":"Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. arxiv:1512.01274."},{"key":"e_1_3_2_1_12_1","volume-title":"Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201918)","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: An Automated End-to-end Optimizing Compiler for Deep Learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201918). USENIX Association, Berkeley, CA, USA. 579\u2013594.
isbn:978-1-931971-47-8 http:\/\/dl.acm.org\/citation.cfm?id=3291168.3291211"},{"key":"e_1_3_2_1_13_1","unstructured":"Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. Learning to optimize tensor programs. In Advances in Neural Information Processing Systems. 3389\u20133400."},{"key":"e_1_3_2_1_14_1","volume-title":"Adam Procter, and Tristan J. Webb.","author":"Cyphers Scott","year":"2018","unstructured":"Scott Cyphers, Arjun K. Bansal, Anahita Bhiwandiwalla, Jayaram Bobba, Matthew Brookhart, Avijit Chakraborty, Will Constable, Christian Convey, Leona Cook, Omar Kanawi, Robert Kimball, Jason Knight, Nikolay Korovaiko, Varun Kumar, Yixing Lao, Christopher R. Lishka, Jaikrishnan Menon, Jennifer Myers, Sandeep Aswath Narayana, Adam Procter, and Tristan J. Webb. 2018. Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning.
arxiv:1801.08058."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1423"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3211346.3211354"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01379404"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-09766-4_502"},{"key":"e_1_3_2_1_19_1","volume-title":"Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP\u201998 (Cat. No.98CH36181)","volume":"3","author":"Frigo M.","unstructured":"M. Frigo and S. G. Johnson. 1998. FFTW: an adaptive software architecture for the FFT. In Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP\u201998 (Cat. No.98CH36181). 3, 1381\u20131384 vol.3."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2743016"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2909437.2909443"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3410463.3414632"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1542275.1542301"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_25_1","unstructured":"Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017.
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arxiv:1704.04861."},{"key":"e_1_3_2_1_26_1","unstructured":"Huawei. 2021. MindSpore. https:\/\/www.mindspore.cn\/en"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/73560.73588"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178487.3178507"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3410463.3414649"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2654889"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341301.3359630"},{"key":"e_1_3_2_1_32_1","unstructured":"Zhe Jia, Blake Tillman, Marco Maggioni, and Daniele Paolo Scarpazza. 2019. Dissecting the Graphcore IPU Architecture via Microbenchmarking. arxiv:1912.03413."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_3_2_1_34_1","volume-title":"Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing. Springer-Verlag","author":"Kennedy Ken","unstructured":"Ken Kennedy and Kathryn S. McKinley. 1993. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution. In Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing. Springer-Verlag, Berlin, Heidelberg. 301\u2013320.
isbn:3540576592"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1362622.1362691"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3314221.3314653"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2491956.2462187"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250734.1250761"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3065386"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO51591.2021.9370308"},{"key":"e_1_3_2_1_41_1","volume-title":"XLA: TensorFlow, compiled. TensorFlow Dev Summit.","author":"Leary Chris","year":"2017","unstructured":"Chris Leary and Todd Wang. 2017. XLA: TensorFlow, compiled. TensorFlow Dev Summit."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201383"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/HOTCHIPS.2019.8875654"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.42"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"e_1_3_2_1_46_1","volume-title":"Optimizing CNN Model Inference on CPUs. In 2019 USENIX Annual Technical Conference (USENIX ATC 19)","author":"Liu Yizhi","year":"2019","unstructured":"Yizhi Liu, Yao Wang, Ruofei Yu, Mu Li, Vin Sharma, and Yida Wang. 2019. Optimizing CNN Model Inference on CPUs. In 2019 USENIX Annual Technical Conference (USENIX ATC 19). USENIX Association, Renton, WA. 1025\u20131040.
isbn:978-1-939133-03-8 https:\/\/www.usenix.org\/conference\/atc19\/presentation\/liu-yizhi"},{"key":"e_1_3_2_1_47_1","volume-title":"14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)","author":"Ma Lingxiao","year":"2020","unstructured":"Lingxiao Ma, Zhiqiang Xie, Zhi Yang, Jilong Xue, Youshan Miao, Wei Cui, Wenxiang Hu, Fan Yang, Lintao Zhang, and Lidong Zhou. 2020. Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20). USENIX Association, 881\u2013897. isbn:978-1-939133-19-9 https:\/\/www.usenix.org\/conference\/osdi20\/presentation\/ma"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/233561.233564"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2555243.2555250"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925952"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2694344.2694364"},{"key":"e_1_3_2_1_52_1","volume-title":"11th International Workshop on Polyhedral Compilation Techniques (IMPACT","author":"Parashar Angshuman","year":"2021","unstructured":"Angshuman Parashar, Prasanth Chatarasi, and Po-An Tsai. 2021.
Hardware Abstractions for targeting EDDO Architectures with the Polyhedral Model. In 11th International Workshop on Polyhedral Compilation Techniques (IMPACT 2021)."},{"key":"e_1_3_2_1_53_1","volume-title":"Pytorch: An imperative style, high-performance deep learning library. In Advances in neural information processing systems. 8026\u20138037.","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, and Luca Antiga. 2019. Pytorch: An imperative style, high-performance deep learning library. In Advances in neural information processing systems. 8026\u20138037."},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2016.56"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2491956.2462176"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.5555\/3433701.3433778"},{"key":"e_1_3_2_1_57_1","volume-title":"Glow: Graph Lowering Compiler Techniques for Neural Networks. arxiv:1805.00907.","author":"Rotem Nadav","year":"2018","unstructured":"
Nadav Rotem, Jordan Fix, Saleem Abdulrasool, Garret Catron, Summer Deng, Roman Dzhabarov, Nick Gibson, James Hegeman, Meghan Lele, Roman Levenstein, Jack Montgomery, Bert Maher, Satish Nadathur, Jakob Olesen, Jongsoo Park, Artem Rakhov, Misha Smelyanskiy, and Man Wang. 2018. Glow: Graph Lowering Compiler Techniques for Neural Networks. arxiv:1805.00907."},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/2451116.2451150"},{"key":"e_1_3_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2945397"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.5555\/800048.801719"},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/2908080.2908105"},{"key":"e_1_3_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/3355606"},{"key":"e_1_3_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15582-6_49"},{"key":"e_1_3_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/2400682.2400713"},{"key":"e_1_3_2_1_65_1","volume-title":"Scheduling for PPCG. Report CW, 706","author":"Verdoolaege Sven","year":"2017","unstructured":"Sven Verdoolaege and Gerda Janssens. 2017. Scheduling for PPCG. Report CW, 706 (2017)."},{"key":"e_1_3_2_1_66_1","volume-title":"Proceedings of the 1998 ACM\/IEEE Conference on Supercomputing (SC\u201998)","author":"Clint Whaley R.","year":"1998","unstructured":"R. Clint Whaley and Jack J. Dongarra. 1998. Automatically Tuned Linear Algebra Software. In Proceedings of the 1998 ACM\/IEEE Conference on Supercomputing (SC\u201998). IEEE Computer Society, USA. 1\u201327.
isbn:089791984X"},{"key":"e_1_3_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/2964284.2967243"},{"key":"e_1_3_2_1_68_1","volume-title":"Stripe: Tensor Compilation via the Nested Polyhedral Model. arxiv:1903.06498.","author":"Zerrell Tim","year":"2019","unstructured":"Tim Zerrell and Jeremy Bruestle. 2019. Stripe: Tensor Compilation via the Nested Polyhedral Model. arxiv:1903.06498."},{"key":"e_1_3_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3369382"},{"key":"e_1_3_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO50266.2020.00044"},{"key":"e_1_3_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/3307650.3322226"},{"key":"e_1_3_2_1_72_1","volume-title":"Ansor: Generating High-Performance Tensor Programs for Deep Learning. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)","author":"Zheng Lianmin","year":"2020","unstructured":"Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, and Ion Stoica. 2020. Ansor: Generating High-Performance Tensor Programs for Deep Learning. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20). USENIX Association, 863\u2013879.
isbn:978-1-939133-19-9 https:\/\/www.usenix.org\/conference\/osdi20\/presentation\/zheng"},{"key":"e_1_3_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373376.3378508"}],"event":{"name":"PLDI '21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","location":"Virtual Canada","acronym":"PLDI '21","sponsor":["SIGPLAN ACM Special Interest Group on Programming Languages"]},"container-title":["Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3453483.3454106","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3453483.3454106","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:47:48Z","timestamp":1750193268000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3453483.3454106"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,18]]},"references-count":73,"alternative-id":["10.1145\/3453483.3454106","10.1145\/3453483"],"URL":"https:\/\/doi.org\/10.1145\/3453483.3454106","relation":{},"subject":[],"published":{"date-parts":[[2021,6,18]]},"assertion":[{"value":"2021-06-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}