{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T10:19:15Z","timestamp":1770545955753,"version":"3.49.0"},"reference-count":114,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2020,11,25]],"date-time":"2020-11-25T00:00:00Z","timestamp":1606262400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000185","name":"Defense Advanced Research Projects Agency","doi-asserted-by":"publisher","award":["FA8750-19C-0003"],"award-info":[{"award-number":["FA8750-19C-0003"]}],"id":[{"id":"10.13039\/100000185","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Science Foundation","award":["CCF-1618039, SHF-1652132, CCF-1908633"],"award-info":[{"award-number":["CCF-1618039, SHF-1652132, CCF-1908633"]}]},{"DOI":"10.13039\/100006602","name":"Air Force Research Laboratory","doi-asserted-by":"publisher","award":["FA8750-19-1-0501"],"award-info":[{"award-number":["FA8750-19-1-0501"]}],"id":[{"id":"10.13039\/100006602","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2020,12,31]]},"abstract":"<jats:p>GPUs are a key enabler of the revolution in machine learning and high-performance computing, functioning as de facto co-processors to accelerate large-scale computation. As the programming stack and tool support have matured, GPUs have also become accessible to programmers, who may lack detailed knowledge of the underlying architecture and fail to fully leverage the GPU\u2019s computation power. 
GEVO (Gpu optimization using EVOlutionary computation) is a tool for automatically discovering optimization opportunities and tuning the performance of GPU kernels in the LLVM representation. GEVO uses population-based search to find edits to GPU code compiled to LLVM-IR and improves performance on desired criteria while retaining required functionality. We demonstrate that GEVO improves the execution time of general-purpose GPU programs and machine learning (ML) models on NVIDIA Tesla P100. For the Rodinia benchmarks, GEVO improves GPU kernel runtime performance by an average of 49.48% and by as much as 412% over the fully compiler-optimized baseline. If kernel output accuracy is relaxed to tolerate up to 1% error, GEVO can find kernel variants that outperform the baseline by an average of 51.08%. For the ML workloads, GEVO achieves kernel performance improvement for SVM on the MNIST handwriting recognition (3.24\u00d7) and the a9a income prediction (2.93\u00d7) datasets with no loss of model accuracy. 
GEVO achieves 1.79\u00d7 kernel performance improvement on image classification using ResNet18\/CIFAR-10, with less than 1% model accuracy reduction.<\/jats:p>","DOI":"10.1145\/3418055","type":"journal-article","created":{"date-parts":[[2020,11,26]],"date-time":"2020-11-26T19:02:10Z","timestamp":1606417330000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["GEVO"],"prefix":"10.1145","volume":"17","author":[{"given":"Jhe-Yu","family":"Liou","sequence":"first","affiliation":[{"name":"Arizona State University, Tempe, AZ"}]},{"given":"Xiaodong","family":"Wang","sequence":"additional","affiliation":[{"name":"Facebook, Menlo Park, CA"}]},{"given":"Stephanie","family":"Forrest","sequence":"additional","affiliation":[{"name":"Arizona State University and Santa Fe Institute, Santa Fe, NM"}]},{"given":"Carole-Jean","family":"Wu","sequence":"additional","affiliation":[{"name":"Arizona State University and Facebook, Menlo Park, CA"}]}],"member":"320","published-online":{"date-parts":[[2020,11,25]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"TensorFlow. 2018. XLA is a compiler that optimizes TensorFlow computations. Retrieved from https:\/\/www.tensorflow.org\/xla\/.  TensorFlow. 2018. XLA is a compiler that optimizes TensorFlow computations. Retrieved from https:\/\/www.tensorflow.org\/xla\/."},{"key":"e_1_2_1_2_1","unstructured":"Advanced Micro Devices Inc. 2020. AMD Exascale Supercomputer. Retrieved from https:\/\/www.amd.com\/en\/products\/exascale-era.  Advanced Micro Devices Inc. 2020. AMD Exascale Supercomputer. 
Retrieved from https:\/\/www.amd.com\/en\/products\/exascale-era."},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation.","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , Manjunath Kudlur , Josh Levenberg , Rajat Monga , Sherry Moore , Derek G. Murray , Benoit Steiner , Paul Tucker , Vijay Vasudevan , Pete Warden , Martin Wicke , Yuan Yu , and Xiaoqiang Zheng . 2016 . TensorFlow: A system for large-scale machine learning . In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation. Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2858788.2688523"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2008.01.047"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3140659.3080231"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2016.7581276"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-1-55860-377-6.50014-1"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1168919.1168906"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-46669-8_12"},{"key":"e_1_2_1_11_1","volume-title":"Automatic software diversity in the light of test suites. 
arXiv preprint arXiv:1509.00144","author":"Baudry Benoit","year":"2015","unstructured":"Benoit Baudry , Simon Allier , Marcelino Rodriguez-Cancio , and Martin Monperrus . 2015. Automatic software diversity in the light of test suites. arXiv preprint arXiv:1509.00144 ( 2015 ). Benoit Baudry, Simon Allier, Marcelino Rodriguez-Cancio, and Martin Monperrus. 2015. Automatic software diversity in the light of test suites. arXiv preprint arXiv:1509.00144 (2015)."},{"key":"e_1_2_1_12_1","volume-title":"Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940","author":"Bello Irwan","year":"2016","unstructured":"Irwan Bello , Hieu Pham , Quoc V. Le , Mohammad Norouzi , and Samy Bengio . 2016. Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940 ( 2016 ). Irwan Bello, Hieu Pham, Quoc V. Le, Mohammad Norouzi, and Samy Bengio. 2016. Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940 (2016)."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the International Conference on Machine Learning.","author":"Bender Gabriel","year":"2018","unstructured":"Gabriel Bender , Pieter-Jan Kindermans , Barret Zoph , Vijay Vasudevan , and Quoc Le . 2018 . Understanding and simplifying one-shot architecture search . In Proceedings of the International Conference on Machine Learning. Gabriel Bender, Pieter-Jan Kindermans, Barret Zoph, Vijay Vasudevan, and Quoc Le. 2018. Understanding and simplifying one-shot architecture search. In Proceedings of the International Conference on Machine Learning."},{"key":"e_1_2_1_14_1","first-page":"281","article-title":"Random search for hyper-parameter optimization","author":"Bergstra James","year":"2012","unstructured":"James Bergstra and Yoshua Bengio . 2012 . Random search for hyper-parameter optimization . J. Mach. Learn. Res. 13 , Feb. (2012), 281 -- 305 . James Bergstra and Yoshua Bengio. 2012. 
Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, Feb. (2012), 281--305.","journal-title":"J. Mach. Learn. Res. 13"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2739480.2754752"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2018.2827066"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.1999.782671"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-22183-0_20"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2019.00046"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1961189.1961199"},{"key":"e_1_2_1_21_1","volume-title":"MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274","author":"Chen Tianqi","year":"2015","unstructured":"Tianqi Chen , Mu Li , Yutian Li , Min Lin , Naiyan Wang , Minjie Wang , Tianjun Xiao , Bing Xu , Chiyuan Zhang , and Zheng Zhang . 2015. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274 ( 2015 ). Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274 (2015)."},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of 13th USENIX Symposium on Operating Systems Design and Implementation.","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen , Thierry Moreau , Ziheng Jiang , Lianmin Zheng , Eddie Yan , Haichen Shen , Meghan Cowan , Leyuan Wang , Yuwei Hu , Luis Ceze , et\u00a0al. 2018 . TVM: An automated end-to-end optimizing compiler for deep learning . In Proceedings of 13th USENIX Symposium on Operating Systems Design and Implementation. 
Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, et\u00a0al. 2018. TVM: An automated end-to-end optimizing compiler for deep learning. In Proceedings of 13th USENIX Symposium on Operating Systems Design and Implementation."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.40"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3093337.3037754"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2330784.2330799"},{"key":"e_1_2_1_26_1","volume-title":"A fast and elitist multiobjective genetic algorithm: NSGA-II","author":"Deb Kalyanmoy","year":"2002","unstructured":"Kalyanmoy Deb , Samir Agrawal , Amrit Pratap , and Tanaka Meyarivan . 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II . IEEE Trans. Evol. Comput . ( 2002 ). Kalyanmoy Deb, Samir Agrawal, Amrit Pratap, and Tanaka Meyarivan. 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. (2002)."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of 3rd International Conference on Software Testing, Verification and Validation.","author":"Debroy Vidroha","unstructured":"Vidroha Debroy and W. Eric Wong . 2010. Using mutation to automatically suggest fixes for faulty programs . In Proceedings of 3rd International Conference on Software Testing, Verification and Validation. Vidroha Debroy and W. Eric Wong. 2010. Using mutation to automatically suggest fixes for faulty programs. In Proceedings of 3rd International Conference on Software Testing, Verification and Validation."},{"key":"e_1_2_1_28_1","volume-title":"Modha","author":"Dhillon Inderjit S.","year":"2002","unstructured":"Inderjit S. Dhillon and Dharmendra S . Modha . 2002 . A data-clustering algorithm on distributed memory multiprocessors. In Large-scale Parallel Data Mining. Springer , 245--260. Inderjit S. Dhillon and Dharmendra S. Modha. 2002. 
A data-clustering algorithm on distributed memory multiprocessors. In Large-scale Parallel Data Mining. Springer, 245--260."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2017.2775634"},{"key":"e_1_2_1_30_1","unstructured":"Facebook. 2018. Finding and Fixing Software Bugs Automatically with Sapfix and Sapienz. Retrieved from https:\/\/code.fb.com\/developer-tools\/finding-and-fixing-software-bugs-automatically-with-sapfix-and-sapienz\/.  Facebook. 2018. Finding and Fixing Software Bugs Automatically with Sapfix and Sapienz. Retrieved from https:\/\/code.fb.com\/developer-tools\/finding-and-fixing-software-bugs-automatically-with-sapfix-and-sapienz\/."},{"key":"e_1_2_1_31_1","unstructured":"Facebook. 2019. Caffe2. Retrieved from https:\/\/caffe2.ai\/.  Facebook. 2019. Caffe2. Retrieved from https:\/\/caffe2.ai\/."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1569901.1570031"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2011.104"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1993316.1993506"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 3rd Conference on Machine Learning and Systems (ML-Sys\u201920)","author":"Haj-Ali Ameer","year":"2020","unstructured":"Ameer Haj-Ali , Qijing Huang , William Moses , John Xiang , John Wawrzynek , Krste Asanovic , and Ion Stoica . 2020 . AutoPhase: Juggling HLS phase orderings in random forests with deep reinforcement learning . In Proceedings of the 3rd Conference on Machine Learning and Systems (ML-Sys\u201920) . Ameer Haj-Ali, Qijing Huang, William Moses, John Xiang, John Wawrzynek, Krste Asanovic, and Ion Stoica. 2020. AutoPhase: Juggling HLS phase orderings in random forests with deep reinforcement learning. 
In Proceedings of the 3rd Conference on Machine Learning and Systems (ML-Sys\u201920)."},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the Genetic and Evolutionary Computation Conference.","author":"Haraldsson Saemundur O.","unstructured":"Saemundur O. Haraldsson , John R. Woodward , Alexander, E. I. Brownlee , A. V. Smith , and V. Gudnason . 2017. Genetic improvement of runtime and its fitness landscape in a bioinformatics application . In Proceedings of the Genetic and Evolutionary Computation Conference. Saemundur O. Haraldsson, John R. Woodward, Alexander, E. I. Brownlee, A. V. Smith, and V. Gudnason. 2017. Genetic improvement of runtime and its fitness landscape in a bioinformatics application. In Proceedings of the Genetic and Evolutionary Computation Conference."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3067695.3082517"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2018.00059"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.351"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2304576.2304582"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341301.3359630"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.709614"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the International Conference on Advances in Neural Information Processing Systems.","author":"Kandasamy Kirthevasan","unstructured":"Kirthevasan Kandasamy , Willie Neiswanger , Jeff Schneider , Barnabas Poczos , and Eric P. Xing . 2018. Neural architecture search with Bayesian optimisation and optimal transport . In Proceedings of the International Conference on Advances in Neural Information Processing Systems. Kirthevasan Kandasamy, Willie Neiswanger, Jeff Schneider, Barnabas Poczos, and Eric P. Xing. 2018. Neural architecture search with Bayesian optimisation and optimal transport. 
In Proceedings of the International Conference on Advances in Neural Information Processing Systems."},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the International Conference on Advances in Neural Information Processing Systems. 1097--1105","author":"Krizhevsky Alex","unstructured":"Alex Krizhevsky , Ilya Sutskever , and Geoffrey E. Hinton . 2012. Imagenet classification with deep convolutional neural networks . In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 1097--1105 . Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 1097--1105."},{"key":"e_1_2_1_46_1","volume-title":"Proceedings of the IEEE Congress on Evolutionary Computation.","author":"William","unstructured":"William B. Langdon and Mark Harman. 2010. Evolving a CUDA kernel from an nVidia template . In Proceedings of the IEEE Congress on Evolutionary Computation. William B. Langdon and Mark Harman. 2010. Evolving a CUDA kernel from an nVidia template. In Proceedings of the IEEE Congress on Evolutionary Computation."},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of 17th European Conference on Genetic Programming.","author":"William","unstructured":"William B. Langdon and Mark Harman. 2014. Genetically improved CUDA C++ software . In Proceedings of 17th European Conference on Genetic Programming. William B. Langdon and Mark Harman. 2014. Genetically improved CUDA C++ software. In Proceedings of 17th European Conference on Genetic Programming."},{"key":"e_1_2_1_48_1","volume-title":"Proceedings of the 17th Conference on Genetic and Evolutionary Computation.","author":"William","unstructured":"William B. Langdon and Mark Harman. 2015. Grow and graft a better CUDA pknotsRG for RNA Pseudoknot free energy calculation . 
In Proceedings of the 17th Conference on Genetic and Evolutionary Computation. William B. Langdon and Mark Harman. 2015. Grow and graft a better CUDA pknotsRG for RNA Pseudoknot free energy calculation. In Proceedings of the 17th Conference on Genetic and Evolutionary Computation."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2739480.2754652"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/1273496.1273556"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2012.6227211"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2009.21"},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the 2nd Conference on the Genetic and Evolutionary Computation Conference.","author":"Lee C.-Y.","unstructured":"C.-Y. Lee and E. K. Antonsson . 2000. Variable length genomes for evolutionary algorithms . In Proceedings of the 2nd Conference on the Genetic and Evolutionary Computation Conference. C.-Y. Lee and E. K. Antonsson. 2000. Variable length genomes for evolutionary algorithms. In Proceedings of the 2nd Conference on the Genetic and Evolutionary Computation Conference."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCD.2016.7753271"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/GI.2019.00014"},{"key":"e_1_2_1_57_1","volume-title":"Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems: Wild and Crazy Idea session.","author":"Liou Jhe-Yu","year":"2019","unstructured":"Jhe-Yu Liou , Stephanie Forrest , and Carole-Jean Wu . 2019 . Uncovering Performance Opportunities by Relaxing Program Semantics of GPGPU Kernels . In Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems: Wild and Crazy Idea session. 
Jhe-Yu Liou, Stephanie Forrest, and Carole-Jean Wu. 2019. Uncovering Performance Opportunities by Relaxing Program Semantics of GPGPU Kernels. In Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems: Wild and Crazy Idea session."},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01246-5_2"},{"key":"e_1_2_1_59_1","volume-title":"Hierarchical representations for efficient architecture search. arXiv preprint arXiv:1711.00436","author":"Liu Hanxiao","year":"2017","unstructured":"Hanxiao Liu , Karen Simonyan , Oriol Vinyals , Chrisantha Fernando , and Koray Kavukcuoglu . 2017. Hierarchical representations for efficient architecture search. arXiv preprint arXiv:1711.00436 ( 2017 ). Hanxiao Liu, Karen Simonyan, Oriol Vinyals, Chrisantha Fernando, and Koray Kavukcuoglu. 2017. Hierarchical representations for efficient architecture search. arXiv preprint arXiv:1711.00436 (2017)."},{"key":"e_1_2_1_60_1","volume-title":"DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055","author":"Liu Hanxiao","year":"2018","unstructured":"Hanxiao Liu , Karen Simonyan , and Yiming Yang . 2018 . DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055 (2018). Hanxiao Liu, Karen Simonyan, and Yiming Yang. 2018. DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055 (2018)."},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.298"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2013.44"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1145\/2568225.2568297"},{"key":"e_1_2_1_64_1","volume-title":"Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems.","author":"Massalin Henry","year":"1987","unstructured":"Henry Massalin . 1987 . Superoptimizer: A look at the smallest program . 
In Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems. Henry Massalin. 1987. Superoptimizer: A look at the smallest program. In Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems."},{"key":"e_1_2_1_65_1","volume-title":"Taylor Robie, Tom St. John, Carole-Jean Wu, Lingjie Xu, Cliff Young, and Matei Zaharia.","author":"Mattson Peter","year":"2019","unstructured":"Peter Mattson , Christine Cheng , Cody Coleman , Greg Diamos , Paulius Micikevicius , David Patterson , Hanlin Tang , Gu-Yeon Wei , Peter Bailis , Victor Bittorf , David Brooks , Dehao Chen , Debojyoti Dutta , Udit Gupta , Kim Hazelwood , Andrew Hock , Xinyuan Huang , Bill Jia , Daniel Kang , David Kanter , Naveen Kumar , Jeffery Liao , Deepak Narayanan , Tayo Oguntebi , Gennady Pekhimenko , Lillian Pentecost , Vijay Janapa Reddi , Taylor Robie, Tom St. John, Carole-Jean Wu, Lingjie Xu, Cliff Young, and Matei Zaharia. 2019 . MLPerf training benchmark. arXiv preprint arXiv:1910.01500 (2019). Peter Mattson, Christine Cheng, Cody Coleman, Greg Diamos, Paulius Micikevicius, David Patterson, Hanlin Tang, Gu-Yeon Wei, Peter Bailis, Victor Bittorf, David Brooks, Dehao Chen, Debojyoti Dutta, Udit Gupta, Kim Hazelwood, Andrew Hock, Xinyuan Huang, Bill Jia, Daniel Kang, David Kanter, Naveen Kumar, Jeffery Liao, Deepak Narayanan, Tayo Oguntebi, Gennady Pekhimenko, Lillian Pentecost, Vijay Janapa Reddi, Taylor Robie, Tom St. John, Carole-Jean Wu, Lingjie Xu, Cliff Young, and Matei Zaharia. 2019. MLPerf training benchmark. 
arXiv preprint arXiv:1910.01500 (2019)."},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2020.2974843"},{"key":"e_1_2_1_67_1","first-page":"148","article-title":"An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix","volume":"31","author":"Andvandervorst Meijerink J.","year":"1977","unstructured":"J. Andvandervorst Meijerink and Henk A. Van Der Vorst . 1977 . An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix . Math. of Comput. 31 , 137 (1977), 148 -- 162 . DOI:https:\/\/doi.org\/10.2307\/2005786 J. Andvandervorst Meijerink and Henk A. Van Der Vorst. 1977. An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix. Math. of Comput. 31, 137 (1977), 148--162. DOI:https:\/\/doi.org\/10.2307\/2005786","journal-title":"Math. of Comput."},{"key":"e_1_2_1_68_1","volume-title":"Proceedings of the International Conference on Advances in Neural Information Processing Systems. 14598--14609","author":"Mendis Charith","year":"2019","unstructured":"Charith Mendis , Cambridge Yang , Yewen Pu , Saman Amarasinghe , and Michael Carbin . 2019 . Compiler auto-vectorization with imitation learning . In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 14598--14609 . Charith Mendis, Cambridge Yang, Yewen Pu, Saman Amarasinghe, and Michael Carbin. 2019. Compiler auto-vectorization with imitation learning. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 14598--14609."},{"key":"e_1_2_1_69_1","first-page":"193","article-title":"Genetic algorithms, tournament selection, and the effects of noise","volume":"9","author":"Miller Brad L.","year":"1995","unstructured":"Brad L. Miller , David E. Goldberg , et\u00a0al. 1995 . Genetic algorithms, tournament selection, and the effects of noise . 
Complex Systems 9 , 3 (1995), 193 -- 212 . Retrieved from https:\/\/www.complex-systems.com\/abstracts\/v09_i03_a02\/. Brad L. Miller, David E. Goldberg, et\u00a0al. 1995. Genetic algorithms, tournament selection, and the effects of noise. Complex Systems 9, 3 (1995), 193--212. Retrieved from https:\/\/www.complex-systems.com\/abstracts\/v09_i03_a02\/.","journal-title":"Complex Systems"},{"key":"e_1_2_1_70_1","volume-title":"Proceedings of International Conference on Learning Representations.","author":"Molchanov Pavlo","year":"2016","unstructured":"Pavlo Molchanov , Stephen Tyree , Tero Karras , Timo Aila , and Jan Kautz . 2016 . Pruning convolutional neural networks for resource efficient inference . In Proceedings of International Conference on Learning Representations. Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, and Jan Kautz. 2016. Pruning convolutional neural networks for resource efficient inference. In Proceedings of International Conference on Learning Representations."},{"key":"e_1_2_1_71_1","volume-title":"Proceedings of the International Joint Conferences on Artificial Intelligence.","author":"David","unstructured":"David J. Montana and Lawrence Davis. 1989. Training feedforward neural networks using genetic algorithms . In Proceedings of the International Joint Conferences on Artificial Intelligence. David J. Montana and Lawrence Davis. 1989. Training feedforward neural networks using genetic algorithms. In Proceedings of the International Joint Conferences on Artificial Intelligence."},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-78800-3_24"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/2517349.2522739"},{"key":"e_1_2_1_74_1","volume-title":"NeurIPS Autodiff Workshop.","author":"Paszke Adam","year":"2017","unstructured":"Adam Paszke , Sam Gross , Soumith Chintala , Gregory Chanan , Edward Yang , Zachary DeVito , Zeming Lin , Alban Desmaison , Luca Antiga , and Adam Lerer . 2017 . 
Automatic differentiation in PyTorch . In NeurIPS Autodiff Workshop. Retrieved from https:\/\/openreview.net\/forum?id=BJJsrmfCZ. Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch. In NeurIPS Autodiff Workshop. Retrieved from https:\/\/openreview.net\/forum?id=BJJsrmfCZ."},{"key":"e_1_2_1_75_1","volume-title":"Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation.","author":"Pettis Karl","unstructured":"Karl Pettis and Robert C. Hansen . 1990. Profile guided code positioning . In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation. Karl Pettis and Robert C. Hansen. 1990. Profile guided code positioning. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation."},{"key":"e_1_2_1_76_1","volume-title":"Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268","author":"Pham Hieu","year":"2018","unstructured":"Hieu Pham , Melody Y. Guan , Barret Zoph , Quoc V. Le , and Jeff Dean . 2018. Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268 ( 2018 ). Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, and Jeff Dean. 2018. Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268 (2018)."},{"key":"e_1_2_1_77_1","volume-title":"Advances in Kernel Methods: Support Vector Learning","author":"Platt John C.","unstructured":"John C. Platt . 1999. Fast training of support vector machines using sequential minimal optimization . In Advances in Kernel Methods: Support Vector Learning . MIT Press , Cambridge, MA, USA , 185--208. https:\/\/dl.acm.org\/doi\/10.5555\/299094.299105 John C. Platt. 1999. Fast training of support vector machines using sequential minimal optimization. 
In Advances in Kernel Methods: Support Vector Learning. MIT Press, Cambridge, MA, USA, 185--208. https:\/\/dl.acm.org\/doi\/10.5555\/299094.299105"},{"key":"e_1_2_1_78_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"33","author":"Real Esteban","unstructured":"Esteban Real , Alok Aggarwal , Yanping Huang , and Quoc V. Le . 2019. Regularized evolution for image classifier architecture search . In Proceedings of the AAAI Conference on Artificial Intelligence , Vol. 33 . 4780--4789. Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V. Le. 2019. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 4780--4789."},{"key":"e_1_2_1_79_1","volume-title":"Proceedings of the 34th International Conference on Machine Learning","volume":"70","author":"Real Esteban","year":"2017","unstructured":"Esteban Real , Sherry Moore , Andrew Selle , Saurabh Saxena , Yutaka Leon Suematsu , Jie Tan , Quoc V. Le , and Alexey Kurakin . 2017 . Large-scale evolution of image classifiers . In Proceedings of the 34th International Conference on Machine Learning , Vol. 70 . JMLR. org. Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc V. Le, and Alexey Kurakin. 2017. Large-scale evolution of image classifiers. In Proceedings of the 34th International Conference on Machine Learning, Vol. 70. JMLR. org."},{"key":"e_1_2_1_80_1","volume-title":"Proceedings of the International Conference on Advances in Neural Information Processing Systems.","author":"Recht Benjamin","year":"2011","unstructured":"Benjamin Recht , Christopher Re , Stephen Wright , and Feng Niu . 2011 . Hogwild: A lock-free approach to parallelizing stochastic gradient descent . In Proceedings of the International Conference on Advances in Neural Information Processing Systems. Benjamin Recht, Christopher Re, Stephen Wright, and Feng Niu. 2011. 
Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In Proceedings of the International Conference on Advances in Neural Information Processing Systems."},{"key":"e_1_2_1_81_1","unstructured":"Vijay Janapa Reddi Christine Cheng David Kanter Peter Mattson Guenther Schmuelling Carole-Jean Wu Brian Anderson Maximilien Breughe Mark Charlebois William Chou Ramesh Chukka Cody Coleman Sam Davis Pan Deng Greg Diamos Jared Duke Dave Fick J. Scott Gardner Itay Hubara Sachin Idgunji Thomas B. Jablin Jeff Jiao Tom St. John Pankaj Kanwar David Lee Jeffery Liao Anton Lokhmotov Francisco Massa Peng Meng Paulius Micikevicius Colin Osborne Gennady Pekhimenko Arun Tejusve Raghunath Rajan Dilip Sequeira Ashish Sirasao Fei Sun Hanlin Tang Michael Thomson Frank Wei Ephrem Wu Lingjie Xu Koichi Yamada Bing Yu George Yuan Aaron Zhong Peizhao Zhang and Yuchen Zhou. MLPerf inference benchmark. arXiv preprint arXiv:1911.02549 (2019).  Vijay Janapa Reddi Christine Cheng David Kanter Peter Mattson Guenther Schmuelling Carole-Jean Wu Brian Anderson Maximilien Breughe Mark Charlebois William Chou Ramesh Chukka Cody Coleman Sam Davis Pan Deng Greg Diamos Jared Duke Dave Fick J. Scott Gardner Itay Hubara Sachin Idgunji Thomas B. Jablin Jeff Jiao Tom St. John Pankaj Kanwar David Lee Jeffery Liao Anton Lokhmotov Francisco Massa Peng Meng Paulius Micikevicius Colin Osborne Gennady Pekhimenko Arun Tejusve Raghunath Rajan Dilip Sequeira Ashish Sirasao Fei Sun Hanlin Tang Michael Thomson Frank Wei Ephrem Wu Lingjie Xu Koichi Yamada Bing Yu George Yuan Aaron Zhong Peizhao Zhang and Yuchen Zhou. MLPerf inference benchmark. arXiv preprint arXiv:1911.02549 (2019)."},{"key":"e_1_2_1_82_1","volume-title":"et\u00a0al","author":"Rotem Nadav","year":"2018","unstructured":"Nadav Rotem , Jordan Fix , Saleem Abdulrasool , Summer Deng , Roman Dzhabarov , James Hegeman , Roman Levenstein , Bert Maher , Satish Nadathur , Jakob Olesen , et\u00a0al . 2018 . 
Glow: Graph lowering compiler techniques for neural networks. arXiv preprint arXiv:1805.00907 (2018)."},{"key":"e_1_2_1_83_1","volume-title":"Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 73--82","author":"Ryoo Shane","unstructured":"Shane Ryoo, Christopher I. Rodrigues, Sara S. Baghsorkhi, Sam S. Stone, David B. Kirk, and Wen-mei W. Hwu. 2008. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 73--82."},{"key":"e_1_2_1_84_1","first-page":"6","article-title":"Online algorithms and stochastic approximations","volume":"5","author":"Saad David","year":"1998","unstructured":"David Saad. 1998. Online algorithms and stochastic approximations. Online Learn. 
5 (1998), 6--3.","journal-title":"Online Learn."},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1145\/2490301.2451150"},{"key":"e_1_2_1_86_1","doi-asserted-by":"publisher","DOI":"10.1145\/2666356.2594302"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.1145\/2451116.2451151"},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541980"},{"key":"e_1_2_1_90_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10710-013-9195-8"},{"key":"e_1_2_1_91_1","doi-asserted-by":"publisher","DOI":"10.1145\/2739482.2768427"},{"key":"e_1_2_1_92_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2015.71"},{"key":"e_1_2_1_93_1","doi-asserted-by":"publisher","DOI":"10.1145\/2814270.2814278"},{"key":"e_1_2_1_94_1","doi-asserted-by":"publisher","DOI":"10.1145\/2025113.2025133"},{"key":"e_1_2_1_95_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024156.2024186"},{"key":"e_1_2_1_96_1","doi-asserted-by":"publisher","DOI":"10.1162\/artl.2009.15.2.15202"},{"key":"e_1_2_1_97_1","doi-asserted-by":"publisher","DOI":"10.1162\/106365602320169811"},{"key":"e_1_2_1_98_1","doi-asserted-by":"publisher","DOI":"10.1093\/genetics\/144.1.419"},{"key":"e_1_2_1_99_1","doi-asserted-by":"publisher","DOI":"10.1145\/2487575.2487629"},{"key":"e_1_2_1_100_1","doi-asserted-by":"publisher","DOI":"10.1145\/2509578.2509586"},{"key":"e_1_2_1_101_1","doi-asserted-by":"publisher","DOI":"10.1145\/2594291.2594340"},{"key":"e_1_2_1_102_1","volume-title":"Proceedings of the 5th IEEE International Symposium on Signal Processing and Information Technology.","author":"Put Ludo Van","year":"2005","unstructured":"Ludo Van Put, Dominique Chanet, Bruno De Bus, Bjorn De Sutter, and Koen De Bosschere. 
2005. DIABLO: A reliable, retargetable and extensible link-time rewriting framework. In Proceedings of the 5th IEEE International Symposium on Signal Processing and Information Technology."},{"key":"e_1_2_1_103_1","doi-asserted-by":"publisher","DOI":"10.1145\/3067695.3082518"},{"key":"e_1_2_1_104_1","volume-title":"Proceedings of the 13th Conference on Genetic and Evolutionary Computation. ACM.","author":"Verbancsics Phillip","unstructured":"Phillip Verbancsics and Kenneth O. Stanley. 2011. Constraining connectivity to encourage modularity in HyperNEAT. In Proceedings of the 13th Conference on Genetic and Evolutionary Computation. ACM."},{"key":"e_1_2_1_105_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCC.2008.38"},{"key":"e_1_2_1_106_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2009.5070536"},{"key":"e_1_2_1_107_1","first-page":"1","article-title":"ThunderSVM: A fast SVM library on GPUs and CPUs","volume":"19","author":"Wen Zeyi","year":"2018","unstructured":"Zeyi Wen, Jiashuai Shi, Qinbin Li, Bingsheng He, and Jian Chen. 2018. ThunderSVM: A fast SVM library on GPUs and CPUs. J. Mach. Learn. Res. 19, 21 (2018), 1--5. Retrieved from http:\/\/jmlr.org\/papers\/v19\/17-740.html.","journal-title":"J. Mach. Learn. 
Res."},{"key":"e_1_2_1_108_1","doi-asserted-by":"publisher","DOI":"10.1109\/TEVC.2010.2083669"},{"key":"e_1_2_1_109_1","doi-asserted-by":"publisher","DOI":"10.1145\/2854038.2854041"},{"key":"e_1_2_1_110_1","volume-title":"Proceedings of the IEEE International Symposium on Parallel & Distributed Processing (IPDPS\u201910)","author":"Xiao Shucai","year":"2010","unstructured":"Shucai Xiao and Wu-chun Feng. 2010. Inter-block GPU communication via fast barrier synchronization. In Proceedings of the IEEE International Symposium on Parallel & Distributed Processing (IPDPS\u201910). IEEE."},{"key":"e_1_2_1_111_1","volume-title":"Proceedings of the IEEE International Conference on Computer Vision.","author":"Xie Lingxi","year":"2017","unstructured":"Lingxi Xie and Alan Yuille. 2017. Genetic CNN. In Proceedings of the IEEE International Conference on Computer Vision."},{"key":"e_1_2_1_112_1","doi-asserted-by":"publisher","DOI":"10.1109\/MDAT.2016.2630270"},{"key":"e_1_2_1_113_1","volume-title":"Proceedings of the ACM\/IEEE 45th International Symposium on Computer Architecture (ISCA\u201918)","author":"Yin Jieming","unstructured":"Jieming Yin, Zhifeng Lin, Onur Kayiran, Matthew Poremba, Muhammad Shoaib Bin Altaf, Natalie Enright Jerger, and Gabriel H. Loh. 2018. Modular routing design for chiplet-based systems. 
In Proceedings of the ACM\/IEEE 45th International Symposium on Computer Architecture (ISCA\u201918)."},{"key":"e_1_2_1_114_1","volume-title":"Proceedings of the International Conference on Advances in Neural Information Processing Systems. 685--693","author":"Zhang Sixin","year":"2015","unstructured":"Sixin Zhang, Anna E. Choromanska, and Yann LeCun. 2015. Deep learning with elastic averaging SGD. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 685--693."},{"key":"e_1_2_1_115_1","volume-title":"Le","author":"Zoph Barret","year":"2016","unstructured":"Barret Zoph and Quoc V. Le. 2016. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578 (2016)."},{"key":"e_1_2_1_116_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.","author":"Zoph Barret","unstructured":"Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V. Le. 2018. Learning transferable architectures for scalable image recognition. 
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition."}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3418055","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3418055","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3418055","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:31:35Z","timestamp":1750195895000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3418055"}},"subtitle":["GPU Code Optimization Using Evolutionary Computation"],"short-title":[],"issued":{"date-parts":[[2020,11,25]]},"references-count":114,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,12,31]]}},"alternative-id":["10.1145\/3418055"],"URL":"https:\/\/doi.org\/10.1145\/3418055","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,11,25]]},"assertion":[{"value":"2019-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-11-25","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}