{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T00:55:35Z","timestamp":1768870535113,"version":"3.49.0"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2021,4,23]],"date-time":"2021-04-23T00:00:00Z","timestamp":1619136000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Model. Comput. Simul."],"published-print":{"date-parts":[[2021,4,30]]},"abstract":"<jats:p>Hardware architectures become increasingly complex as the compute capabilities grow to exascale. We present the Analytical Memory Model with Pipelines (AMMP) of the Performance Prediction Toolkit (PPT). PPT-AMMP takes high-level source code and hardware architecture parameters as input and predicts the runtime of that code on the target hardware platform, which is defined in the input parameters. PPT-AMMP transforms the code to an (architecture-independent) intermediate representation, then (i) analyzes the basic block structure of the code, (ii) processes architecture-independent virtual memory access patterns that it uses to build memory reuse distance distribution models for each basic block, and (iii) runs detailed basic-block level simulations to determine hardware pipeline usage.<\/jats:p>\n          <jats:p>PPT-AMMP uses machine learning and regression techniques to build the prediction models based on small instances of the input code, then integrates into a higher-order discrete-event simulation model of PPT running on the Simian PDES engine. We validate PPT-AMMP on four standard computational physics benchmarks and present a use case of hardware parameter sensitivity analysis to identify bottleneck hardware resources on different code inputs. 
We further extend PPT-AMMP to predict the performance of a scientific application code, namely, the radiation transport mini-app SNAP. To this end, we analyze multi-variate regression models that accurately predict the reuse profiles and the basic block counts. We validate predicted SNAP runtimes against actual measured times.<\/jats:p>","DOI":"10.1145\/3450264","type":"journal-article","created":{"date-parts":[[2021,4,23]],"date-time":"2021-04-23T16:40:24Z","timestamp":1619196024000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Machine Learning\u2013enabled Scalable Performance Prediction of Scientific Codes"],"prefix":"10.1145","volume":"31","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6223-8570","authenticated-orcid":false,"given":"Gopinath","family":"Chennupati","sequence":"first","affiliation":[{"name":"Los Alamos National Laboratory, NM"}]},{"given":"Nandakishore","family":"Santhi","sequence":"additional","affiliation":[{"name":"Los Alamos National Laboratory, NM"}]},{"given":"Phill","family":"Romero","sequence":"additional","affiliation":[{"name":"Los Alamos National Laboratory, NM"}]},{"given":"Stephan","family":"Eidenbenz","sequence":"additional","affiliation":[{"name":"Los Alamos National Laboratory, NM"}]}],"member":"320","published-online":{"date-parts":[[2021,4,23]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Instruction Tables: Lists of Instruction Latencies, Throughputs and Micro-operation Breakdowns for Intel, AMD and VIA CPUs","author":"Agner Fog","year":"2016","unstructured":"Fog Agner . 2016 . Instruction Tables: Lists of Instruction Latencies, Throughputs and Micro-operation Breakdowns for Intel, AMD and VIA CPUs . Technical University of Denmark , Copenhagen, Denmark . Fog Agner. 2016. Instruction Tables: Lists of Instruction Latencies, Throughputs and Micro-operation Breakdowns for Intel, AMD and VIA CPUs. 
Technical University of Denmark, Copenhagen, Denmark."},{"key":"e_1_2_1_2_1","first-page":"1","article-title":"A brief history of HPC simulation and future challenges. In Proceedings of the Winter Simulation Conference (WSC\u201917)","volume":"27","author":"Ahmed Kishwar","year":"2017","unstructured":"Kishwar Ahmed , Jason Liu , Abdel-Hameed Badawy , and Stephan Eidenbenz . 2017 . A brief history of HPC simulation and future challenges. In Proceedings of the Winter Simulation Conference (WSC\u201917) . IEEE , 27 : 1 -- 27 :12. Kishwar Ahmed, Jason Liu, Abdel-Hameed Badawy, and Stephan Eidenbenz. 2017. A brief history of HPC simulation and future challenges. In Proceedings of the Winter Simulation Conference (WSC\u201917). IEEE, 27:1--27:12.","journal-title":"IEEE"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2901378.2901396"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/215399.215427"},{"key":"e_1_2_1_5_1","first-page":"1","article-title":"Fast, accurate, and scalable memory modeling of GPGPUs using reuse profiles","volume":"31","author":"Arafa Yehia","year":"2020","unstructured":"Yehia Arafa , Abdel-Hameed A. Badawy , Gopinath Chennupati , Atanu Barai , Nandakishore Santhi , and Stephan J. Eidenbenz . 2020 . Fast, accurate, and scalable memory modeling of GPGPUs using reuse profiles . In Proceedings of the International Conference on Supercomputing. ACM , 31 : 1 \u2013 31 :12. Yehia Arafa, Abdel-Hameed A. Badawy, Gopinath Chennupati, Atanu Barai, Nandakishore Santhi, and Stephan J. Eidenbenz. 2020. Fast, accurate, and scalable memory modeling of GPGPUs using reuse profiles. In Proceedings of the International Conference on Supercomputing. ACM, 31:1\u201331:12.","journal-title":"Proceedings of the International Conference on Supercomputing. 
ACM"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the 6th International Symposium on Memory Systems (MEMSYS\u201920)","author":"Barai Atanu","unstructured":"Atanu Barai , Gopinath Chennupati , Nandakishore Santhi , Abdel-Hameed A. Badawy , Yehia Arafa , and Stephan J. Eidenbenz . 2020. PPT-SASMM: Scalable analytical shared memory model . In Proceedings of the 6th International Symposium on Memory Systems (MEMSYS\u201920) . ACM. https:\/\/doi.org\/10.1145\/3422575.3422806. Atanu Barai, Gopinath Chennupati, Nandakishore Santhi, Abdel-Hameed A. Badawy, Yehia Arafa, and Stephan J. Eidenbenz. 2020. PPT-SASMM: Scalable analytical shared memory model. In Proceedings of the 6th International Symposium on Memory Systems (MEMSYS\u201920). ACM. https:\/\/doi.org\/10.1145\/3422575.3422806."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2012.71"},{"key":"e_1_2_1_8_1","volume-title":"IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201904)","author":"Berg E.","unstructured":"E. Berg and E. Hagersten . 2004. StatCache: A probabilistic approach to efficient and accurate data locality analysis . In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201904) . 20--27. E. Berg and E. Hagersten. 2004. StatCache: A probabilistic approach to efficient and accurate data locality analysis. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201904). 
20--27."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3309684"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3064911.3064923"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2014.07.003"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 17th Annual International Conference on Supercomputing (ICS\u201903)","author":"Ca\u00dfcaval Calin","unstructured":"Calin Ca\u00dfcaval and David A. Padua . 2003. Estimating cache misses and locality using stack distances . In Proceedings of the 17th Annual International Conference on Supercomputing (ICS\u201903) . ACM, 150--159. Calin Ca\u00dfcaval and David A. Padua. 2003. Estimating cache misses and locality using stack distances. In Proceedings of the 17th Annual International Conference on Supercomputing (ICS\u201903). ACM, 150--159."},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the Companion Publication of the Annual Conference on Genetic and Evolutionary Computation. 1353--1360","author":"Chennupati Gopinath","unstructured":"Gopinath Chennupati , R. Muhammad Atif Azad, and Conor Ryan. 2014. Predict the performance of GE with an ACO-based machine learning algorithm . In Proceedings of the Companion Publication of the Annual Conference on Genetic and Evolutionary Computation. 1353--1360 . Gopinath Chennupati, R. Muhammad Atif Azad, and Conor Ryan. 2014. Predict the performance of GE with an ACO-based machine learning algorithm. In Proceedings of the Companion Publication of the Annual Conference on Genetic and Evolutionary Computation. 1353--1360."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/WSC.2018.8632406"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 8th International Workshop on High Performance Computing Systems. 
Performance Modeling, Benchmarking, and Simulation (PMBS\u201917)","author":"Chennupati Gopinath","year":"2017","unstructured":"Gopinath Chennupati , Nandakishore Santhi , Robert Bird , Sunil Thulasidasan , Abdel-Hameed A. Badawy , Satyajayant Misra , and Stephan Eidenbenz . 2017 a. A scalable analytical memory model for CPU performance prediction . In Proceedings of the 8th International Workshop on High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation (PMBS\u201917) , Stephen Jarvis et al. (Ed.). Denver, CO, 114--135. Gopinath Chennupati, Nandakishore Santhi, Robert Bird, Sunil Thulasidasan, Abdel-Hameed A. Badawy, Satyajayant Misra, and Stephan Eidenbenz. 2017a. A scalable analytical memory model for CPU performance prediction. In Proceedings of the 8th International Workshop on High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation (PMBS\u201917), Stephen Jarvis et al. (Ed.). Denver, CO, 114--135."},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the Winter Simulation Conference (WSC\u201917)","author":"Chennupati G.","unstructured":"G. Chennupati , N. Santhi , S. Eidenbenz , and S. Thulasidasan . 2017. An analytical memory hierarchy model for performance prediction . In Proceedings of the Winter Simulation Conference (WSC\u201917) . IEEE, 908--919. G. Chennupati, N. Santhi, S. Eidenbenz, and S. Thulasidasan. 2017. An analytical memory hierarchy model for performance prediction. In Proceedings of the Winter Simulation Conference (WSC\u201917). IEEE, 908--919."},{"key":"e_1_2_1_19_1","volume-title":"Massimiliano Rosa, Richard James Zamora, Eun Jung Park, Balasubramanya T. Nadiga, Jason Liu, Kishwar Ahmed, and Mohammad Abu Obaida.","author":"Chennupati Gopinath","year":"2017","unstructured":"Gopinath Chennupati , Nandakishore Santhi , Stephan Eidenbenz , Robert Joseph Zerr , Massimiliano Rosa, Richard James Zamora, Eun Jung Park, Balasubramanya T. 
Nadiga, Jason Liu, Kishwar Ahmed, and Mohammad Abu Obaida. 2017 c. Performance Prediction Toolkit (PPT). Los Alamos National Laboratory (LANL). Retrieved from https:\/\/github.com\/lanl\/PPT. Gopinath Chennupati, Nandakishore Santhi, Stephan Eidenbenz, Robert Joseph Zerr, Massimiliano Rosa, Richard James Zamora, Eun Jung Park, Balasubramanya T. Nadiga, Jason Liu, Kishwar Ahmed, and Mohammad Abu Obaida. 2017c. Performance Prediction Toolkit (PPT). Los Alamos National Laboratory (LANL). Retrieved from https:\/\/github.com\/lanl\/PPT."},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the Workshop on Emerging Supercomputing Technologies","volume":"2011","author":"Cope Jason","year":"2011","unstructured":"Jason Cope , Ning Liu , Sam Lang , Phil Carns , Chris Carothers , and Robert Ross . 2011 . Codes: Enabling co-design of multilayer exascale storage architectures . In Proceedings of the Workshop on Emerging Supercomputing Technologies , Vol. 2011 . Jason Cope, Ning Liu, Sam Lang, Phil Carns, Chris Carothers, and Robert Ross. 2011. Codes: Enabling co-design of multilayer exascale storage architectures. In Proceedings of the Workshop on Emerging Supercomputing Technologies, Vol. 2011."},{"key":"e_1_2_1_21_1","volume-title":"Eunice Santos, Ramesh Subramonian, and Thorsten Von Eicken.","author":"Culler David","year":"1993","unstructured":"David Culler , Richard Karp , David Patterson , Abhijit Sahay , Klaus Erik Schauser , Eunice Santos, Ramesh Subramonian, and Thorsten Von Eicken. 1993 . LogP: Towards a Realistic Model of Parallel Computation. Vol. 28 . ACM. David Culler, Richard Karp, David Patterson, Abhijit Sahay, Klaus Erik Schauser, Eunice Santos, Ramesh Subramonian, and Thorsten Von Eicken. 1993. LogP: Towards a Realistic Model of Parallel Computation. Vol. 28. 
ACM."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2009.385"},{"key":"e_1_2_1_23_1","first-page":"3537","article-title":"Analytical processor performance and power modeling using micro\u2013architecture independent characteristics","volume":"65","author":"den Steen S. Van","year":"2016","unstructured":"S. Van den Steen , S. Eyerman , S. De Pestel , M. Mechri , T. E. Carlson , D. Black-Schaffer , E. Hagersten , and L. Eeckhout . 2016 . Analytical processor performance and power modeling using micro\u2013architecture independent characteristics . IEEE Trans. Comput. 65 , 12 (2016), 3537 -- 3551 . S. Van den Steen, S. Eyerman, S. De Pestel, M. Mechri, T. E. Carlson, D. Black-Schaffer, E. Hagersten, and L. Eeckhout. 2016. Analytical processor performance and power modeling using micro\u2013architecture independent characteristics. IEEE Trans. Comput. 65, 12 (2016), 3537--3551.","journal-title":"IEEE Trans. Comput."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/780822.781159"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201900)","author":"Eeckhout L.","unstructured":"L. Eeckhout , K. de Bosschere , and H. Neefs . 2000. Performance analysis through synthetic trace generation . In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201900) . IEEE, Washington, DC, 1--6. L. Eeckhout, K. de Bosschere, and H. Neefs. 2000. Performance analysis through synthetic trace generation. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS\u201900). IEEE, Washington, DC, 1--6."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065895.1065906"},{"key":"e_1_2_1_27_1","volume-title":"Keasler","author":"Hornung Richard D.","year":"2014","unstructured":"Richard D. Hornung and Jeffrey A . Keasler . 2014 . 
The RAJA Portability Layer: Overview and Status. Technical Report. Lawrence Livermore National Lab. (LLNL), Livermore, CA. Richard D. Hornung and Jeffrey A. Keasler. 2014. The RAJA Portability Layer: Overview and Status. Technical Report. Lawrence Livermore National Lab. (LLNL), Livermore, CA."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-11970-5_15"},{"key":"e_1_2_1_29_1","volume-title":"SNAP: SN (Discrete Ordinates) Application Proxy. Los Alamos National Laboratory (LANL).","author":"Joe Zerr","year":"2015","unstructured":"Zerr Joe and Baker Randal . 2015 . SNAP: SN (Discrete Ordinates) Application Proxy. Los Alamos National Laboratory (LANL). Retrieved from https:\/\/github.com\/lanl\/SNAP. Zerr Joe and Baker Randal. 2015. SNAP: SN (Discrete Ordinates) Application Proxy. Los Alamos National Laboratory (LANL). Retrieved from https:\/\/github.com\/lanl\/SNAP."},{"key":"e_1_2_1_30_1","volume-title":"Genetic Programming: On the Programming of Computers by Means of Natural Selection","author":"Koza John R.","year":"1992","unstructured":"John R. Koza . 1992 . Genetic Programming: On the Programming of Computers by Means of Natural Selection . MIT Press , Cambridge, MA . John R. Koza. 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 185--194","author":"Benjamin","unstructured":"Benjamin C. Lee and David M. Brooks. 2006. Accurate and efficient regression modeling for microarchitectural performance and power prediction . In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 185--194 . Benjamin C. Lee and David M. Brooks. 2006. Accurate and efficient regression modeling for microarchitectural performance and power prediction. 
In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 185--194."},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the 29th ACM on International Conference on Supercomputing (ICS\u201915)","author":"Lee Seyong","unstructured":"Seyong Lee , Jeremy S. Meredith , and Jeffrey S. Vetter . 2015. COMPASS: A framework for automated performance modeling and prediction . In Proceedings of the 29th ACM on International Conference on Supercomputing (ICS\u201915) . ACM, 405--414. Seyong Lee, Jeremy S. Meredith, and Jeffrey S. Vetter. 2015. COMPASS: A framework for automated performance modeling and prediction. In Proceedings of the 29th ACM on International Conference on Supercomputing (ICS\u201915). ACM, 405--414."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1147\/sj.92.0078"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the Department of Defense HPCMP Users Group Conference.","author":"Mucci Philip J.","year":"1999","unstructured":"Philip J. Mucci , Shirley Browne , Christine Deane , and George Ho . 1999 . PAPI: A portable interface to hardware performance counters . Proceedings of the Department of Defense HPCMP Users Group Conference. Philip J. Mucci, Shirley Browne, Christine Deane, and George Ho. 1999. PAPI: A portable interface to hardware performance counters. 
Proceedings of the Department of Defense HPCMP Users Group Conference."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12293-018-0274-5"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2012.117"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3200921.3200937"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2013.6704676"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024724.2024954"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0129626400000214"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1964218.1964225"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/WSC.2015.7408405"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201910)","author":"Schuff Derek L.","unstructured":"Derek L. Schuff , Milind Kulkarni , and Vijay S. Pai . 2010. Accelerating multicore reuse distance analysis with sampling and parallelization . In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201910) . ACM, 53--64. Derek L. Schuff, Milind Kulkarni, and Vijay S. Pai. 2010. Accelerating multicore reuse distance analysis with sampling and parallelization. In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201910). ACM, 53--64."},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201910)","author":"Schuff Derek L.","unstructured":"Derek L. Schuff , Milind Kulkarni , and Vijay S. Pai . 2010. Accelerating multicore reuse distance analysis with sampling and parallelization . In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201910) . ACM, 53--64. 
Derek L. Schuff, Milind Kulkarni, and Vijay S. Pai. 2010. Accelerating multicore reuse distance analysis with sampling and parallelization. In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT\u201910). ACM, 53--64."},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum (IPDPSW\u201910)","author":"Schuff Derek L.","unstructured":"Derek L. Schuff , Benjamin S. Parsons , and Vijay S. Pai . 2010. Multicore-aware reuse distance analysis . In Proceedings of the International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum (IPDPSW\u201910) . IEEE, 1--8. Derek L. Schuff, Benjamin S. Parsons, and Vijay S. Pai. 2010. Multicore-aware reuse distance analysis. In Proceedings of the International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum (IPDPSW\u201910). IEEE, 1--8."},{"key":"e_1_2_1_46_1","volume-title":"Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. 84","author":"Kyle","unstructured":"Kyle L. Spafford and Jeffrey S. Vetter. 2012. Aspen: A domain specific language for performance modeling . In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. 84 . Kyle L. Spafford and Jeffrey S. Vetter. 2012. Aspen: A domain specific language for performance modeling. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. 84."},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of the 28th ACM International Conference on Supercomputing. 221--230","author":"Nathan","unstructured":"Nathan R. Tallent and Adolfy Hoisie. 2014. Palm: Easing the burden of analytical performance modeling . In Proceedings of the 28th ACM International Conference on Supercomputing. 221--230 . Nathan R. Tallent and Adolfy Hoisie. 2014. 
Palm: Easing the burden of analytical performance modeling. In Proceedings of the 28th ACM International Conference on Supercomputing. 221--230."},{"key":"e_1_2_1_48_1","volume-title":"Advances in Neural Information Processing Systems","author":"Trask Andrew","unstructured":"Andrew Trask , Felix Hill , Scott E. Reed , Jack Rae , Chris Dyer , and Phil Blunsom . 2018. Neural arithmetic logic units . In Advances in Neural Information Processing Systems . MIT Press , 8035--8044. Andrew Trask, Felix Hill, Scott E. Reed, Jack Rae, Chris Dyer, and Phil Blunsom. 2018. Neural arithmetic logic units. In Advances in Neural Information Processing Systems. MIT Press, 8035--8044."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/TEVC.2008.926486"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.2172\/1407078"},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of the 2012 45th Annual IEEE\/ACM International Symposium on Microarchitecture. IEEE, 413--424","author":"Wu Weidan","unstructured":"Weidan Wu and Benjamin C. Lee . 2012. Inferred models for dynamic and sparse hardware-software spaces . In Proceedings of the 2012 45th Annual IEEE\/ACM International Symposium on Microarchitecture. IEEE, 413--424 . Weidan Wu and Benjamin C. Lee. 2012. Inferred models for dynamic and sparse hardware-software spaces. In Proceedings of the 2012 45th Annual IEEE\/ACM International Symposium on Microarchitecture. 
IEEE, 413--424."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1177\/0037549716674806"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2003.1238004"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/1552309.1552310"}],"container-title":["ACM Transactions on Modeling and Computer Simulation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3450264","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3450264","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:46:58Z","timestamp":1750193218000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3450264"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,23]]},"references-count":53,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,4,30]]}},"alternative-id":["10.1145\/3450264"],"URL":"https:\/\/doi.org\/10.1145\/3450264","relation":{},"ISSN":["1049-3301","1558-1195"],"issn-type":[{"value":"1049-3301","type":"print"},{"value":"1558-1195","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,4,23]]},"assertion":[{"value":"2020-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-04-23","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}