{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T18:30:28Z","timestamp":1775068228773,"version":"3.50.1"},"reference-count":105,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2020,2,6]],"date-time":"2020-02-06T00:00:00Z","timestamp":1580947200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","award":["PTDC\/EEI-HAC\/30848\/2017"],"award-info":[{"award-number":["PTDC\/EEI-HAC\/30848\/2017"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2021,1,31]]},"abstract":"<jats:p>The breakdown of Dennard scaling has resulted in a decade-long stall of the maximum operating clock frequencies of processors. To mitigate this issue, computing shifted to multi-core devices. This introduced the need for programming flows and tools that facilitate the expression of workload parallelism at high abstraction levels. However, not all workloads are easily parallelizable, and the minor improvements to processor cores have not significantly increased single-threaded performance. Simultaneously, Instruction Level Parallelism in applications is considerably underexplored. This article reviews notable approaches that focus on exploiting this potential parallelism via automatic generation of specialized hardware from binary code. Although research on this topic spans over more than 20 years, automatic acceleration of software via translation to hardware has gained new importance with the recent trend toward reconfigurable heterogeneous platforms. We characterize this kind of binary acceleration approach and the accelerator architectures on which it relies. We summarize notable state-of-the-art approaches individually and present a taxonomy and comparison. Performance gains from 2.6\u00d7 to 5.6\u00d7 are reported, mostly considering bare-metal embedded applications, along with power consumption reductions between 1.3\u00d7 and 3.9\u00d7. We believe the methodologies and results achievable by automatic hardware generation approaches are promising in the context of emergent reconfigurable devices.<\/jats:p>","DOI":"10.1145\/3369764","type":"journal-article","created":{"date-parts":[[2020,2,6]],"date-time":"2020-02-06T21:54:04Z","timestamp":1581026044000},"page":"1-36","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["Improving Performance and Energy Consumption in Embedded Systems via Binary Acceleration: A Survey"],"prefix":"10.1145","volume":"53","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5547-0323","authenticated-orcid":false,"given":"Nuno","family":"Paulino","sequence":"first","affiliation":[{"name":"INESC TEC and Faculty of Engineering of the University of Porto"}]},{"given":"Jo\u00e3o Canas","family":"Ferreira","sequence":"additional","affiliation":[{"name":"INESC TEC and Faculty of Engineering of the University of Porto"}]},{"given":"Jo\u00e3o M. P.","family":"Cardoso","sequence":"additional","affiliation":[{"name":"INESC TEC and Faculty of Engineering of the University of Porto"}]}],"member":"320","published-online":{"date-parts":[[2020,2,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.825694"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the Southern Conference on Programmable Logic. 51--56","author":"Alves Jos\u00e9 Carlos","unstructured":"Jos\u00e9 Carlos Alves and Pedro C. Diniz . 2011. Custom FPGA-based micro-architecture for streaming computing . In Proceedings of the Southern Conference on Programmable Logic. 51--56 . Jos\u00e9 Carlos Alves and Pedro C. Diniz. 2011. Custom FPGA-based micro-architecture for streaming computing. In Proceedings of the Southern Conference on Programmable Logic. 51--56."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2015.7293940"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1735688.1735706"},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the Workshop on Application Specific Processors, Held in Conjunction with the International Symposium on Microarchitecture.","author":"Bansal Nikhil","year":"2003","unstructured":"Nikhil Bansal , Sumit Gupta , Nikil Dutt , and Alexandru Nicolau . 2003 . Analysis of the performance of coarse-grain reconfigurable architectures with different processing element configurations . In Proceedings of the Workshop on Application Specific Processors, Held in Conjunction with the International Symposium on Microarchitecture. Nikhil Bansal, Sumit Gupta, Nikil Dutt, and Alexandru Nicolau. 2003. Analysis of the performance of coarse-grain reconfigurable architectures with different processing element configurations. In Proceedings of the Workshop on Application Specific Processors, Held in Conjunction with the International Symposium on Microarchitecture."},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the 36th Annual IEEE\/ACM International Symposium on Microarchitecture 2003 (MICRO-36)","author":"Baraz L.","unstructured":"L. Baraz , T. Devor , O. Etzion , S. Goldenberg , A. Skaletsky , Yun Wang , and Y. Zemach . 2003. IA-32 execution layer: A two-phase dynamic translator designed to support IA-32 applications on Itanium\/spl reg\/-based systems . In Proceedings of the 36th Annual IEEE\/ACM International Symposium on Microarchitecture 2003 (MICRO-36) . 191--201. L. Baraz, T. Devor, O. Etzion, S. Goldenberg, A. Skaletsky, Yun Wang, and Y. Zemach. 2003. IA-32 execution layer: A two-phase dynamic translator designed to support IA-32 applications on Itanium\/spl reg\/-based systems. In Proceedings of the 36th Annual IEEE\/ACM International Symposium on Microarchitecture 2003 (MICRO-36). 191--201."},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the 2011 NASA\/ESA Conference on Adaptive Hardware and Systems (AHS\u201911)","author":"Bauer L.","unstructured":"L. Bauer , M. Shafique , and J. Henkel . 2011. Concepts, architectures, and run-time systems for efficient and adaptive reconfigurable processors . In Proceedings of the 2011 NASA\/ESA Conference on Adaptive Hardware and Systems (AHS\u201911) . 80--87. L. Bauer, M. Shafique, and J. Henkel. 2011. Concepts, architectures, and run-time systems for efficient and adaptive reconfigurable processors. In Proceedings of the 2011 NASA\/ESA Conference on Adaptive Hardware and Systems (AHS\u201911). 80--87."},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the Design, Automation and Test in Europe Conference and Exhibition. 1208--1213","author":"Beck Antonio Carlos S.","year":"2008","unstructured":"Antonio Carlos S. Beck , Mateus B. Rutzig , Georgi Gaydadjiev , and Luigi Carro . 2008 . Transparent reconfigurable acceleration for heterogeneous embedded applications . In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition. 1208--1213 . Antonio Carlos S. Beck, Mateus B. Rutzig, Georgi Gaydadjiev, and Luigi Carro. 2008. Transparent reconfigurable acceleration for heterogeneous embedded applications. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition. 1208--1213."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1454115.1454128"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the International Conference on Field-Programmable Technology. 437--440","author":"Bispo Jo\u00e3o","unstructured":"Jo\u00e3o Bispo and Jo\u00e3o M. P. Cardoso . 2010. On identifying and optimizing instruction sequences for dynamic compilation . In Proceedings of the International Conference on Field-Programmable Technology. 437--440 . Jo\u00e3o Bispo and Jo\u00e3o M. P. Cardoso. 2010. On identifying and optimizing instruction sequences for dynamic compilation. In Proceedings of the International Conference on Field-Programmable Technology. 437--440."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2501654.2501655"},{"key":"e_1_2_1_12_1","volume-title":"Article 13 (June","author":"Cardoso Jo\u00e3o M. P.","year":"2010","unstructured":"Jo\u00e3o M. P. Cardoso , Pedro C. Diniz , and Markus Weinhardt . 2010. Compiling for reconfigurable computing: A survey. ACM Comput. Surv. 42, 4 , Article 13 (June 2010 ), 65 pages. Jo\u00e3o M. P. Cardoso, Pedro C. Diniz, and Markus Weinhardt. 2010. Compiling for reconfigurable computing: A survey. ACM Comput. Surv. 42, 4, Article 13 (June 2010), 65 pages."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD.2013.6691166"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2012.35"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.2197\/ipsjtsldm.4.31"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.825697"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2005.9"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2008.33"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2004.5"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/508352.508353"},{"key":"e_1_2_1_21_1","unstructured":"Standard Performance Evaluation Corporation. 2006. SPEC CPU Benchmark Suites. www.spec.org\/cpu.  Standard Performance Evaluation Corporation. 2006. SPEC CPU Benchmark Suites. www.spec.org\/cpu."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/786453.786708"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2133806.2133822"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2866573"},{"key":"e_1_2_1_25_1","volume-title":"Transformations of high-level synthesis codes for high-performance computing. CoRR abs\/1805.08288","author":"de Fine Licht Johannes","year":"2018","unstructured":"Johannes de Fine Licht , Simon Meierhans , and Torsten Hoefler . 2018. Transformations of high-level synthesis codes for high-performance computing. CoRR abs\/1805.08288 ( 2018 ). Johannes de Fine Licht, Simon Meierhans, and Torsten Hoefler. 2018. Transformations of high-level synthesis codes for high-performance computing. CoRR abs\/1805.08288 (2018)."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.1974.1050511"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.931892"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2012.17"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2324876.2324879"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.sysarc.2012.10.001"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2009.4798266"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1356058.1356075"},{"key":"e_1_2_1_33_1","volume-title":"2014 23rd International Conference on Parallel Architecture and Compilation Techniques (PACT\u201914)","author":"Fatehi E.","unstructured":"E. Fatehi and P. V. Gratz . 2014. ILP and TLP in shared memory applications: A limit study . In 2014 23rd International Conference on Parallel Architecture and Compilation Techniques (PACT\u201914) . 113--125. E. Fatehi and P. V. Gratz. 2014. ILP and TLP in shared memory applications: A limit study. In 2014 23rd International Conference on Parallel Architecture and Compilation Techniques (PACT\u201914). 113--125."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/SAMOS.2014.6893197"},{"key":"e_1_2_1_35_1","volume-title":"-L. Lin","author":"Gajski Daniel D.","year":"1992","unstructured":"Daniel D. Gajski , Nikil D. Dutt , Allen C.-H. Wu , and Steve Y . -L. Lin . 1992 . High-level Synthesis : Introduction to Chip and System Design. Kluwer Academic Publishers , Norwell, MA. Daniel D. Gajski, Nikil D. Dutt, Allen C.-H. Wu, and Steve Y.-L. Lin. 1992. High-level Synthesis: Introduction to Chip and System Design. Kluwer Academic Publishers, Norwell, MA."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1968502.1968509"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.1999.765937"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2005.165"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2012.51"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2011.5749755"},{"key":"e_1_2_1_41_1","volume-title":"Retrieved","author":"The IMPACT Research Group","year":"2012","unstructured":"The IMPACT Research Group . 2012 . Parboil benchmark suite . Retrieved January 22, 2019 from http:\/\/impact.crhc.illinois.edu\/parboil\/parboil.aspx. The IMPACT Research Group. 2012. Parboil benchmark suite. Retrieved January 22, 2019 from http:\/\/impact.crhc.illinois.edu\/parboil\/parboil.aspx."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.825696"},{"key":"e_1_2_1_43_1","volume-title":"Retrieved","author":"Gupta P. K.","year":"2016","unstructured":"P. K. Gupta . 2016 . Accelerating Datacenter Workloads. Keynote at FPL2016 . Retrieved July 26, 2019 from https:\/\/www.fpl2016.org\/. P. K. Gupta. 2016. Accelerating Datacenter Workloads. Keynote at FPL2016. Retrieved July 26, 2019 from https:\/\/www.fpl2016.org\/."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155623"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/370155.370535"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01205185"},{"key":"e_1_2_1_47_1","volume-title":"Retrieved","author":"Instruments Texas","year":"2002","unstructured":"Texas Instruments . 2002 . TMS320C6000 Image Library . Retrieved January 22, 2019 from www.ti.com\/tool\/sprc264. Texas Instruments. 2002. TMS320C6000 Image Library. Retrieved January 22, 2019 from www.ti.com\/tool\/sprc264."},{"key":"e_1_2_1_48_1","volume-title":"Retrieved","year":"2006","unstructured":"Intel. 2006 . Intel Processors and FPGAs\u2014Better Together . Retrieved January 22, 2019 from https:\/\/itpeernetwork.intel.com\/intel-processors-fpga-better-together. Intel. 2006. Intel Processors and FPGAs\u2014Better Together. Retrieved January 22, 2019 from https:\/\/itpeernetwork.intel.com\/intel-processors-fpga-better-together."},{"key":"e_1_2_1_49_1","volume-title":"Retrieved","year":"2019","unstructured":"Intel. 2019 . Intel FPGA SDK for OpenCL Software Technology . Retrieved July 26, 2019 from https:\/\/www.intel.com\/content\/www\/us\/en\/software\/programmable\/sdk-for-opencl\/overview.html. Intel. 2019. Intel FPGA SDK for OpenCL Software Technology. Retrieved July 26, 2019 from https:\/\/www.intel.com\/content\/www\/us\/en\/software\/programmable\/sdk-for-opencl\/overview.html."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISOCC.2011.6138785"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2003695.2003708"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/53990.54022"},{"key":"e_1_2_1_53_1","volume-title":"Proceedings of the 2010 IEEE International Symposium on Performance Analysis of Systems Software (ISPASS\u201910)","author":"Laurenzano M. A.","unstructured":"M. A. Laurenzano , M. M. Tikir , L. Carrington , and A. Snavely . 2010. PEBIL: Efficient static binary instrumentation for Linux . In Proceedings of the 2010 IEEE International Symposium on Performance Analysis of Systems Software (ISPASS\u201910) . 175--183. M. A. Laurenzano, M. M. Tikir, L. Carrington, and A. Snavely. 2010. PEBIL: Efficient static binary instrumentation for Linux. In Proceedings of the 2010 IEEE International Symposium on Performance Analysis of Systems Software (ISPASS\u201910). 175--183."},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the ACM\/IEEE International Symposium on Microarchitecture. 330--335","author":"Lee Chunho","unstructured":"Chunho Lee , Miodrag Potkonjak , and William H . Mangione-Smith. 1997. MediaBench: A tool for evaluating and synthesizing multimedia and communicatons systems . In Proceedings of the ACM\/IEEE International Symposium on Microarchitecture. 330--335 . Chunho Lee, Miodrag Potkonjak, and William H. Mangione-Smith. 1997. MediaBench: A tool for evaluating and synthesizing multimedia and communicatons systems. In Proceedings of the ACM\/IEEE International Symposium on Microarchitecture. 330--335."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.4018\/978-1-61350-116-0"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.micpro.2015.03.005"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/DATE.2004.1268892"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/1509288.1509294"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.micpro.2015.09.005"},{"key":"e_1_2_1_60_1","volume-title":"Proceedings of the ACM\/IEEE International Symposium on Low Power Electronics and Design. 241--243","author":"Malik Afzal","unstructured":"Afzal Malik , B. Moyer , and D. Cermak . 2000. A lower power unified cache architecture providing power and performance flexibility . In Proceedings of the ACM\/IEEE International Symposium on Low Power Electronics and Design. 241--243 . Afzal Malik, B. Moyer, and D. Cermak. 2000. A lower power unified cache architecture providing power and performance flexibility. In Proceedings of the ACM\/IEEE International Symposium on Low Power Electronics and Design. 241--243."},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSPC.2007.4728256"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/VLSISOC.2007.4402489"},{"key":"e_1_2_1_63_1","volume-title":"Proceedings of the IEEE International Parallel and Distributed Processing Symposium. 1--8.","author":"Mehta Gayatri","unstructured":"Gayatri Mehta , Justin Slander , Mustafa Baz , Brady Hunsaker , and Alex K. Jones . 2007. Interconnect customization for a coarse-grained reconfigurable fabric . In Proceedings of the IEEE International Parallel and Distributed Processing Symposium. 1--8. Gayatri Mehta, Justin Slander, Mustafa Baz, Brady Hunsaker, and Alex K. Jones. 2007. Interconnect customization for a coarse-grained reconfigurable fabric. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium. 1--8."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/1273442.1250746"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2015.7293960"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/1365490.1365500"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-010-0505-0"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-008-0174-4"},{"key":"e_1_2_1_69_1","volume-title":"Retrieved","author":"University of Michigan.","year":"2013","unstructured":"University of Michigan. 2013 . MiBench benchmark suite . Retrieved January 22, 2019 from http:\/\/vhosts.eecs.umich.edu\/mibench. University of Michigan. 2013. MiBench benchmark suite. Retrieved January 22, 2019 from http:\/\/vhosts.eecs.umich.edu\/mibench."},{"key":"e_1_2_1_70_1","first-page":"4","article-title":"Binary acceleration using coarse-grained reconfigurable architecture","volume":"38","author":"Paek Jong Kyung","year":"2011","unstructured":"Jong Kyung Paek , Kiyoung Choi , and Jongeun Lee . 2011 . Binary acceleration using coarse-grained reconfigurable architecture . SIGARCH Comput. Arch. News 38 , 4 (Jan. 2011), 33--39. Jong Kyung Paek, Kiyoung Choi, and Jongeun Lee. 2011. Binary acceleration using coarse-grained reconfigurable architecture. SIGARCH Comput. Arch. News 38, 4 (Jan. 2011), 33--39.","journal-title":"SIGARCH Comput. Arch. News"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669160"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/2629468"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2016.2573640"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2018.2874079"},{"key":"e_1_2_1_75_1","volume-title":"Retrieved","author":"Peters Tim","year":"1992","unstructured":"Tim Peters . 1992 . Livermore loops coded in C . Retrieved January 22, 2019 from www.netlib.org\/benchmark\/livermorec. Tim Peters. 1992. Livermore loops coded in C. Retrieved January 22, 2019 from www.netlib.org\/benchmark\/livermorec."},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/192724.192731"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.23919\/DATE.2017.7927147"},{"key":"e_1_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2864288"},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.7873\/DATE.2013.317"},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1155\/2011\/546962"},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897164"},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACRIM.2009.5291237"},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.859540"},{"key":"e_1_2_1_84_1","first-page":"3","article-title":"Thread warping: Dynamic and transparent synthesis of thread accelerators","volume":"16","author":"Stitt Greg","year":"2011","unstructured":"Greg Stitt and Frank Vahid . 2011 . Thread warping: Dynamic and transparent synthesis of thread accelerators . ACM Trans. Des. Autom. Electr. Syst. 16 , 3 (Jun. 2011). Greg Stitt and Frank Vahid. 2011. Thread warping: Dynamic and transparent synthesis of thread accelerators. ACM Trans. Des. Autom. Electr. Syst. 16, 3 (Jun. 2011).","journal-title":"ACM Trans. Des. Autom. Electr. Syst."},{"key":"e_1_2_1_85_1","first-page":"5","article-title":"Selective flexibility: Creating domain-specific reconfigurable arrays","volume":"32","author":"Stojilovi\u0107 Mirjana","year":"2013","unstructured":"Mirjana Stojilovi\u0107 , David Novo , Lazar Saranovac , Philip Brisk , and Paolo Ienne . 2013 . Selective flexibility: Creating domain-specific reconfigurable arrays . IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst. 32 , 5 (May 2013), 681--694. Mirjana Stojilovi\u0107, David Novo, Lazar Saranovac, Philip Brisk, and Paolo Ienne. 2013. Selective flexibility: Creating domain-specific reconfigurable arrays. IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst. 32, 5 (May 2013), 681--694.","journal-title":"IEEE Trans. Comput.-Aid. Des. Integr. Circ. Syst."},{"key":"e_1_2_1_86_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2010.69"},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2017.2761740"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2013.90"},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2011.2182009"},{"key":"e_1_2_1_90_1","volume-title":"Retrieved","author":"EEMBC The Embedded Microprocessor Benchmark Consortium","year":"2015","unstructured":"EEMBC The Embedded Microprocessor Benchmark Consortium . 2015 . CoreMark-Pro . Retrieved January 22, 2019 from http:\/\/www.eembc.org. EEMBC The Embedded Microprocessor Benchmark Consortium. 2015. CoreMark-Pro. Retrieved January 22, 2019 from http:\/\/www.eembc.org."},{"key":"e_1_2_1_91_1","volume-title":"Retrieved","year":"2007","unstructured":"Trimaran. 2007 . An infrastructure for research in backend compilation and architecture exploration . Retrieved January 22, 2019 from www.trimaran.org. Trimaran. 2007. An infrastructure for research in backend compilation and architecture exploration. Retrieved January 22, 2019 from www.trimaran.org."},{"key":"e_1_2_1_92_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2008.240"},{"key":"e_1_2_1_93_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2009.5272293"},{"key":"e_1_2_1_94_1","volume-title":"The MOLEN -coded processor","author":"Vassiliadis Stamatis","unstructured":"Stamatis Vassiliadis , Stephan Wong , and Sorin Cot\u00f6fan\u0103 . 2001. The MOLEN -coded processor . In Field-Programmable Logic and Applications, Gordon Brebner and Roger Woods (Eds.). Springer , Berlin , 275--285. Stamatis Vassiliadis, Stephan Wong, and Sorin Cot\u00f6fan\u0103. 2001. The MOLEN -coded processor. In Field-Programmable Logic and Applications, Gordon Brebner and Roger Woods (Eds.). Springer, Berlin, 275--285."},{"key":"e_1_2_1_95_1","volume-title":"Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS\u201999)","author":"Wall David W.","year":"1999","unstructured":"David W. Wall . 1999 . Limits of instruction-level parallelism . In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS\u201999) . David W. Wall. 1999. Limits of instruction-level parallelism. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS\u201999)."},{"key":"e_1_2_1_96_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2897611"},{"key":"e_1_2_1_97_1","volume-title":"Article 49 (Jun.","author":"Wenzl Matthias","year":"2019","unstructured":"Matthias Wenzl , Georg Merzdovnik , Johanna Ullrich , and Edgar Weippl . 2019. From hack to elaborate technique -- A survey on binary rewriting. ACM Comput. Surv. 52, 3 , Article 49 (Jun. 2019 ), 37 pages. Matthias Wenzl, Georg Merzdovnik, Johanna Ullrich, and Edgar Weippl. 2019. From hack to elaborate technique -- A survey on binary rewriting. ACM Comput. Surv. 52, 3, Article 49 (Jun. 2019), 37 pages."},{"key":"e_1_2_1_98_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPT.2004.1393248"},{"key":"e_1_2_1_99_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2003.1193227"},{"key":"e_1_2_1_100_1","doi-asserted-by":"publisher","DOI":"10.1145\/223982.223990"},{"key":"e_1_2_1_101_1","volume-title":"Retrieved","year":"2019","unstructured":"Xilinx. 2019 . SDAccel Development Environment . Retrieved July 26, 2019 from https:\/\/www.xilinx.com\/products\/design-tools\/software-zone\/sdaccel.html. Xilinx. 2019. SDAccel Development Environment. Retrieved July 26, 2019 from https:\/\/www.xilinx.com\/products\/design-tools\/software-zone\/sdaccel.html."},{"key":"e_1_2_1_102_1","volume-title":"Retrieved","year":"2019","unstructured":"Xilinx. 2019 . Zynq UltraScale+ MPSoC . Retrieved July 26, 2019 from https:\/\/www.xilinx.com\/products\/silicon-devices\/soc\/zynq-ultrascale-mpsoc.html. Xilinx. 2019. Zynq UltraScale+ MPSoC. Retrieved July 26, 2019 from https:\/\/www.xilinx.com\/products\/silicon-devices\/soc\/zynq-ultrascale-mpsoc.html."},{"key":"e_1_2_1_103_1","doi-asserted-by":"publisher","DOI":"10.1145\/996566.996764"},{"key":"e_1_2_1_104_1","volume-title":"Proceedings of the 10th ACM SIGPLAN\/SIGOPS International Conference on Virtual Execution Environments (VEE\u201914)","author":"Zhang Mingwei","unstructured":"Mingwei Zhang , Rui Qiao , Niranjan Hasabnis , and R. Sekar . 2014. A platform for secure static binary instrumentation . In Proceedings of the 10th ACM SIGPLAN\/SIGOPS International Conference on Virtual Execution Environments (VEE\u201914) . 129--140. Mingwei Zhang, Rui Qiao, Niranjan Hasabnis, and R. Sekar. 2014. A platform for secure static binary instrumentation. In Proceedings of the 10th ACM SIGPLAN\/SIGOPS International Conference on Virtual Execution Environments (VEE\u201914). 129--140."},{"key":"e_1_2_1_105_1","volume-title":"Proceedings of the International Conference on Signal Processing Applications and Technology.","author":"Zivojnovic Vojin","year":"1994","unstructured":"Vojin Zivojnovic , Juan M. Velarde , Christian Schlager , and Heinrich Meyr . 1994 . DSPstone: A DSP-oriented benchmarking methodology . In Proceedings of the International Conference on Signal Processing Applications and Technology. Vojin Zivojnovic, Juan M. Velarde, Christian Schlager, and Heinrich Meyr. 1994. DSPstone: A DSP-oriented benchmarking methodology. In Proceedings of the International Conference on Signal Processing Applications and Technology."}],"container-title":["ACM Computing Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3369764","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3369764","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:41:08Z","timestamp":1750200068000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3369764"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2,6]]},"references-count":105,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1,31]]}},"alternative-id":["10.1145\/3369764"],"URL":"https:\/\/doi.org\/10.1145\/3369764","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"value":"0360-0300","type":"print"},{"value":"1557-7341","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,2,6]]},"assertion":[{"value":"2019-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-02-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}