{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T01:54:38Z","timestamp":1773194078942,"version":"3.50.1"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2016,6,14]],"date-time":"2016-06-14T00:00:00Z","timestamp":1465862400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"European Research Council under the European Community's Seventh Framework Programme","award":["FP7\/2007-2013"],"award-info":[{"award-number":["FP7\/2007-2013"]}]},{"name":"ERC","award":["259295"],"award-info":[{"award-number":["259295"]}]},{"DOI":"10.13039\/501100006280","name":"Spanish Ministry of Science and Technology","doi-asserted-by":"crossref","award":["TIN2015-65316-P"],"award-info":[{"award-number":["TIN2015-65316-P"]}],"id":[{"id":"10.13039\/501100006280","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002809","name":"Generalitat de Catalunya","doi-asserted-by":"crossref","award":["2014-SGR-1051 and 2014-SGR-1272"],"award-info":[{"award-number":["2014-SGR-1051 and 2014-SGR-1272"]}],"id":[{"id":"10.13039\/501100002809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Spanish Government under the Severo Ochoa program","award":["SEV-2015-0493"],"award-info":[{"award-number":["SEV-2015-0493"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2016,6,27]]},"abstract":"<jats:p>\n            The end of Dennard scaling leads to new research directions that try to cope with the utilization wall in modern chips, such as the design of specialized architectures. Processor customization utilizes transistors more efficiently, optimizing not only for performance but also for power. 
However, hardware specialization for each application is costly and impractical due to time-to-market constraints. Domain-specific specialization is an alternative that can increase hardware reutilization across applications that share similar computations. This article explores the specialization of low-power processors with custom instructions (CIs) that run on a specialized functional unit. We are the first, to our knowledge, to design CIs for an application domain\n            <jats:italic>and<\/jats:italic>\n            across basic blocks, selecting CIs that maximize both performance and energy efficiency improvements.\n          <\/jats:p>\n          <jats:p>We present the Merged Instructions Generator for Large Efficiency (MInGLE), an automated framework that identifies and selects CIs. Our framework analyzes large sequences of code (across basic blocks) to maximize acceleration potential while also performing partial matching across applications to optimize for reuse of the specialized hardware. To do this, we convert the code into a new canonical representation, the Merging Diagram, which represents the code\u2019s functionality instead of its structure. This is key to being able to find similarities across such large code sequences from different applications with different coding styles. Groups of potential CIs are clustered depending on their similarity score to effectively reduce the search space. Additionally, we create new CIs that cover not only whole-body loops but also fragments of the code to optimize hardware reutilization further. For a set of 11 applications from the media domain, our framework generates CIs that significantly improve the energy-delay product (EDP) and performance speedup. CIs with the highest utilization opportunities achieve an average EDP improvement of 3.8 \u00d7 compared to a baseline processor modeled after an Intel Atom. 
We demonstrate that we can efficiently accelerate a domain with partially matched CIs, and that their design time, from identification to selection, stays within tractable bounds.<\/jats:p>","DOI":"10.1145\/2898356","type":"journal-article","created":{"date-parts":[[2016,6,14]],"date-time":"2016-06-14T12:29:28Z","timestamp":1465907368000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["MInGLE"],"prefix":"10.1145","volume":"13","author":[{"given":"Cecilia","family":"Gonz\u00e1lez-\u00c1lvarez","sequence":"first","affiliation":[{"name":"Ghent University & Universitat Polit\u00e8cnica de Catalunya"}]},{"given":"Jennifer B.","family":"Sartor","sequence":"additional","affiliation":[{"name":"Ghent University & Vrije Universiteit Brussel"}]},{"given":"Carlos","family":"\u00c1lvarez","sequence":"additional","affiliation":[{"name":"Universitat Polit\u00e8cnica de Catalunya, Barcelona, Spain"}]},{"given":"Daniel","family":"Jim\u00e9nez-Gonz\u00e1lez","sequence":"additional","affiliation":[{"name":"Universitat Polit\u00e8cnica de Catalunya, Barcelona, Spain"}]},{"given":"Lieven","family":"Eeckhout","sequence":"additional","affiliation":[{"name":"Ghent University, Zwijnaarde, Belgium"}]}],"member":"320","published-online":{"date-parts":[[2016,6,14]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/VLSI.Design.2010.68"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2010.2090543"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASAP.2008.4580145"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:IJPP.0000004508.14594.b9"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1391469.1391486"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the Conference on Adaptive Hardware and Systems (AHS\u201911)","author":"Bauer L.","unstructured":"L. Bauer , M. Shafique , and J. 
Henkel . 2011. Concepts, architectures, and run-time systems for efficient and adaptive reconfigurable processors . In Proceedings of the Conference on Adaptive Hardware and Systems (AHS\u201911) . 80--87. L. Bauer, M. Shafique, and J. Henkel. 2011. Concepts, architectures, and run-time systems for efficient and adaptive reconfigurable processors. In Proceedings of the Conference on Adaptive Hardware and Systems (AHS\u201911). 80--87."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2012.6168949"},{"key":"e_1_2_1_8_1","unstructured":"G. Bradski. 2000. The OpenCV library. Dr. Dobb\u2019s Journal of Software Tools 20 11 120--126.  G. Bradski. 2000. The OpenCV library. Dr. Dobb\u2019s Journal of Software Tools 20 11 120--126."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.1986.1676819"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2629677"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/360276.360328"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2006.153"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2005.156"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/968280.968307"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.1974.1050511"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.micpro.2009.02.010"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541228.2555303"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2011.18"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2012.51"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155623"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/1128020.1128563"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2656106.2656114"},{"key":"e_1_2_1_23_1","first-page":"1","article-title":"Inte
l\u2019s tiny atom","volume":"040708","author":"Halfhill T. R.","year":"2008","unstructured":"T. R. Halfhill . 2008 . Intel\u2019s tiny atom . Microprocessor Report , 040708 , 1 -- 13 . T. R. Halfhill. 2008. Intel\u2019s tiny atom. Microprocessor Report, 040708, 1--13.","journal-title":"Microprocessor Report"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC\u201914)","author":"Huang H.","unstructured":"H. Huang , T. Kim , and Y. Hoskote . 2014. Edit distance based instruction merging technique to improve flexibility of custom instructions toward flexible accelerator design . In Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC\u201914) . 219--224. H. Huang, T. Kim, and Y. Hoskote. 2014. Edit distance based instruction merging technique to improve flexibility of custom instructions toward flexible accelerator design. In Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC\u201914). 219--224."},{"key":"e_1_2_1_25_1","unstructured":"IBM. 2014. ILOG CPLEX. Retrieved May 9 2016 from http:\/\/www-01.ibm.com\/software\/integration\/optimization\/cplex-optimizer\/.  IBM. 2014. ILOG CPLEX. Retrieved May 9 2016 from http:\/\/www-01.ibm.com\/software\/integration\/optimization\/cplex-optimizer\/."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.vlsi.2009.06.002"},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","unstructured":"K. Karuri and R. Leupers. 2011. A primer on ISA customization. In Application Analysis Tools for ASIP Design. Springer 93--109.  K. Karuri and R. Leupers. 2011. A primer on ISA customization. In Application Analysis Tools for ASIP Design. Springer 93--109.","DOI":"10.1007\/978-1-4419-8255-1_6"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the Conference on Computer Design: VLSI in Computers and Processors. 84--90","author":"Keutzer K.","unstructured":"K. Keutzer , S. Malik , and A. R. Newton . 
2002. From ASIC to ASIP: The next design discontinuity . In Proceedings of the Conference on Computer Design: VLSI in Computers and Processors. 84--90 . K. Keutzer, S. Malik, and A. R. Newton. 2002. From ASIC to ASIP: The next design discontinuity. In Proceedings of the Conference on Computer Design: VLSI in Computers and Processors. 84--90."},{"key":"e_1_2_1_29_1","unstructured":"D. Kroshko. 2015. OpenOpt: Free scientific-engineering software for mathematical modeling and optimization.  D. Kroshko. 2015. OpenOpt: Free scientific-engineering software for mathematical modeling and optimization."},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the Symposium on Code Generation and Optimization (CGO\u201904)","author":"Lattner C.","unstructured":"C. Lattner and V. Adve . 2004. LLVM: A compilation framework for lifelong program analysis and transformation . In Proceedings of the Symposium on Code Generation and Optimization (CGO\u201904) . IEEE, Los Alamitos, CA. C. Lattner and V. Adve. 2004. LLVM: A compilation framework for lifelong program analysis and transformation. In Proceedings of the Symposium on Code Generation and Optimization (CGO\u201904). IEEE, Los Alamitos, CA."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669172"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1629395.1629402"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2209285.2209289"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/581199.581203"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v053.i09"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1550987.1550989"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2005.855950"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2013.2282265"},{"key":"e_1_2_1_39_1","unstructured":"Sage Development Team. 2013. Sage Mathematics Software (Version 5.8). 
Available at http:\/\/www.sagemath.org.  Sage Development Team. 2013. Sage Mathematics Software (Version 5.8). Available at http:\/\/www.sagemath.org."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2012.2235127"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155640"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1289881.1289905"},{"key":"e_1_2_1_43_1","volume-title":"Vivado High-Level Synthesis. Retrieved","year":"2016","unstructured":"Xilinx. 2014. Vivado High-Level Synthesis. Retrieved May 9, 2016 , from http:\/\/www.xilinx.com\/products\/design-tools\/vivado\/integration\/esl-design.html. Xilinx. 2014. Vivado High-Level Synthesis. Retrieved May 9, 2016, from http:\/\/www.xilinx.com\/products\/design-tools\/vivado\/integration\/esl-design.html."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1023833.1023844"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2009.2026355"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2898356","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2898356","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:56:29Z","timestamp":1750222589000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2898356"}},"subtitle":["An Efficient Framework for Domain Acceleration Using Low-Power Specialized Functional 
Units"],"short-title":[],"issued":{"date-parts":[[2016,6,14]]},"references-count":45,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2016,6,27]]}},"alternative-id":["10.1145\/2898356"],"URL":"https:\/\/doi.org\/10.1145\/2898356","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,6,14]]},"assertion":[{"value":"2015-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-06-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}