{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T01:54:35Z","timestamp":1773194075344,"version":"3.50.1"},"reference-count":35,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2013,12,1]],"date-time":"2013-12-01T00:00:00Z","timestamp":1385856000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2013,12]]},"abstract":"<jats:p>Hardware specialization has received renewed interest recently as chips are hitting power limits. Chip designers of traditional processor architectures have primarily focused on general-purpose computing, partially due to time-to-market pressure and simpler design processes. But new power limits require some chip specialization. Although hardware configured for a specific application yields large speedups for low-power dissipation, its design is more complex and less reusable. We instead explore domain-based specialization, a scalable approach that balances hardware\u2019s reusability and performance efficiency. We focus on specialization using customized compute units that accelerate particular operations. In this article, we develop automatic techniques to identify code sequences from different applications within a domain that can be targeted to a new custom instruction that will be run inside a configurable specialized functional unit (SFU). We demonstrate that using a canonical representation of computations finds more common code sequences among applications that can be mapped to the same custom instruction, leading to larger speedups while specializing a smaller core area than previous pattern-matching techniques. 
We also propose new heuristics to narrow the search space of domain-specific custom instructions, finding those that achieve the best performance across applications. We estimate the overall performance achieved with our automatic techniques using hardware models on a set of nine media benchmarks, showing that when limiting the core area devoted to specialization, the SFU customization with the largest speedups includes both application- and domain-specific custom instructions. We demonstrate that exploring domain-specific hardware acceleration is key to continued computing system performance improvements.<\/jats:p>","DOI":"10.1145\/2541228.2555303","type":"journal-article","created":{"date-parts":[[2014,1,14]],"date-time":"2014-01-14T13:39:57Z","timestamp":1389706797000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Accelerating an application domain with specialized functional units"],"prefix":"10.1145","volume":"10","author":[{"given":"Cecilia","family":"Gonz\u00e1lez-\u00c1lvarez","sequence":"first","affiliation":[{"name":"Ghent University & UPC, Gent, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jennifer B.","family":"Sartor","sequence":"additional","affiliation":[{"name":"Ghent University, Gent, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Carlos","family":"\u00c1lvarez","sequence":"additional","affiliation":[{"name":"UPC, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel","family":"Jim\u00e9nez-Gonz\u00e1lez","sequence":"additional","affiliation":[{"name":"UPC, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lieven","family":"Eeckhout","sequence":"additional","affiliation":[{"name":"Ghent University, Gent, 
Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2013,12]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings 1st International Workshop on GCC Research Opportunities (GROW\u201909)","author":"Almer O.","unstructured":"Almer , O. and Bennett , R . 2009. An end-to-end design flow for automated instruction set extension and complex instruction selection based on GCC . In Proceedings 1st International Workshop on GCC Research Opportunities (GROW\u201909) . Almer, O. and Bennett, R. 2009. An end-to-end design flow for automated instruction set extension and complex instruction selection based on GCC. In Proceedings 1st International Workshop on GCC Research Opportunities (GROW\u201909)."},{"key":"e_1_2_1_2_1","volume-title":"Retrieved","author":"Altera Corporation","year":"2013","unstructured":"Altera Corporation . 2013 . Altera Nios II . Retrieved November 26, 2013 from http:\/\/www.altera.com\/devices\/processor\/nios2\/ni2-index.html. Altera Corporation. 2013. Altera Nios II. Retrieved November 26, 2013 from http:\/\/www.altera.com\/devices\/processor\/nios2\/ni2-index.html."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/371636.371677"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/VLSI.Design.2010.68"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2010.2090543"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASAP.2008.4580145"},{"key":"e_1_2_1_7_1","unstructured":"Bradski G. 2000. The OpenCV Library. Dr. Dobb\u2019s Journal of Software Tools.  Bradski G. 2000. The OpenCV Library. Dr. 
Dobb\u2019s Journal of Software Tools."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2006.153"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2005.156"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/968280.968307"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/MDT.2010.141"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.1974.1050511"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2000064.2000108"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1460361.1460365"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.micpro.2009.02.010"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/HLDVT.2004.1431235"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/40.848473"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the Workshop in Reconfigurable Computing (HiPEAC\u201911)","author":"Gonz\u00e1lez-\u00c1lvarez C.","unstructured":"Gonz\u00e1lez-\u00c1lvarez , C. , Fern\u00e1ndez , M. , Jim\u00e9nez-Gonz\u00e1lez , D. , Alvarez , C. , and Martorell , X . 2011. Automatic generation and testing of application specific hardware accelerators on a new reconfigurable OpenSPARC platform . In Proceedings of the Workshop in Reconfigurable Computing (HiPEAC\u201911) . 85--94. Gonz\u00e1lez-\u00c1lvarez, C., Fern\u00e1ndez, M., Jim\u00e9nez-Gonz\u00e1lez, D., Alvarez, C., and Martorell, X. 2011. Automatic generation and testing of application specific hardware accelerators on a new reconfigurable OpenSPARC platform. In Proceedings of the Workshop in Reconfigurable Computing (HiPEAC\u201911). 85--94."},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the 7th Python in Science Conference (SciPy\u201908)","author":"Hagberg A. A.","unstructured":"Hagberg , A. A. , Schult , D. A. , and Swart , P. J . 2008. 
Exploring network structure, dynamics, and function using NetworkX . In Proceedings of the 7th Python in Science Conference (SciPy\u201908) . 11--15. Hagberg, A. A., Schult, D. A., and Swart, P. J. 2008. Exploring network structure, dynamics, and function using NetworkX. In Proceedings of the 7th Python in Science Conference (SciPy\u201908). 11--15."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815968"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the International Symposium on Code Generation and Optimization (CGO\u201904)","author":"Lattner C.","unstructured":"Lattner , C. and Adve , V . 2004. Llvm: A compilation framework for lifelong program analysis & transformation . In Proceedings of the International Symposium on Code Generation and Optimization (CGO\u201904) . IEEE Computer Society, Washington, DC, 75. Lattner, C. and Adve, V. 2004. Llvm: A compilation framework for lifelong program analysis & transformation. In Proceedings of the International Symposium on Code Generation and Optimization (CGO\u201904). IEEE Computer Society, Washington, DC, 75."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1629395.1629402"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1550987.1550989"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 14th International Conference on ASAP, Application-Specific Systems.","author":"Peymandoust A.","unstructured":"Peymandoust , A. and Pozzi , L . 2003. Automatic instruction set extension and utilization for embedded processors . In Proceedings of the 14th International Conference on ASAP, Application-Specific Systems. Peymandoust, A. and Pozzi, L. 2003. Automatic instruction set extension and utilization for embedded processors. 
In Proceedings of the 14th International Conference on ASAP, Application-Specific Systems."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/VLSID.2007.40"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2005.855950"},{"key":"e_1_2_1_27_1","unstructured":"SRISC. 2012. Simply risc s1 core.  SRISC. 2012. Simply risc s1 core."},{"key":"e_1_2_1_28_1","volume-title":"et al","author":"Stein W.","year":"2013","unstructured":"Stein , W. et al . 2013 . Sage Mathematics Software (Version x.y.z). The Sage Development Team. Retrieved from http:\/\/www.sagemath.org. Stein, W. et al. 2013. Sage Mathematics Software (Version x.y.z). The Sage Development Team. Retrieved from http:\/\/www.sagemath.org."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2004.104"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1735970.1736044"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1289881.1289905"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2010.2041849"},{"key":"e_1_2_1_33_1","unstructured":"Xilinx. 2012. Vivado Design Suite User Guide.  Xilinx. 2012. Vivado Design Suite User Guide."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1023833.1023844"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the International Conference on Field Programmable Logic and Applications (FPL\u201907)","author":"Yu P.","unstructured":"Yu , P. and Mitra , T . 2007. Disjoint pattern enumeration for custom instructions identification . In Proceedings of the International Conference on Field Programmable Logic and Applications (FPL\u201907) . 273--278. Yu, P. and Mitra, T. 2007. Disjoint pattern enumeration for custom instructions identification. In Proceedings of the International Conference on Field Programmable Logic and Applications (FPL\u201907). 
273--278."}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2541228.2555303","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2541228.2555303","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T07:35:01Z","timestamp":1750232101000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2541228.2555303"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,12]]},"references-count":35,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2013,12]]}},"alternative-id":["10.1145\/2541228.2555303"],"URL":"https:\/\/doi.org\/10.1145\/2541228.2555303","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,12]]},"assertion":[{"value":"2013-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-12-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}