{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,1,11]],"date-time":"2025-01-11T16:40:35Z","timestamp":1736613635515,"version":"3.32.0"},"reference-count":50,"publisher":"Association for Computing Machinery (ACM)","issue":"3","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2006,8]]},"abstract":"<jats:p>Multiprocessor Systems on Chips (MPSoCs) have become a popular architectural technique to increase performance. However, MPSoCs may lead to undesirable power consumption characteristics for computing systems that have strict power budgets, such as PDAs, mobile phones, and notebook computers. This paper presents the super-complex instruction-set computing (SuperCISC) Embedded Processor Architecture and, in particular, investigates performance and power consumption of this device compared to traditional processor architecture-based execution. SuperCISC is a heterogeneous, multicore processor architecture designed to exceed performance of traditional embedded processors while maintaining a reduced power budget compared to low-power embedded processors. At the heart of the SuperCISC processor is a multicore VLIW (Very Large Instruction Word) containing several homogeneous execution cores\/functional units. In addition, complex and heterogeneous combinational hardware function cores are tightly integrated to the core VLIW engine providing an opportunity for improved performance and reduced energy consumption. Our SuperCISC processor core has been synthesized for both a 90-nm Stratix II Field Programmable Gate Array (FPGA) and a 160-nm standard cell Application-Specific Integrated Circuit (ASIC) fabrication process from OKI, each operating at approximately 167 MHz for the VLIW core. 
We examine several reasons for speedup and power improvement through the SuperCISC architecture, including <jats:italic>predicated control flow<\/jats:italic>, <jats:italic>cycle compression<\/jats:italic>, and a reduction in arithmetic power consumption, which we call <jats:italic>power compression<\/jats:italic>. Finally, testing our SuperCISC processor with multimedia and signal-processing benchmarks, we show how the SuperCISC processor can provide performance improvements ranging from 7X to 160X with an average of 60X, while <jats:italic>also<\/jats:italic> providing orders of magnitude of power improvements for the computational kernels. The power improvements for our benchmark kernels range from just over 40X to over 400X, with an average savings exceeding 130X. By combining these power and performance improvements, our total energy improvements all exceed 1000X. As these savings are limited to the computational kernels of the applications, which often consume approximately 90% of the execution time, we expect our savings to approach the ideal application improvement of 10X.<\/jats:p>","DOI":"10.1145\/1165780.1165785","type":"journal-article","created":{"date-parts":[[2006,10,18]],"date-time":"2006-10-18T18:11:32Z","timestamp":1161195092000},"page":"658-686","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":12,"title":["Reducing power while increasing performance with supercisc"],"prefix":"10.1145","volume":"5","author":[{"given":"Alex K.","family":"Jones","sequence":"first","affiliation":[{"name":"University of Pittsburgh, Pittsburgh, PA"}]},{"given":"Raymond","family":"Hoare","sequence":"additional","affiliation":[{"name":"University of Pittsburgh, Pittsburgh, PA"}]},{"given":"Dara","family":"Kusic","sequence":"additional","affiliation":[{"name":"University of Pittsburgh, Pittsburgh, PA"}]},{"given":"Gayatri","family":"Mehta","sequence":"additional","affiliation":[{"name":"University of Pittsburgh, 
Pittsburgh, PA"}]},{"given":"Josh","family":"Fazekas","sequence":"additional","affiliation":[{"name":"University of Pittsburgh, Pittsburgh, PA"}]},{"given":"John","family":"Foster","sequence":"additional","affiliation":[{"name":"University of Pittsburgh, Pittsburgh, PA"}]}],"member":"320","published-online":{"date-parts":[[2006,8]]},"reference":[{"volume-title":"IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM).","author":"Banerjee P.","key":"e_1_2_1_1_1","unstructured":"Banerjee , P. , Shenoy , N. , Choudhary , A. , Hauck , S. , Bachmann , C. , Chang , M. , Haldar , M. , Joisha , P. , Jones , A. , Kanhare , A. , Nayak , A. , Periyacheri , S. , Walkden , M. , and Zaretsky , D . 2000. A matlab compiler for distributed, heterogeneous, reconfigurable computing systems . In IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM). Banerjee, P., Shenoy, N., Choudhary, A., Hauck, S., Bachmann, C., Chang, M., Haldar, M., Joisha, P., Jones, A., Kanhare, A., Nayak, A., Periyacheri, S., Walkden, M., and Zaretsky, D. 2000. A matlab compiler for distributed, heterogeneous, reconfigurable computing systems. In IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM)."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2004.824301"},{"volume-title":"Proceedings of the 1999 International Symposium on Low Power Electronics and Design. ACM Press","author":"Benini L.","key":"e_1_2_1_3_1","unstructured":"Benini , L. , Macii , A. , Macii , E. , and Poncino , M . 1999. Selective instruction compression for memory energy reduction in embedded systems . In Proceedings of the 1999 International Symposium on Low Power Electronics and Design. ACM Press , New York. 206--211. 10.1145\/313817.313927 Benini, L., Macii, A., Macii, E., and Poncino, M. 1999. Selective instruction compression for memory energy reduction in embedded systems. In Proceedings of the 1999 International Symposium on Low Power Electronics and Design. 
ACM Press, New York. 206--211. 10.1145\/313817.313927"},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Callahan T. J. Hauser J. R. and Wawrzynek J. 2000. The garp architecture and c compiler. Computer 33. 10.1109\/2.839323 Callahan T. J. Hauser J. R. and Wawrzynek J. 2000. The garp architecture and c compiler. Computer 33. 10.1109\/2.839323","DOI":"10.1109\/2.839323"},{"key":"e_1_2_1_5_1","unstructured":"Chandar S. Mehendale M. and Govindarajan R. 2001. Area and power reduction of embedded dsp systems using instruction compression and reconfigurable encoding. In Proceedings of ICCAD. Chandar S. Mehendale M. and Govindarajan R. 2001. Area and power reduction of embedded dsp systems using instruction compression and reconfigurable encoding. In Proceedings of ICCAD."},{"key":"e_1_2_1_6_1","first-page":"473","article-title":"Low-power cmos digital design","volume":"27","author":"Chandrakasan A.","year":"1992","unstructured":"Chandrakasan , A. , Sheng , S. , and Brodersen , R. 1992 . Low-power cmos digital design . JSSC 27 , 4, 473 -- 484 . Chandrakasan, A., Sheng, S., and Brodersen, R. 1992. Low-power cmos digital design. JSSC 27, 4, 473--484.","journal-title":"JSSC"},{"volume-title":"European Design Automation Conference.","author":"Chang J.-M.","key":"e_1_2_1_7_1","unstructured":"Chang , J.-M. and Pedram , M . 1996. Module assignment for low power . In European Design Automation Conference. Chang, J.-M. and Pedram, M. 1996. Module assignment for low power. In European Design Automation Conference."},{"volume-title":"DAC '98: Proceedings of the 35th Annual Conference on Design Automation. ACM Press","author":"Chen Z.","key":"e_1_2_1_8_1","unstructured":"Chen , Z. and Roy , K . 1998. A power macromodeling technique based on power sensitivity . In DAC '98: Proceedings of the 35th Annual Conference on Design Automation. ACM Press , New York. 678--683. 10.1145\/277044.277216 Chen, Z. and Roy, K. 1998. 
A power macromodeling technique based on power sensitivity. In DAC '98: Proceedings of the 35th Annual Conference on Design Automation. ACM Press, New York. 678--683. 10.1145\/277044.277216"},{"volume-title":"Proceedings of ISCAS.","author":"Cousin J.-G.","key":"e_1_2_1_9_1","unstructured":"Cousin , J.-G. , Sentieys , O. , and Chillet , D . 2000. Multi-algorithm asip synthesis and power estimation for dsp applications . In Proceedings of ISCAS. Cousin, J.-G., Sentieys, O., and Chillet, D. 2000. Multi-algorithm asip synthesis and power estimation for dsp applications. In Proceedings of ISCAS."},{"volume-title":"CoWare","author":"CoWare","key":"e_1_2_1_10_1","unstructured":"CoWare . The lisatek solution: Automated embedded processor design and software development tool generation. Datasheet , CoWare , Inc . CoWare. The lisatek solution: Automated embedded processor design and software development tool generation. Datasheet, CoWare, Inc."},{"key":"e_1_2_1_11_1","volume-title":"IEEE Workshop on VLSI Signal Processing.","author":"Dutta S.","year":"1996","unstructured":"Dutta , S. , Wolfe , A. , Wolf , W. , and O'Connor , K. 1996 . Design issues for very-long-instruction-word vlsi video signal processors . In IEEE Workshop on VLSI Signal Processing. Dutta, S., Wolfe, A., Wolf, W., and O'Connor, K. 1996. Design issues for very-long-instruction-word vlsi video signal processors. In IEEE Workshop on VLSI Signal Processing."},{"volume-title":"in the 6th International Workshop on Field-Programmable Logic and Applications.","author":"Ebeling C.","key":"e_1_2_1_12_1","unstructured":"Ebeling , C. , Cronquist , D. C. , and Franklin , P . 1996. Rapid - reconfigurable pipelined datapath . In in the 6th International Workshop on Field-Programmable Logic and Applications. Ebeling, C., Cronquist, D. C., and Franklin, P. 1996. Rapid - reconfigurable pipelined datapath. 
In in the 6th International Workshop on Field-Programmable Logic and Applications."},{"volume-title":"Proceedings of ISLPED. ACM. 10","author":"Lee J.","key":"e_1_2_1_13_1","unstructured":"eun Lee , J. , Choi , K. , and Dutt , N. D . 2003. Energy-efficient instruction set synthesis for application-specific processors . In Proceedings of ISLPED. ACM. 10 .1145\/871506.871588 eun Lee, J., Choi, K., and Dutt, N. D. 2003. Energy-efficient instruction set synthesis for application-specific processors. In Proceedings of ISLPED. ACM. 10.1145\/871506.871588"},{"key":"e_1_2_1_14_1","unstructured":"Georing R. 2000. Synopsys launches power tool. EETimes. Georing R. 2000. Synopsys launches power tool. EETimes."},{"volume-title":"Proceedings of the Wkshp. Signal Processing Systems (SIPS).","author":"Glokler C.","key":"e_1_2_1_15_1","unstructured":"Glokler , C. and Meyr , H . 2001. Power reduction for asips: A case study . In Proceedings of the Wkshp. Signal Processing Systems (SIPS). Glokler, C. and Meyr, H. 2001. Power reduction for asips: A case study. In Proceedings of the Wkshp. Signal Processing Systems (SIPS)."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/40.848473"},{"volume-title":"Proceedings of the 2003 international conference on Compilers, Architectures and Synthesis for Embedded Systems. ACM Press","author":"Goodwin D.","key":"e_1_2_1_17_1","unstructured":"Goodwin , D. and Petkov , D . 2003. Automatic generation of application specific processors . In Proceedings of the 2003 international conference on Compilers, Architectures and Synthesis for Embedded Systems. ACM Press , New York. 137--147. 10.1145\/951710.951730 Goodwin, D. and Petkov, D. 2003. Automatic generation of application specific processors. In Proceedings of the 2003 international conference on Compilers, Architectures and Synthesis for Embedded Systems. ACM Press, New York. 137--147. 
10.1145\/951710.951730"},{"volume-title":"DAC '97: Proceedings of the 34th Annual Conference on Design Automation. ACM Press","author":"Gupta S.","key":"e_1_2_1_18_1","unstructured":"Gupta , S. and Najm , F. N . 1997. Power macromodeling for high level power estimation . In DAC '97: Proceedings of the 34th Annual Conference on Design Automation. ACM Press , New York. 365--370. 10.1145\/266021.266171 Gupta, S. and Najm, F. N. 1997. Power macromodeling for high level power estimation. In DAC '97: Proceedings of the 34th Annual Conference on Design Automation. ACM Press, New York. 365--370. 10.1145\/266021.266171"},{"key":"e_1_2_1_19_1","unstructured":"Gupta S. Gupta R. Dutt N. and Nicolau A. 2004. SPARK: : A Parallelizing Approach to the High-Level Synthesis of Digital Circuits. Kluwer Academic Publishers Boston MA. Gupta S. Gupta R. Dutt N. and Nicolau A. 2004. SPARK: : A Parallelizing Approach to the High-Level Synthesis of Digital Circuits. Kluwer Academic Publishers Boston MA."},{"volume-title":"IEEE Symposium on FPGAs for Custom Computing Machines(FCCM). 87--96","author":"Hauck S.","key":"e_1_2_1_20_1","unstructured":"Hauck , S. , Fry , T. W. , Hosler , M. M. , and Kao , J. P . 1997. The chimaera reconfigurable functional unit . In IEEE Symposium on FPGAs for Custom Computing Machines(FCCM). 87--96 . Hauck, S., Fry, T. W., Hosler, M. M., and Kao, J. P. 1997. The chimaera reconfigurable functional unit. In IEEE Symposium on FPGAs for Custom Computing Machines(FCCM). 87--96."},{"volume-title":"IASTED International Conference on Parallel and Distributed Computing and Systems.","author":"Hoare R.","key":"e_1_2_1_21_1","unstructured":"Hoare , R. , Tung , S. , and Werger , K . 2003. A 64-way simd processing architecture on an fpga . In IASTED International Conference on Parallel and Distributed Computing and Systems. Hoare, R., Tung, S., and Werger, K. 2003. A 64-way simd processing architecture on an fpga. 
In IASTED International Conference on Parallel and Distributed Computing and Systems."},{"volume-title":"International Parallel and Distributed Processing Symposium (IPDPS).","author":"Hoare R.","key":"e_1_2_1_22_1","unstructured":"Hoare , R. , Tung , S. , and Werger , K . 2004. An 88-way multiprocessor within an fpga with customizable instructions . In International Parallel and Distributed Processing Symposium (IPDPS). Hoare, R., Tung, S., and Werger, K. 2004. An 88-way multiprocessor within an fpga with customizable instructions. In International Parallel and Distributed Processing Symposium (IPDPS)."},{"key":"e_1_2_1_23_1","doi-asserted-by":"crossref","unstructured":"Hoare R. Jones A. K. Kusic D. Fazekas J. Foster J. Tung S. and McCloud M. 2005. Rapid vliw processor customization for signal processing applications using combinational hardware functions. EURASIP Journal on Applied Signal Processing. 10.1155\/ASP\/2006\/46472 Hoare R. Jones A. K. Kusic D. Fazekas J. Foster J. Tung S. and McCloud M. 2005. Rapid vliw processor customization for signal processing applications using combinational hardware functions. EURASIP Journal on Applied Signal Processing. 10.1155\/ASP\/2006\/46472","DOI":"10.1155\/ASP\/2006\/46472"},{"volume-title":"Proc. of the Design Automation Conference (DAC). 10","author":"Huang Z.","key":"e_1_2_1_24_1","unstructured":"Huang , Z. and Malik , S . 2002. Exploiting operation level parallelism through dynamically reconfigurable datapaths . In Proc. of the Design Automation Conference (DAC). 10 .1145\/513918.514006 Huang, Z. and Malik, S. 2002. Exploiting operation level parallelism through dynamically reconfigurable datapaths. In Proc. of the Design Automation Conference (DAC). 10.1145\/513918.514006"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/993396.993403"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 2001 IEEE\/ACM International Conference on Computer-Aided Design. IEEE Press, 259--263","author":"Jha N. 
K.","year":"2001","unstructured":"Jha , N. K. 2001 . Low power system scheduling and synthesis . In Proceedings of the 2001 IEEE\/ACM International Conference on Computer-Aided Design. IEEE Press, 259--263 . Jha, N. K. 2001. Low power system scheduling and synthesis. In Proceedings of the 2001 IEEE\/ACM International Conference on Computer-Aided Design. IEEE Press, 259--263."},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","unstructured":"Jones A. K. Bagchi D. Pal S. Banerjee P. and Choudhary A. 2002. Pact HDL: Compiler Targeting ASIC's and FPGA's with Power and Performance Optimizations. Kluwer Academic Publishers Boston MA. Jones A. K. Bagchi D. Pal S. Banerjee P. and Choudhary A. 2002. Pact HDL: Compiler Targeting ASIC's and FPGA's with Power and Performance Optimizations. Kluwer Academic Publishers Boston MA.","DOI":"10.1145\/581630.581659"},{"volume-title":"IEEE International Conference on Electronics, Circuits, and Systems (ICECS).","author":"Jones A.","key":"e_1_2_1_28_1","unstructured":"Jones , A. , Hoare , R. , Kourtev , I. , Fazekas , J. , Kusic , D. , Foster , J. , Boddie , S. , and Muaydh , A . 2004. A 64way vliw\/simd fpga processing architecture and design flow . In IEEE International Conference on Electronics, Circuits, and Systems (ICECS). Jones, A., Hoare, R., Kourtev, I., Fazekas, J., Kusic, D., Foster, J., Boddie, S., and Muaydh, A. 2004. A 64way vliw\/simd fpga processing architecture and design flow. In IEEE International Conference on Electronics, Circuits, and Systems (ICECS)."},{"volume-title":"ACM International Symposium on Field-Programmable Gate Arrays (FPGA). 10","author":"Jones A. K.","key":"e_1_2_1_29_1","unstructured":"Jones , A. K. , Hoare , R. , Kusic , D. , Fazekas , J. , and Foster , J . 2005. An fpga-based vliw processor with custom hardware execution . In ACM International Symposium on Field-Programmable Gate Arrays (FPGA). 10 .1145\/1046192.1046207 Jones, A. K., Hoare, R., Kusic, D., Fazekas, J., and Foster, J. 2005. 
An fpga-based vliw processor with custom hardware execution. In ACM International Symposium on Field-Programmable Gate Arrays (FPGA). 10.1145\/1046192.1046207"},{"key":"e_1_2_1_30_1","unstructured":"Khailany B. and etal 2001. Imagine: media processing with streams. In Micro. 10.1109\/40.918001 Khailany B. and et al. 2001. Imagine: media processing with streams. In Micro. 10.1109\/40.918001"},{"volume-title":"IEEE International Conference on Computer Design (ICCD).","author":"Khailany B.","key":"e_1_2_1_31_1","unstructured":"Khailany , B. , Dally , W. J. , Chang , A. , Kapsi , U. J. , Namkoong , J. , and Towles , B . 2002. Vlsi design and verification of the imagine processor . In IEEE International Conference on Computer Design (ICCD). Khailany, B., Dally, W. J., Chang, A., Kapsi, U. J., Namkoong, J., and Towles, B. 2002. Vlsi design and verification of the imagine processor. In IEEE International Conference on Computer Design (ICCD)."},{"volume-title":"Proc. Design Automation & Test in Europe Conf. 848--854","author":"Khouri K.","key":"e_1_2_1_32_1","unstructured":"Khouri , K. , Lakshminarayana , G. , and Jha , N . 1998. Impact: A highlevel synthesis system for low power control-flow intensive circuits . In Proc. Design Automation & Test in Europe Conf. 848--854 . Khouri, K., Lakshminarayana, G., and Jha, N. 1998. Impact: A highlevel synthesis system for low power control-flow intensive circuits. In Proc. Design Automation & Test in Europe Conf. 848--854."},{"volume-title":"Proceedings of the International Symposium on Microarchitecture.","author":"Lee C.","key":"e_1_2_1_33_1","unstructured":"Lee , C. , Potkonjak , M. , and Magione-Smith , W. K . 1997. Mediabench: A tool for evaluating and synthesizing multimedia and communications systems . In Proceedings of the International Symposium on Microarchitecture. Lee, C., Potkonjak, M., and Magione-Smith, W. K. 1997. Mediabench: A tool for evaluating and synthesizing multimedia and communications systems. 
In Proceedings of the International Symposium on Microarchitecture."},{"key":"e_1_2_1_34_1","volume-title":"Piperench: Power & performance evaluation of a programmable pipelined datapath. presented at Hot Chips 14","author":"Levine B.","year":"2002","unstructured":"Levine , B. and Schmit , H . 2002 . Piperench: Power & performance evaluation of a programmable pipelined datapath. presented at Hot Chips 14 , Palo Alto, CA . Levine, B. and Schmit, H. 2002. Piperench: Power & performance evaluation of a programmable pipelined datapath. presented at Hot Chips 14, Palo Alto, CA."},{"volume-title":"IEEE Symposium on FPGAs for Custom Computing Machines(FCCM).","author":"Levine B. A.","key":"e_1_2_1_35_1","unstructured":"Levine , B. A. and Schmit , H . 2003. Efficient application representation for haste: Hybrid architectures with a single, transformable executable . In IEEE Symposium on FPGAs for Custom Computing Machines(FCCM). Levine, B. A. and Schmit, H. 2003. Efficient application representation for haste: Hybrid architectures with a single, transformable executable. In IEEE Symposium on FPGAs for Custom Computing Machines(FCCM)."},{"key":"e_1_2_1_36_1","doi-asserted-by":"crossref","unstructured":"Liu X. and Papaefthymiou M. C. 2001. A static power estimation methodology for ip-based design. In Design Automation and Test in Europe. 280--287. Liu X. and Papaefthymiou M. C. 2001. A static power estimation methodology for ip-based design. In Design Automation and Test in Europe. 280--287.","DOI":"10.1109\/DATE.2001.915038"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2004.829819"},{"key":"e_1_2_1_38_1","unstructured":"McCloud S. 2004. Catapult c synthesis-based design flow: Speeding implementation and increasing flexibility. Tech. rep. Mentor Graphics. McCloud S. 2004. Catapult c synthesis-based design flow: Speeding implementation and increasing flexibility. Tech. rep. Mentor Graphics."},{"key":"e_1_2_1_39_1","volume-title":"Tech. Rep. 
TR-ECE-2005-07-001","author":"Mehta G.","year":"2005","unstructured":"Mehta , G. , Jones , A. K. , and Hoare , R . 2005 . An energy-efficient coarse-grained reconfigurable fabric architecture. Tech. Rep. TR-ECE-2005-07-001 , University of Pittsburgh, Department of Electrical and Computer Engineering. July . Mehta, G., Jones, A. K., and Hoare, R. 2005. An energy-efficient coarse-grained reconfigurable fabric architecture. Tech. Rep. TR-ECE-2005-07-001, University of Pittsburgh, Department of Electrical and Computer Engineering. July."},{"volume-title":"in Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines.","author":"Mirsky E.","key":"e_1_2_1_40_1","unstructured":"Mirsky , E. and Dehon , A . 1996. Matrix: A reconfigurable computing architecture with configurable instruction distribution and deployable resources . In in Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines. Mirsky, E. and Dehon, A. 1996. Matrix: A reconfigurable computing architecture with configurable instruction distribution and deployable resources. In in Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines."},{"volume-title":"Proceedings of the International Symposium on Low-Power Design. 99--104","author":"Musoll E.","key":"e_1_2_1_41_1","unstructured":"Musoll , E. and Cortadella , J . 1995. High-level synthesis techniques for reducing the activity of functional units . In Proceedings of the International Symposium on Low-Power Design. 99--104 . 10.1145\/224081.224099 Musoll, E. and Cortadella, J. 1995. High-level synthesis techniques for reducing the activity of functional units. In Proceedings of the International Symposium on Low-Power Design. 99--104. 10.1145\/224081.224099"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/92.335013"},{"key":"e_1_2_1_43_1","volume-title":"Trimaran: An infrastructure for compiler research in instruction level parallelism.","author":"Nene A.","year":"1998","unstructured":"Nene , A. 
, Talla , S. , Goldberg , B. , Kim , H. , and Rabbah , R. M . 1998 . Trimaran: An infrastructure for compiler research in instruction level parallelism. Nene, A., Talla, S., Goldberg, B., Kim, H., and Rabbah, R. M. 1998. Trimaran: An infrastructure for compiler research in instruction level parallelism."},{"volume-title":"Proceedings of ICCD. 318--322","author":"Raghunathan A.","key":"e_1_2_1_44_1","unstructured":"Raghunathan , A. and Jha , N. K . 1994. Behavioral synthesis for low power . In Proceedings of ICCD. 318--322 . Raghunathan, A. and Jha, N. K. 1994. Behavioral synthesis for low power. In Proceedings of ICCD. 318--322."},{"key":"e_1_2_1_45_1","unstructured":"Roy K. and Prasad S. 2000. Low-Power CMOS VLSI Design. Wiley New York. Roy K. and Prasad S. 2000. Low-Power CMOS VLSI Design. Wiley New York."},{"volume-title":"Proceedings of the IEEE Custom Integrated Circuits Conference.","author":"Schmit H.","key":"e_1_2_1_46_1","unstructured":"Schmit , H. , Whelihan , D. , Tsai , A. , Moe , M. , Levine , B. , and Taylor , R. R . 2002. Piperench: A virtualized programmable datapath in 0.18 micron technology . In Proceedings of the IEEE Custom Integrated Circuits Conference. Schmit, H., Whelihan, D., Tsai, A., Moe, M., Levine, B., and Taylor, R. R. 2002. Piperench: A virtualized programmable datapath in 0.18 micron technology. In Proceedings of the IEEE Custom Integrated Circuits Conference."},{"volume-title":"IEEE International Symposium on Circuits and Systems.","author":"Shen Z. X.","key":"e_1_2_1_47_1","unstructured":"Shen , Z. X. and Jong , C. C . 1997. Exploring module selection space for architectural synthesis of low power designs . In IEEE International Symposium on Circuits and Systems. Shen, Z. X. and Jong, C. C. 1997. Exploring module selection space for architectural synthesis of low power designs. In IEEE International Symposium on Circuits and Systems."},{"key":"e_1_2_1_48_1","unstructured":"Sima M. Cotofana S. van Eijndhoven J. T. J. Vassilidis S. 
and Vissers K. 2001. An 8 \u00d7 8 idct implementation on an fpga-augmented trimedia. In Field Programmable Custom Computing Machines (FCCM). 10.1109\/FCCM.2001.9 Sima M. Cotofana S. van Eijndhoven J. T. J. Vassilidis S. and Vissers K. 2001. An 8 \u00d7 8 idct implementation on an fpga-augmented trimedia. In Field Programmable Custom Computing Machines (FCCM). 10.1109\/FCCM.2001.9"},{"key":"e_1_2_1_49_1","unstructured":"Synopsys Inc. Design compiler and primepower manual. www.synopsys.com. Synopsys Inc. Design compiler and primepower manual. www.synopsys.com."},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of the IEEE International Conference on VLSI Design. 10","author":"Tang X.","year":"2005","unstructured":"Tang , X. , Jiang , T. , Jones , A. K. , and Banerjee , P . 2005. Behavioral synthesis of data-dominated circuits for minimal energy implementation . In Proceedings of the IEEE International Conference on VLSI Design. 10 .1109\/ICVD. 2005 .62 Tang, X., Jiang, T., Jones, A. K., and Banerjee, P. 2005. Behavioral synthesis of data-dominated circuits for minimal energy implementation. In Proceedings of the IEEE International Conference on VLSI Design. 
10.1109\/ICVD.2005.62"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1165780.1165785","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,11]],"date-time":"2025-01-11T16:19:12Z","timestamp":1736612352000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1165780.1165785"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,8]]},"references-count":50,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2006,8]]}},"alternative-id":["10.1145\/1165780.1165785"],"URL":"https:\/\/doi.org\/10.1145\/1165780.1165785","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2006,8]]},"assertion":[{"value":"2006-08-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}