{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:38:17Z","timestamp":1750307897259,"version":"3.41.0"},"reference-count":24,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2007,9,1]],"date-time":"2007-09-01T00:00:00Z","timestamp":1188604800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Des. Autom. Electron. Syst."],"published-print":{"date-parts":[[2007,9]]},"abstract":"<jats:p>Clustering L0 buffers is effective for energy reduction in the instruction memory hierarchy of embedded VLIW processors. However, the efficiency of the clustering depends on the schedule of the target application. Especially in heterogeneous or data clustered VLIW processors, determining energy efficient scheduling is more constraining.<\/jats:p>\n          <jats:p>This article proposes a realistic technique supported by a tool flow to explore operation shuffling for improving generation of L0 clusters. The tool flow explores assignment of operations for each cycle and generates various schedules. This approach makes it possible to reduce energy consumption for various processor architectures. However, the computational complexity is large because of the huge exploration space. Therefore, some heuristics are also developed, which reduce the size of the exploration space while the solution quality remains reasonable. 
Furthermore, we also propose a technique to support VLIW processors with multiple data clusters, which is essential to apply the methodology to real world processors.<\/jats:p>\n          <jats:p>The experimental results indicate potential gains of up to 27.6% in energy in L0 buffers, through operation shuffling for heterogeneous processor architectures as well as a homogeneous architecture. Furthermore, the proposed heuristics drastically reduce the exploration search space by about 90%, while the results are comparable to full search, with average differences of less than 1%. The experimental results indicate that energy efficiency can be improved in most of the media benchmarks by the proposed methodology, where the average gain is around 10% in comparison with generating clusters without operation shuffling.<\/jats:p>","DOI":"10.1145\/1278349.1278354","type":"journal-article","created":{"date-parts":[[2007,10,14]],"date-time":"2007-10-14T12:41:11Z","timestamp":1192365671000},"page":"41","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Methodology for operation shuffling and L0 cluster generation for low energy heterogeneous VLIW processors"],"prefix":"10.1145","volume":"12","author":[{"given":"Yuki","family":"Kobayashi","sequence":"first","affiliation":[{"name":"Graduate School of Information Science and Technology, Osaka University, Osaka, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Murali","family":"Jayapala","sequence":"additional","affiliation":[{"name":"IMEC vzw., Leuven, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Praveen","family":"Raghavan","sequence":"additional","affiliation":[{"name":"IMEC vzw., Katholieke Universiteit Leuven, Leuven, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Francky","family":"Catthoor","sequence":"additional","affiliation":[{"name":"IMEC vzw., Katholieke Universiteit Leuven, 
Leuven, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Masaharu","family":"Imai","sequence":"additional","affiliation":[{"name":"Graduate School of Information Science and Technology, Osaka University, Osaka, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2007,9]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/92.645068"},{"volume-title":"Proceedings of the IEEE International Workshop on Power And Timing Modeling, Optimization and Simulation, Yverdon-Les-Bains, IEEE. Switzerland.","author":"Benini L.","key":"e_1_2_1_2_1","unstructured":"Benini, L., Bruni, D., Chinosi, M., Silvano, C., Zaccaria, V., and Zafalon, R. 2001. A power modeling and estimation framework for VLIW-based embedded systems. In Proceedings of the IEEE International Workshop on Power And Timing Modeling, Optimization and Simulation, Yverdon-Les-Bains, IEEE. Switzerland."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/513918.514137"},{"volume-title":"Proceedings of the Design, Automation and Test in Europe","author":"Bona A.","key":"e_1_2_1_4_1","unstructured":"Bona, A., Sami, M., Sciuto, D., Silvano, C., Zaccaria, V., and Zafalon, R. 2002b. An instruction-level methodology for power estimation and optimization of embedded VLIW cores. 
In Proceedings of the Design, Automation and Test in Europe. Paris, France, 1128."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339657"},{"key":"e_1_2_1_6_1","unstructured":"Clear Speed. http:\/\/www.clearspeed.com\/."},{"volume-title":"Proceedings of the International Conference on Field Programmable Logic and Applications","author":"de Beeck P. O.","key":"e_1_2_1_7_1","unstructured":"de Beeck, P. O., Barat, F., Jayapala, M., and Lauwereins, R. 2001. CRISP: A template for reconfigurable instruction set processors. In Proceedings of the International Conference on Field Programmable Logic and Applications. Belfast, Ireland, 296--305."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339682"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/DATE.2005.141"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2005.165"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/54.844333"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2005.92"},{"volume-title":"Proceedings of the IEEE International Workshop on Power And Timing Modeling, Optimization and Simulation","author":"Jayapala M.","key":"e_1_2_1_13_1","unstructured":"Jayapala, M., Vander Aa, T., Barat, F., Catthoor, F., Corporaal, H., and Deconinck, G. 2004. 
L0 cluster synthesis and operation shuffling. In Proceedings of the IEEE International Workshop on Power And Timing Modeling, Optimization and Simulation. Santorini, Greece. IEEE, 311--321."},{"volume-title":"Proceedings of the IEEE 16th International Conference on Application-Specific Systems, Architectures and Processors","author":"Lambrechts A.","key":"e_1_2_1_14_1","unstructured":"Lambrechts, A., Raghavan, P., Leroy, A., Talavera, G., Vander Aa, T., Jayapala, M., Catthoor, F., Verkest, D., Deconinck, G., Corporaal, H., Robert, F., and Carrabina, J. 2005. Power breakdown analysis for a heterogeneous NoC platform running a video application. In Proceedings of the IEEE 16th International Conference on Application-Specific Systems, Architectures and Processors. Samos, Greece, 179--184."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/REAL.2004.18"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/313817.313944"},{"key":"e_1_2_1_17_1","unstructured":"MediaBench. http:\/\/cares.icsl.ucla.edu\/MediaBench\/."},{"volume-title":"Proceedings of the International Symposium on High-Performance Computer Architecture","author":"Rixner S.","key":"e_1_2_1_18_1","unstructured":"Rixner, S., Dally, W. 
J., Khailany, B., Mattson, P., Kapasi, U. J., and Owens, J. D. 2000. Register organization for media processing. In Proceedings of the International Symposium on High-Performance Computer Architecture. Toulouse, France, 375--386."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/11847083_2"},{"key":"e_1_2_1_20_1","unstructured":"Silicon Hive. http:\/\/www.silicon-hive.com\/."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/780732.780759"},{"key":"e_1_2_1_22_1","unstructured":"Texas Instruments. 2000. TMS320C6000 CPU and Instruction Set Reference Guide."},{"key":"e_1_2_1_23_1","unstructured":"Trimaran. Trimaran: An infrastructure for research in instruction-level parallelism. http:\/\/www.trimaran.org\/."},{"volume-title":"Proceedings of the IEEE Asia and South Pacific Design Automation Conference","author":"Vander Aa T.","key":"e_1_2_1_24_1","unstructured":"Vander Aa, T., Jayapala, M., Barat, F., Deconinck, G., Lauwereins, R., Catthoor, F., and Corporaal, H. 2004. Instruction buffering exploration for low energy VLIW with instruction clusters. In Proceedings of the IEEE Asia and South Pacific Design Automation Conference. 
Yokohama, Japan, IEEE, 825--830."}],"container-title":["ACM Transactions on Design Automation of Electronic Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1278349.1278354","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1278349.1278354","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T14:47:29Z","timestamp":1750258049000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1278349.1278354"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,9]]},"references-count":24,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2007,9]]}},"alternative-id":["10.1145\/1278349.1278354"],"URL":"https:\/\/doi.org\/10.1145\/1278349.1278354","relation":{},"ISSN":["1084-4309","1557-7309"],"issn-type":[{"type":"print","value":"1084-4309"},{"type":"electronic","value":"1557-7309"}],"subject":[],"published":{"date-parts":[[2007,9]]},"assertion":[{"value":"2007-09-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}