{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:41:45Z","timestamp":1760197305522,"version":"build-2065373602"},"reference-count":43,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2018,4,14]],"date-time":"2018-04-14T00:00:00Z","timestamp":1523664000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>Heterogeneous and configurable multicore systems provide hardware specialization to meet disparate application hardware requirements. However, effective multicore system specialization can require a priori knowledge of the applications, application profiling information, and\/or dynamic hardware tuning to schedule and execute applications on the most energy efficient cores. Furthermore, even though highly disparate core heterogeneity and\/or highly configurable parameters with numerous potential parameter values result in more fine-grained specialization and higher energy savings potential, these large design spaces are challenging to efficiently explore. To address these challenges, we propose a novel configuration-subsetted heterogeneous and configurable multicore system, wherein each core offers a small subset of the design space, and propose a novel scheduling and tuning (SaT) algorithm to efficiently exploit the energy savings potential of this system. Our proposed architecture and algorithm require no a priori application knowledge or profiling, and incur minimal runtime overhead. Results reveal energy savings potential and insights on energy trade-offs in heterogeneous, configurable systems.<\/jats:p>","DOI":"10.3390\/computers7020025","type":"journal-article","created":{"date-parts":[[2018,4,16]],"date-time":"2018-04-16T03:12:09Z","timestamp":1523848329000},"page":"25","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Scheduling and Tuning for Low Energy in Heterogeneous and Configurable Multicore Systems"],"prefix":"10.3390","volume":"7","author":[{"given":"Mohamad","family":"Alsafrjalani","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL 33146, USA"}]},{"given":"Ann","family":"Gordon-Ross","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32608, USA"}]}],"member":"1968","published-online":{"date-parts":[[2018,4,14]]},"reference":[{"key":"ref_1","unstructured":"ARM Ltd. (2018, February 01). big.LITTLE Technology. Available online: http:\/\/www.arm.com\/files\/pdf\/big_LITTLE_Technology_the_Futue_of_Mobile.pdf."},{"key":"ref_2","unstructured":"Texas Instruments (2018, February 01). OMAP3530 Applications Processors Datasheet. Available online: http:\/\/www.ti.com\/lit\/ds\/sprt656\/sprt656.pdf."},{"key":"ref_3","unstructured":"Adegbija, T., and Gordon-Ross, A. (October, January 29). Exploring the Tradeoffs of Configurability and Heterogeneity in Multicore Embedded Systems. Proceedings of the International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies (UBICOMM\u201913), Porto, Portugal."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Gordon-Ross, A., and Vahid, F. (2007, January 4\u20138). A Self-Tuning Configurable Cache. Proceedings of the 2007 44th ACM\/IEEE Design Automation Conference, San Diego, CA, USA.","DOI":"10.1109\/DAC.2007.375159"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Viana, P., Gordon-Ross, A., Keogh, E., Barros, E., and Vahid, F. (2006, January 24\u201328). Configurable cache subsetting for fast cache tuning. Proceedings of the 2006 43rd ACM\/IEEE Design Automation Conference, San Francisco, CA, USA.","DOI":"10.1109\/DAC.2006.229310"},{"key":"ref_6","unstructured":"Jeong, K., Kahng, A.B., Kang, S., Rosing, T.S., and Strong, R. (2012, January 12\u201316). MAPG: memory access power gating. Proceedings of the Design, Automation and Test in Europe, Dresden, Germany."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1288","DOI":"10.1109\/TCAD.2013.2257923","article-title":"Many-Core Token-Based Adaptive Power Gating","volume":"32","author":"Kahng","year":"2013","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Adegbija, T., and Gordon-Ross, A. (2016, January 11\u201313). Phase-Based Dynamic Instruction Window Optimization for Embedded Systems. Proceedings of the 2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), Pittsburgh, PA, USA.","DOI":"10.1109\/ISVLSI.2016.96"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Kora, Y., Yamaguchi, K., and Ando, H. (2013, January 7\u201311). MLP-aware dynamic instruction window resizing for adaptively exploiting both ILP and MLP. Proceedings of the 46th Annual IEEE\/ACM International Symposium on Microarchitecture, Davis, CA, USA.","DOI":"10.1145\/2540708.2540713"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhang, C., Vahid, F., and Najjar, W. (2003, January 9\u201311). A highly configurable cache architecture for embedded systems. Proceedings of the 30th Annual International Symposium on Computer Architecture, San Diego, CA, USA.","DOI":"10.1145\/859618.859635"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Alsafrjalani, M.H., Gordon-Ross, A., and Viana, P. (2014, January 26\u201328). Minimum Effort Design Space Subsetting for Configurable Caches. Proceedings of the 2014 12th IEEE International Conference on Embedded and Ubiquitous Computing, Milano, Italy.","DOI":"10.1109\/EUC.2014.19"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1109\/MC.2005.379","article-title":"Heterogeneous chip multiprocessors","volume":"38","author":"Kumar","year":"2005","journal-title":"Computer"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"De Abreu Silva, B., Cuminato, L.A., and Bonato, V. (2012, January 5\u20137). Reducing the overall cache miss rate using different cache sizes for Heterogeneous Multi-core Processors. Proceedings of the 2012 International Conference on Reconfigurable Computing and FPGAs, Cancun, Mexico.","DOI":"10.1109\/ReConFig.2012.6416783"},{"key":"ref_14","unstructured":"Semeraro, G., Magklis, G., Balasubramonian, D., Albonesi, S., Dwarkadas, H., and Scott, M. (2002, January 2\u20136). Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling. Proceedings of the Eighth International Symposium on High-Performance Computer Architecture, Cambridge, MA, USA."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Silvano, C., Palermo, G., Xydis, S., and Stamelakos, I. (2014, January 24\u201328). Voltage island management in near threshold manycore architectures to mitigate dark silicon. Proceedings of the 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, Germany.","DOI":"10.7873\/DATE2014.214"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Alsafrjalani, M.H., and Ross, A.G. (2014, January 26\u201328). Dynamic Scheduling for Reduced Energy in Configuration-Subsetted Heterogeneous Multicore Systems. Proceedings of the 2014 12th IEEE International Conference on Embedded and Ubiquitous Computing, Milano, Italy.","DOI":"10.1109\/EUC.2014.12"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Kayiran, O., Jog, A., Pattnaik, A., Ausavarungnirun, R., Tang, X., Kandemir, M.T., Loh, G.H., Mutlu, O., and Das, C.R. (2016, January 11\u201315). \u03bcC-States: Fine-grained GPU datapath power management. Proceedings of the 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT), Haifa, Israel.","DOI":"10.1145\/2967938.2967941"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Powell, M., Yang, S.-H., Falsafi, B., Roy, K., and Vijaykumar, T.N. (2000, January 26\u201327). Gated-Vdd: A circuit technique to reduce leakage in deep-submicron cache memories. Proceedings of the 2000 International Symposium on Low Power Electronics and Design (ISLPED\u201900), Rapallo, Italy.","DOI":"10.1109\/LPE.2000.155259"},{"key":"ref_19","unstructured":"Kaxiras, S., Zhigang, H., and Martonosi, M. (July, January 30). Cache decay: Exploiting generational behavior to reduce cache leakage power. Proceedings of the Proceedings 28th Annual International Symposium on Computer Architecture, Goteborg, Sweden."},{"key":"ref_20","unstructured":"Zhou, H., Toburen, M.C., Rotenberg, E., and Conte, T.M. (2001, January 8\u201312). Adaptive mode control: A static-power-efficient cache design. Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques, Barcelona, Spain."},{"key":"ref_21","unstructured":"Flautner, K., Kim, N., Martin, S., Blaauw, D., and Mudge, T. (2002, January 25\u201329). Drowsy caches: Simple techniques for reducing leakage power. Proceedings of the 29th Annual International Symposium on Computer Architecture, Anchorage, AK, USA."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Luo, J., and Jha, N. (2001, January 22). Battery-aware static scheduling for distributed real-time embedded systems. Proceedings of the Design Automation Conference, Las Vegas, NV, USA.","DOI":"10.1145\/378239.378553"},{"key":"ref_23","unstructured":"Kim, K., Kim, D., and Park, C. (2006, January 12\u201315). Real-time scheduling in heterogeneous dual-core architectures. Proceedings of the 12th International Conference on Parallel and Distributed Systems (ICPADS\u201906), Minneapolis, MN, USA."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Van Craeynest, K., Jaleel, A., Eeckhout, L., Narvaez, P., and Emer, J. (2012, January 9\u201313). Scheduling heterogeneous multi-cores through performance impact estimation (PIE). Proceedings of the 2012 39th Annual International Symposium on Computer Architecture (ISCA), Portland, OR, USA.","DOI":"10.1109\/ISCA.2012.6237019"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1145\/2508148.2485936","article-title":"Utility-based acceleration of multithreaded applications on asymmetric CMPs","volume":"41","author":"Joao","year":"2013","journal-title":"ACM SIGARCH Comput. Arch. News"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Das, R., Ausavarungnirun, R., Mutlu, O., Kumar, A., and Azimi, M. (2012, January 19\u201323). Application-to-core mapping policies to reduce memory interference in multi-core systems. Proceedings of the 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT), Minneapolis, MN, USA.","DOI":"10.1145\/2370816.2370893"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1145\/1531793.1531804","article-title":"HASS: A scheduler for heterogeneous multicore systems","volume":"43","author":"Shelepov","year":"2009","journal-title":"ACM SIGOPS Oper. Syst. Rev."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Wang, W., Mishra, P., and Gordon-Ross, A. (2009, January 5\u20139). SACR: Scheduling-Aware Cache Reconfiguration for Real-Time Embedded Systems. Proceedings of the 2009 22nd International Conference on VLSI Design, New Delhi, India.","DOI":"10.1109\/VLSI.Design.2009.66"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Malik, A., Moyer, B., and Cermak, D. (2000, January 26\u201327). A low power unified cache architecture providing power and performance flexibility. Proceedings of the 2000 International Symposium on Low Power Electronics and Design (ISLPED\u201900), Rapallo, Italy.","DOI":"10.1145\/344166.344610"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Srikantaiah, S., Kultursay, E., Zhang, T., Kandemir, M., Irwin, M.J., and Xie, Y. (2011, January 12\u201316). MorphCache: A Reconfigurable Adaptive Multi-level Cache hierarchy. Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture, San Antonio, TX, USA.","DOI":"10.1109\/HPCA.2011.5749732"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Chen, L., Zou, X., Lei, J., and Liu, Z. (2007, January 24\u201327). Dynamically Reconfigurable Cache for Low-Power Embedded System. Proceedings of the Third International Conference on Natural Computation (ICNC 2007), Haikou, China.","DOI":"10.1109\/ICNC.2007.346"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Gordon-Ross, A., Viana, P., Vahid, F., Najjar, W., and Barros, E. (2007, January 16\u201320). A One-Shot Configurable-Cache Tuner for Improved Energy and Performance. Proceedings of the 2007 Design, Automation & Test in Europe Conference & Exhibition, Nice, France.","DOI":"10.1109\/DATE.2007.364686"},{"key":"ref_33","unstructured":"Rawlins, M., and Gordon-Ross, A. (February, January 30). An application classification guided cache tuning heuristic for multi-core architectures. Proceedings of the 17th Asia and South Pacific Design Automation Conference, Sydney, NSW, Australia."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1816","DOI":"10.1109\/TCAD.2009.2028681","article-title":"ReSPIR: A response surface-based Pareto iterative refinement for application-specific design space exploration","volume":"28","author":"Palermo","year":"2009","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Adegbija, T., Gordon-Ross, A., and Rawlins, M. (2014, January 5\u20137). Analysis of cache tuner architectural layouts for multicore embedded systems. Proceedings of the 2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC), Austin, TX, USA.","DOI":"10.1109\/PCCC.2014.7017091"},{"key":"ref_36","unstructured":"EEMBC (2017, December 15). The Embedded Microprocessor Benchmark Consortium. Available online: https:\/\/www.eembc.org\/benchmark\/automotive_sl.php."},{"key":"ref_37","unstructured":"(2017, December 15). Mediabench Consortium. Available online: http:\/\/euler.slu.edu\/~fritts\/mediabench\/."},{"key":"ref_38","unstructured":"Mauerer, W. (2008). Process Management and Scheduling. Professional Linux Kernel Architecture, Wrox. [1st ed.]. Chapter 2."},{"key":"ref_39","unstructured":"Silberschatz, A. (2012). Processes. Operating System Concept, Wiley. [9th ed.]. Chapter 3."},{"key":"ref_40","unstructured":"Reinman, G., and Jouppi, N.P. (1999). CACTI2.0: An Integrated Cache Timing and Power Model, COMPAQ Western Research Laboratory."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Alsafrjalani, M.H., and Gordon-Ross, A. (2016, January 18\u201320). Quality of service-aware, scalable cache tuning algorithm in consumer-based embedded devices. Proceedings of the 2016 International Great Lakes Symposium on VLSI (GLSVLSI), Boston, MA, USA.","DOI":"10.1145\/2902961.2902987"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Alsafrjalani, M.H., and Gordon-Ross, A. (2017, January 17\u201319). Instruction set architecture impact on design space subsetting for configurable systems. Proceedings of the 2017 3rd IEEE International Conference on Control Science and Systems Engineering (ICCSSE), Beijing, China.","DOI":"10.1109\/CCSSE.2017.8088028"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Adegbija, T., and Gordon-Ross, A. (2014, January 10\u201313). Energy-efficient phase-based cache tuning for multimedia applications in embedded systems. Proceedings of the 2014 IEEE 11th Consumer Communications and Networking Conference (CCNC), Las Vegas, NV, USA.","DOI":"10.1109\/CCNC.2014.7056323"}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/7\/2\/25\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:00:42Z","timestamp":1760194842000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/7\/2\/25"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,4,14]]},"references-count":43,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2018,6]]}},"alternative-id":["computers7020025"],"URL":"https:\/\/doi.org\/10.3390\/computers7020025","relation":{},"ISSN":["2073-431X"],"issn-type":[{"type":"electronic","value":"2073-431X"}],"subject":[],"published":{"date-parts":[[2018,4,14]]}}}