{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:27:48Z","timestamp":1750307268347,"version":"3.41.0"},"reference-count":30,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2011,6,1]],"date-time":"2011-06-01T00:00:00Z","timestamp":1306886400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Des. Autom. Electron. Syst."],"published-print":{"date-parts":[[2011,6]]},"abstract":"<jats:p>UNISIM has been shown to ease the development of simulators for multi-\/many-core systems. However, UNISIM cycle-level simulations of large-scale multiprocessor systems could be very time consuming. In this article, we propose a systematic framework for accelerating UNISIM cycle-level simulations on multicore platforms. The proposed framework relies on exploiting the fine-grained parallelism within the simulated cycles using POSIX threads. A multithreaded simulation engine has been devised from the single-threaded UNISIM SystemC engine to facilitate the exploitation of inherent parallelism. An adaptive technique that manages the overall computation workload by adjusting the number of threads employed at any given time is proposed. In addition, we have introduced a technique to balance the workloads of multithreaded executions. This load balancing involves the distributions of SystemC objects among threads. A graph-partitioning-based technique has been introduced to automate such distributions. Finally, two strategies are proposed for realizing nonautomated and fully automated adaptive multithreaded simulations, respectively. Our investigations show that notable acceleration can be achieved by deploying the proposed framework. 
In particular, we show that simulations on an 8-core multicore platform can provide close to 6X speedups when simulating many-core systems with a large number of cores.<\/jats:p>","DOI":"10.1145\/1970353.1970359","type":"journal-article","created":{"date-parts":[[2011,6,14]],"date-time":"2011-06-14T14:44:54Z","timestamp":1308062694000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Accelerating UNISIM-Based Cycle-Level Microarchitectural Simulations on Multicore Platforms"],"prefix":"10.1145","volume":"16","author":[{"given":"Xiongfei","family":"Liao","sequence":"first","affiliation":[{"name":"Nanyang Technological University"}]},{"given":"Thambipillai","family":"Srikanthan","sequence":"additional","affiliation":[{"name":"Nanyang Technological University"}]}],"member":"320","published-online":{"date-parts":[[2011,6]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/L-CA.2007.12"},{"key":"e_1_2_1_2_1","unstructured":"CellSim. 2009. CellSim simulator. http:\/\/pcsostres.ac.upc.edu\/cellsim\/doku.php\/start."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/643114.643116"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPA.2008.124"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/L-CA.2006.14"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/PADS.2009.25"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/244804.244808"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/84537.84545"},{"volume-title":"Proceedings of the International Symposium on Industrial Embedded Systems.","author":"Huang K.","key":"e_1_2_1_9_1","unstructured":"Huang, K., Bacivarov, I., Hugelshofer, F., and Thiele, L. 2008. 
Scalably distributed SystemC simulation for embedded applications. In Proceedings of the International Symposium on Industrial Embedded Systems."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCSA.2008.63"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1137\/S1064827595287997"},{"key":"e_1_2_1_12_1","first-page":"4","article-title":"A modular simulator framework for Network-on-Chip based manycore chips using UNISIM","volume":"4","author":"Liao X.","year":"2009","unstructured":"Liao, X., Jigang, W., and Srikanthan, T. 2009. A modular simulator framework for Network-on-Chip based manycore chips using UNISIM. Trans. High-Perform. Embed. Archit. Compil. 4, 4.","journal-title":"Trans. High-Perform. Embed. Archit. Compil."},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","unstructured":"Low Y. Lim C.-C. Cai W. Huang S.-Y. Hsu W.-J. Jain S. and Turner S. J. 1999. Survey of languages and runtime libraries for parallel discrete event simulation. http:\/\/sim.sagepub.com\/content\/72\/3\/170.abstract.","DOI":"10.1177\/003754979907200309"},{"volume-title":"Proceedings of the 31st Annual International Symposium on Computer Architecture.","author":"Mullins R.","key":"e_1_2_1_14_1","unstructured":"Mullins, R., West, A., and Moore, S. 2004. Low-Latency virtual-channel routers for on-chip networks. 
In Proceedings of the 31st Annual International Symposium on Computer Architecture."},{"volume-title":"Proceedings of the Conference on Design, Automation and Test in Europe (DATE\u201907)","author":"Naguib Y. N.","key":"e_1_2_1_15_1","unstructured":"Naguib, Y. N. and Guindi, R. S. 2007. Speeding up SystemC simulation through process splitting. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE\u201907)."},{"volume-title":"Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC\u201910)","author":"Nanjundappa M.","key":"e_1_2_1_16_1","unstructured":"Nanjundappa, M., Patel, H. D., Jose, B. A., and Shukla, S. K. 2010. SCGPSim: A fast SystemC simulator on GPUs. In Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC\u201910)."},{"key":"e_1_2_1_17_1","unstructured":"OSCI. 2009. OSCI standards and reference implementation. http:\/\/www.systemc.org."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2005.850819"},{"volume-title":"Proceedings of the International Symposium on High-Performance Computer Architecture.","author":"Penry D.","key":"e_1_2_1_19_1","unstructured":"Penry, D., Fay, D., Hodgdon, D., Wells, R., Schelle, G., August, D. I., and Connors, D. 2006. 
Exploiting parallelism and structure to accelerate the simulation of chip multiprocessors. In Proceedings of the International Symposium on High-Performance Computer Architecture."},{"volume-title":"Proceedings of the Conference on Design, Automation and Test in Europe (DATE\u201904)","author":"P\u00e9rez D. G.","key":"e_1_2_1_20_1","unstructured":"P\u00e9rez, D. G., Mouchard, G., and Temam, O. 2004. A new optimized implemention of the systemc engine using acyclic scheduling. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE\u201904)."},{"key":"e_1_2_1_21_1","unstructured":"PI. 2009. Calculating \u03c0 using MPI. http:\/\/www.unix.mcs.anl.gov\/mpi\/usingmpi\/examples\/simplempi\/cpi_c.htm."},{"volume-title":"Proceedings of the Conference on Design, Automation and Test in Europe (DATE\u201902)","author":"Savoiu N.","key":"e_1_2_1_22_1","unstructured":"Savoiu, N., Shukla, S., and Gupta, R. 2002. Automated concurrency re-assignment in high level system models for efficient system-level simulation. 
In Proceedings of the Conference on Design, Automation and Test in Europe (DATE\u201902)."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/37888.37917"},{"volume-title":"Proceedings of the 25th ACM\/IEEE Design Automation Conference (DAC\u201988)","author":"Soul\u00e9 L.","key":"e_1_2_1_24_1","unstructured":"Soul\u00e9, L. and Blank, T. 1988. Parallel logic simulation on general purpose machines. In Proceedings of the 25th ACM\/IEEE Design Automation Conference (DAC\u201988)."},{"key":"e_1_2_1_25_1","unstructured":"Tilera. 2009. Tilera. http:\/\/www.tilera.com."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2008.2011239"},{"key":"e_1_2_1_27_1","unstructured":"Trams M. 2004. A first mature revision of a synchronization library for distributed RTL simulation in SystemC. http:\/\/www.digital-force.net."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/256563.256818"},{"key":"e_1_2_1_29_1","unstructured":"UNISIM. 2009. UNIted SIMulation Environment. https:\/\/unisim.org."},{"volume-title":"Proceedings of the IEEE International Conference on Solid-State Circuits (ISSCC\u201907)","author":"Vangal S.","key":"e_1_2_1_30_1","unstructured":"Vangal, S., Howard, J., Ruhl, G., Dighe, S., Wilson, H., Tschanz, J., Finan, D., Iyer, P., Singh, A., Jacob, T., Jain, S., Venkataraman, S., Hoskote, Y., and Borkar, N. 2007. An 80-tile 1.28tflops Network-on-Chip in 65nm CMOS. 
In Proceedings of the IEEE International Conference on Solid-State Circuits (ISSCC\u201907) (Digest of Technical Papers)."}],"container-title":["ACM Transactions on Design Automation of Electronic Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1970353.1970359","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1970353.1970359","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T10:52:52Z","timestamp":1750243972000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1970353.1970359"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,6]]},"references-count":30,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2011,6]]}},"alternative-id":["10.1145\/1970353.1970359"],"URL":"https:\/\/doi.org\/10.1145\/1970353.1970359","relation":{},"ISSN":["1084-4309","1557-7309"],"issn-type":[{"type":"print","value":"1084-4309"},{"type":"electronic","value":"1557-7309"}],"subject":[],"published":{"date-parts":[[2011,6]]},"assertion":[{"value":"2010-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-06-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}