{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:22:29Z","timestamp":1750220549068,"version":"3.41.0"},"reference-count":28,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2021,2,9]],"date-time":"2021-02-09T00:00:00Z","timestamp":1612828800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2021,6,30]]},"abstract":"<jats:p>Multi-core systems are now found in many electronic devices. But does current software design fully leverage their capabilities? The complexity of the hardware and software stacks in these platforms requires software optimization with end-to-end knowledge of the system.<\/jats:p>\n          <jats:p>To optimize software performance, we must have accurate information about system behavior and time losses. Standard monitoring engines impose tradeoffs on profiling tools, making it impossible to reconcile all the expected requirements: accurate hardware views, fine-grain measurements, speed, and so on. Subsequently, new approaches have to be examined.<\/jats:p>\n          <jats:p>In this article, we propose a non-intrusive, accurate tool chain, which can reveal and quantify slowdowns in low-level software mechanisms. Based on emulation, this tool chain extracts behavioral information (time, contention) through hardware side channels, without distorting the software execution flow. This tool consists of two parts. (1) An online acquisition part that dumps hardware platform signals. (2) An offline processing part that consolidates meaningful behavioral information from the dumped data. 
Using our tool chain, we studied and propose optimizations to MultiProcessor System on Chip (MPSoC) support in the Linux kernel, saving about 60% of the time required for the release phase of the GNU OpenMP synchronization barrier when running on a 64-core MPSoC.<\/jats:p>","DOI":"10.1145\/3445030","type":"journal-article","created":{"date-parts":[[2021,2,10]],"date-time":"2021-02-10T14:29:54Z","timestamp":1612967394000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["A Non-Intrusive Tool Chain to Optimize MPSoC End-to-End Systems"],"prefix":"10.1145","volume":"18","author":[{"given":"Maxime","family":"France-Pillois","sequence":"first","affiliation":[{"name":"Univ. Grenoble Alpes, CEA, LETI, MINATEC Campus, France, Grenoble, France"}]},{"given":"J\u00e9r\u00f4me","family":"Martin","sequence":"additional","affiliation":[{"name":"Univ. Grenoble Alpes, CEA, LETI, MINATEC Campus, France, Grenoble, France"}]},{"given":"Fr\u00e9d\u00e9ric","family":"Rousseau","sequence":"additional","affiliation":[{"name":"Univ. Grenoble Alpes, CNRS, Grenoble INP, TIMA, France, Grenoble, France"}]}],"member":"320","published-online":{"date-parts":[[2021,2,9]]},"reference":[{"volume-title":"Proceedings of the 2015 10th International Design Test Symposium (IDT). 14--19","author":"AbdElSalam M.","key":"e_1_2_1_1_1","unstructured":"M. AbdElSalam and A. Salem . 2015. SoC verification platforms using HW emulation and co-modeling Testbench technologies . In Proceedings of the 2015 10th International Design Test Symposium (IDT). 14--19 . M. AbdElSalam and A. Salem. 2015. SoC verification platforms using HW emulation and co-modeling Testbench technologies. In Proceedings of the 2015 10th International Design Test Symposium (IDT). 14--19."},{"key":"e_1_2_1_2_1","first-page":"3","article-title":"The Nas parallel benchmarks","volume":"5","author":"Bailey D. H.","year":"1991","unstructured":"D. H. Bailey , E. 
Barszcz , J. T. Barton , D. S. Browning , R. L. Carter , L. Dagum , R. A. Fatoohi , P. O. Frederickson , T. A. Lasinski , R. S. Schreiber , and 1991 . The Nas parallel benchmarks . Int. J. High Perform. Comput. Appl. 5 , 3 (Sept. 1991), 63--73. D. H. Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter, L. Dagum, R. A. Fatoohi, P. O. Frederickson, T. A. Lasinski, R. S. Schreiber, and et al.1991. The Nas parallel benchmarks. Int. J. High Perform. Comput. Appl. 5, 3 (Sept. 1991), 63--73.","journal-title":"Int. J. High Perform. Comput. Appl."},{"volume-title":"Proceedings of the 2012 IEEE\/ACM International Conference on Computer-Aided Design (ICCAD). 115--122","author":"Banerjee S.","key":"e_1_2_1_3_1","unstructured":"S. Banerjee and T. Gupta . 2012. Fast and scalable hybrid functional verification and debug with dynamically reconfigurable co-simulation . In Proceedings of the 2012 IEEE\/ACM International Conference on Computer-Aided Design (ICCAD). 115--122 . S. Banerjee and T. Gupta. 2012. Fast and scalable hybrid functional verification and debug with dynamically reconfigurable co-simulation. In Proceedings of the 2012 IEEE\/ACM International Conference on Computer-Aided Design (ICCAD). 115--122."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"volume-title":"Proceedings of the 2007 Internatonal Conference on Microelectronics. 101--104","author":"Buchmann R.","key":"e_1_2_1_5_1","unstructured":"R. Buchmann and A. Greiner . 2007. A fully static scheduling approach for fast cycle accurate systemC simulation of MPSoCs . In Proceedings of the 2007 Internatonal Conference on Microelectronics. 101--104 . R. Buchmann and A. Greiner. 2007. A fully static scheduling approach for fast cycle accurate systemC simulation of MPSoCs. In Proceedings of the 2007 Internatonal Conference on Microelectronics. 
101--104."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCSoC.2016.20"},{"key":"e_1_2_1_7_1","unstructured":"Cadence. Cadence Emulation Platform. Retrieved from https:\/\/www.cadence.com\/en_US\/home\/tools\/system-design-and-verification\/acceleration-and-emulation\/palladium-z1.html.  Cadence. Cadence Emulation Platform. Retrieved from https:\/\/www.cadence.com\/en_US\/home\/tools\/system-design-and-verification\/acceleration-and-emulation\/palladium-z1.html."},{"volume-title":"Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS). 1112--1121","author":"France-Pillois M.","key":"e_1_2_1_8_1","unstructured":"M. France-Pillois , J. Martin , and F. Rousseau . 2020. Implementation and evaluation of a hardware decentralized synchronization lock for MPSoCs . In Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS). 1112--1121 . M. France-Pillois, J. Martin, and F. Rousseau. 2020. Implementation and evaluation of a hardware decentralized synchronization lock for MPSoCs. In Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS). 1112--1121."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1049\/iet-cdt.2018.5136"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1629435.1629446"},{"volume-title":"Proceedings of the 2016 IEEE International Conference on Advances in Electronics, Communication and Computer Technology (ICAECCT). 423--428","author":"Gopikrishna S.","key":"e_1_2_1_11_1","unstructured":"S. Gopikrishna , M. Jha , S. Sreekanth , and G. Savithri . 2016. A multiprocessor system on chip verification on hardware accelerator and Software Emulation . In Proceedings of the 2016 IEEE International Conference on Advances in Electronics, Communication and Computer Technology (ICAECCT). 423--428 . S. Gopikrishna, M. Jha, S. Sreekanth, and G. Savithri. 2016. 
A multiprocessor system on chip verification on hardware accelerator and Software Emulation. In Proceedings of the 2016 IEEE International Conference on Advances in Electronics, Communication and Computer Technology (ICAECCT). 423--428."},{"key":"e_1_2_1_12_1","unstructured":"Mentor Graphics. Codelink. Retrieved from https:\/\/www.mentor.com\/products\/fv\/codelink\/.  Mentor Graphics. Codelink. Retrieved from https:\/\/www.mentor.com\/products\/fv\/codelink\/."},{"key":"e_1_2_1_13_1","unstructured":"Mentor Graphics. Veloce Emulation Platform. Retrieved from https:\/\/www.mentor.com\/products\/fv\/emulation-systems\/.  Mentor Graphics. Veloce Emulation Platform. Retrieved from https:\/\/www.mentor.com\/products\/fv\/emulation-systems\/."},{"key":"e_1_2_1_14_1","unstructured":"On-Chip Bus Development Working Group. 2001. VSI Alliance Virtual Component Interface Standard Version 2 (OCB 2 2.0). Retrieved from http:\/\/home.mit.bme.hu\/ feher\/MSC_RA\/VCI\/VCI.pdf.  On-Chip Bus Development Working Group. 2001. VSI Alliance Virtual Component Interface Standard Version 2 (OCB 2 2.0). Retrieved from http:\/\/home.mit.bme.hu\/ feher\/MSC_RA\/VCI\/VCI.pdf."},{"volume-title":"Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 13--22","author":"Gutierrez A.","key":"e_1_2_1_15_1","unstructured":"A. Gutierrez , J. Pusdesris , R. G. Dreslinski , T. Mudge , C. Sudanthi , C. D. Emmons , M. Hayenga , and N. Paver . 2014. Sources of error in full-system simulation . In Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 13--22 . A. Gutierrez, J. Pusdesris, R. G. Dreslinski, T. Mudge, C. Sudanthi, C. D. Emmons, M. Hayenga, and N. Paver. 2014. Sources of error in full-system simulation. In Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 
13--22."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2018.00014"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373376.3378455"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3295816.3295821"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/DATE.2011.5763121"},{"volume-title":"Proceedings of the 2012 Design, Automation and Test in Europe Conference 8 Exhibition (DATE). IEEE, Dresden, 685--690","author":"Leupers R.","key":"e_1_2_1_20_1","unstructured":"R. Leupers , Frank Schirrmeister , Grant Martin , Tim Kogel , Roman Plyaskin , Andreas Herkersdorf , and M. Vaupel . 2012. Virtual platforms: Breaking new grounds . In Proceedings of the 2012 Design, Automation and Test in Europe Conference 8 Exhibition (DATE). IEEE, Dresden, 685--690 . R. Leupers, Frank Schirrmeister, Grant Martin, Tim Kogel, Roman Plyaskin, Andreas Herkersdorf, and M. Vaupel. 2012. Virtual platforms: Breaking new grounds. In Proceedings of the 2012 Design, Automation and Test in Europe Conference 8 Exhibition (DATE). IEEE, Dresden, 685--690."},{"key":"e_1_2_1_21_1","unstructured":"Lip6. InterconnexionNetworks TSAR. Retrieved from https:\/\/www-soc.lip6.fr\/trac\/tsar\/wiki\/InterconnexionNetworks.  Lip6. InterconnexionNetworks TSAR. Retrieved from https:\/\/www-soc.lip6.fr\/trac\/tsar\/wiki\/InterconnexionNetworks."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/VDAT.2006.258142"},{"volume-title":"Proceedings of the 41st Design Automation Conference (DAC\u201904)","author":"Ohba N.","key":"e_1_2_1_23_1","unstructured":"N. Ohba and K. Takano . 2004. An SoC design methodology using FPGAs and embedded microprocessors . In Proceedings of the 41st Design Automation Conference (DAC\u201904) . 747--752. N. Ohba and K. Takano. 2004. An SoC design methodology using FPGAs and embedded microprocessors. In Proceedings of the 41st Design Automation Conference (DAC\u201904). 
747--752."},{"volume-title":"Proceedings of the 2016 International Conference on Hardware\/Software Codesign and System Synthesis (CODES+ISSS). 1--10","author":"Saboori E.","key":"e_1_2_1_24_1","unstructured":"E. Saboori and S. Abdi . 2016. Fast and cycle-accurate simulation of multi-threaded applications on SMP architectures using hybrid prototyping . In Proceedings of the 2016 International Conference on Hardware\/Software Codesign and System Synthesis (CODES+ISSS). 1--10 . E. Saboori and S. Abdi. 2016. Fast and cycle-accurate simulation of multi-threaded applications on SMP architectures using hybrid prototyping. In Proceedings of the 2016 International Conference on Hardware\/Software Codesign and System Synthesis (CODES+ISSS). 1--10."},{"key":"e_1_2_1_25_1","unstructured":"Synopsys. Synopsys Emulation Platform. Retrieved from https:\/\/www.synopsys.com\/verification\/emulation.html.  Synopsys. Synopsys Emulation Platform. Retrieved from https:\/\/www.synopsys.com\/verification\/emulation.html."},{"key":"e_1_2_1_26_1","unstructured":"Lawrence Vivolo. 2013. Transaction-based Verification and Emulation Combine for Multi-megahertz Verification Performance. Retrieved from http:\/\/www.electronicdesign.com\/eda\/transaction-based-verification-and-emulation-combine-multi-megahertz-verification-performance.  Lawrence Vivolo. 2013. Transaction-based Verification and Emulation Combine for Multi-megahertz Verification Performance. Retrieved from http:\/\/www.electronicdesign.com\/eda\/transaction-based-verification-and-emulation-combine-multi-megahertz-verification-performance."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 2000 IEEE International Test Conference (ITC\u201900)","author":"Yang Zan","year":"2000","unstructured":"Zan Yang , Byeong Min , and Gwan Choi . 2000 . Si-emulation: System verification using simulation and emulation . In Proceedings of the 2000 IEEE International Test Conference (ITC\u201900) . IEEE Computer Society, 160. 
Zan Yang, Byeong Min, and Gwan Choi. 2000. Si-emulation: System verification using simulation and emulation. In Proceedings of the 2000 IEEE International Test Conference (ITC\u201900). IEEE Computer Society, 160."},{"volume-title":"Proceedings of the 2016 IEEE Conference on Communications and Network Security (CNS). 73--81","author":"Zheng Chengyu","key":"e_1_2_1_28_1","unstructured":"Chengyu Zheng , M. D. Preda , J. Granjal , S. Zanero , and F. Maggi . 2016. On-chip system call tracing: A feasibility study and open prototype . In Proceedings of the 2016 IEEE Conference on Communications and Network Security (CNS). 73--81 . Chengyu Zheng, M. D. Preda, J. Granjal, S. Zanero, and F. Maggi. 2016. On-chip system call tracing: A feasibility study and open prototype. In Proceedings of the 2016 IEEE Conference on Communications and Network Security (CNS). 73--81."}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3445030","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3445030","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:28:13Z","timestamp":1750195693000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3445030"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,9]]},"references-count":28,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,6,30]]}},"alternative-id":["10.1145\/3445030"],"URL":"https:\/\/doi.org\/10.1145\/3445030","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2021,2,9]]},"assertion":[{"value":"2020-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-12-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-02-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}