{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:16:05Z","timestamp":1750306565978,"version":"3.41.0"},"reference-count":30,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2015,5,21]],"date-time":"2015-05-21T00:00:00Z","timestamp":1432166400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Science Foundation of China","doi-asserted-by":"crossref","award":["61100074"],"award-info":[{"award-number":["61100074"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"crossref","award":["2013QNA5008"],"award-info":[{"award-number":["2013QNA5008"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Science and Technology Major Project of China","award":["2012ZX01039-004"],"award-info":[{"award-number":["2012ZX01039-004"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2015,5,21]]},"abstract":"<jats:p>Communication frequency is increasing with the growing complexity of emerging embedded applications and the number of processors in the implemented multiprocessor SoC architectures. In this article, we consider the issue of communication cost reduction during multithreaded code generation from partitioned Simulink models to help designers in code optimization to improve system performance. 
We first propose a technique combining message aggregation and communication pipeline methods, which groups communications with the same destinations and sources and parallelizes communication and computation tasks. We also present a method to apply static analysis and dynamic emulation for efficient communication buffer allocation to further reduce synchronization cost and increase processor utilization. The existing cyclic dependency in the mapped model may hinder the effectiveness of the two techniques. We further propose a set of optimizations involving repartition with strongly connected threads to maximize the degree of communication reduction and preprocessing strategies with available delays in the model to reduce the number of communication channels that cannot be optimized. Experimental results demonstrate the advantages of the proposed optimizations with 11--143% throughput improvement.<\/jats:p>","DOI":"10.1145\/2644811","type":"journal-article","created":{"date-parts":[[2015,5,26]],"date-time":"2015-05-26T14:36:05Z","timestamp":1432650965000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Communication Optimizations for Multithreaded Code Generation from Simulink Models"],"prefix":"10.1145","volume":"14","author":[{"given":"Kai","family":"Huang","sequence":"first","affiliation":[{"name":"Department of ISEE, Zhejiang University, Hangzhou, China"}]},{"given":"Min","family":"Yu","sequence":"additional","affiliation":[{"name":"Department of ISEE, Zhejiang University, Hangzhou, China"}]},{"given":"Rongjie","family":"Yan","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Science, Institute of Software, Beijing, China"}]},{"given":"Xiaomeng","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of EE, Zhejiang University"}]},{"given":"Xiaolang","family":"Yan","sequence":"additional","affiliation":[{"name":"College of EE, 
Zhejiang University"}]},{"given":"Lisane","family":"Brisolara","sequence":"additional","affiliation":[{"name":"Universidade Federal de Pelotas, Pelotas, Brazil"}]},{"given":"Ahmed Amine","family":"Jerraya","sequence":"additional","affiliation":[{"name":"University Grenoble Alpes, CEA, LETI, MINATEC Campus, Grenoble, France"}]},{"given":"Jiong","family":"Feng","sequence":"additional","affiliation":[{"name":"Hangzhou C-SKY Micro-system Co. Ltd, Hangzhou, China"}]}],"member":"320","published-online":{"date-parts":[[2015,5,21]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.467577"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1269843.1269855"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2228360.2228597"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2011.2173941"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/HLDVT.2007.4392782"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1216919.1216934"},{"key":"e_1_2_1_8_1","unstructured":"C-SKY Inc. Homepage. Retrieved from http:\/\/www.c-sky.com."},{"volume-title":"dSPACE","key":"e_1_2_1_9_1","unstructured":"RTI-MP, dSPACE, Inc. 
Retrieved from http:\/\/www.dspaceinc.com\/ww\/en\/inc\/home\/products\/sw\/impsw\/rtimpblo.cfm."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1816011"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/996566.996636"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10617-007-9009-4"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.vlsi.2008.08.003"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1118299.1118509"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1146909.1147084"},{"volume-title":"Proceedings of the 2008 International Conference on Formal Methods in Computer-Aided Design (FMCAD\u201908)","author":"Hartel Pieter H.","key":"e_1_2_1_16_1","unstructured":"Pieter H. Hartel, Theo C. Ruys, and Marc C. W. Geilen. 2008. Scheduling optimisations for SPIN to minimise buffer requirements in synchronous data flow. In Proceedings of the 2008 International Conference on Formal Methods in Computer-Aided Design (FMCAD\u201908), Alessandro Cimatti and Robert B. Jones (Eds.). IEEE Press, Piscataway, NJ, Article 21, 10 pages."},{"volume-title":"The Spin Model Checker: Primer and Reference Manual","author":"Holzmann Gerard","key":"e_1_2_1_17_1","unstructured":"Gerard Holzmann. 2003. The Spin Model Checker: Primer and Reference Manual (First ed.). 
Addison-Wesley Professional."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2146417.2146425"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1278480.1278491"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of World Computer Congress-IFIP","author":"Kahn Gilles","year":"1976","unstructured":"Gilles Kahn and David MacQueen. 1976. Coroutines and networks of parallel processors. In Proceedings of World Computer Congress-IFIP (1977), Toronto, Canada, 993--998."},{"key":"e_1_2_1_21_1","volume-title":"Parks","author":"Lee Edward A.","year":"2001","unstructured":"Edward A. Lee and Thomas M. Parks. 2001. Dataflow process networks. In Readings in Hardware\/Software Co-Design, Giovanni De Micheli, Rolf Ernst, and Wayne Wolf (Eds.). Kluwer Academic Publishers, Norwell, MA, 59--85."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1629435.1629445"},{"key":"e_1_2_1_23_1","unstructured":"Simulink Mathworks. Retrieved from http:\/\/www.mathworks.com."},{"volume-title":"workshop, Mathworks.","key":"e_1_2_1_24_1","unstructured":"Real-time workshop, Mathworks. Retrieved from http:\/\/www.mathworks.com."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1353610.1353627"},{"volume-title":"Object Management Group","author":"UML","key":"e_1_2_1_26_1","unstructured":"UML, Object Management Group, Inc. 
http:\/\/www.uml.org\/."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 16th Asia and South Pacific Design Automation Conference (ASPDAC\u201911)","author":"Oh Hyunok","year":"2011","unstructured":"Tae-ho Shin, Hyunok Oh, and Soonhoi Ha. 2011. Minimizing buffer requirements for throughput constrained parallel execution of synchronous dataflow graph. In Proceedings of the 16th Asia and South Pacific Design Automation Conference (ASPDAC\u201911). IEEE Press, Piscataway, NJ, 165--170."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1146909.1147138"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/SWAT.1971.10"},{"key":"e_1_2_1_30_1","unstructured":"V6 TAI Logic Module S2C Inc. 
http:\/\/www.s2cinc.com\/product\/HardWare\/V6TAILogicModule.htm."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1278480.1278681"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2644811","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2644811","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:13:12Z","timestamp":1750227192000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2644811"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,5,21]]},"references-count":30,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2015,5,21]]}},"alternative-id":["10.1145\/2644811"],"URL":"https:\/\/doi.org\/10.1145\/2644811","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2015,5,21]]},"assertion":[{"value":"2013-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-05-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}