{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:21:37Z","timestamp":1750306897997,"version":"3.41.0"},"reference-count":35,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2013,3,10]],"date-time":"2013-03-10T00:00:00Z","timestamp":1362873600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2013,3,10]]},"abstract":"<jats:p>When implementing digital signal processing (DSP) applications onto multiprocessor systems, one significant problem in the viewpoints of performance is the memory wall. In this paper, to help alleviate the memory wall problem, we propose a novel, high-performance buffer mapping policy for SDF-represented DSP applications on bus-based multiprocessor systems that support the shared-memory programming model. The proposed policy exploits the bank concurrency of the DRAM main memory system according to the analysis of hierarchical parallelism. Energy consumption is also a critical parameter, especially in battery-based embedded computing systems. In this paper, we apply a synchronization back-off scheme on the top of the proposed high-performance buffer mapping policy to reduce energy consumption. The energy saving is attained by minimizing the number of non-essential synchronization transactions. We measure throughput and energy consumption on both synthetic and real benchmarks. 
The simulation results show that the proposed buffer mapping policy is very useful in terms of performance, especially in memory-intensive applications where the total execution time of computational tasks is relatively small compared to that of memory operations. In addition, the proposed synchronization back-off scheme provides a reduction in the number of synchronization transactions without degrading performance, which results in system energy saving.<\/jats:p>","DOI":"10.1145\/2442116.2442132","type":"journal-article","created":{"date-parts":[[2013,4,9]],"date-time":"2013-04-09T12:17:58Z","timestamp":1365509878000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["High-performance and low-energy buffer mapping method for multiprocessor DSP systems"],"prefix":"10.1145","volume":"12","author":[{"given":"Dongwon","family":"Lee","sequence":"first","affiliation":[{"name":"Georgia Institute of Technology, Atlanta, Georgia"}]},{"given":"Marilyn","family":"Wolf","sequence":"additional","affiliation":[{"name":"Georgia Institute of Technology, Atlanta, Georgia"}]},{"given":"Shuvra S.","family":"Bhattacharyya","sequence":"additional","affiliation":[{"name":"University of Maryland, College Park, MD"}]}],"member":"320","published-online":{"date-parts":[[2013,4,8]]},"reference":[{"volume-title":"Proceedings of the International Workshop on Rapid System Prototyping. 108--123","author":"Ade M.","key":"e_1_2_1_1_1","unstructured":"Ade , M. , Lauwereins , R. , and Peperstraete , J. A . 1994. Buffer memory requirements in DSP applications . In Proceedings of the International Workshop on Rapid System Prototyping. 108--123 . Ade, M., Lauwereins, R., and Peperstraete, J. A. 1994. Buffer memory requirements in DSP applications. In Proceedings of the International Workshop on Rapid System Prototyping. 
108--123."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/266021.266036"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.982917"},{"volume-title":"Proceedings of the International Workshop on Rapid System Prototyping. 194--200","author":"Bhattacharyya S. S.","key":"e_1_2_1_4_1","unstructured":"Bhattacharyya , S. S. , Murthy , P. K. , and Lee , E. A . 1995. Converting graphical DSP programs into memory-constrained software prototypes . In Proceedings of the International Workshop on Rapid System Prototyping. 194--200 . Bhattacharyya, S. S., Murthy, P. K., and Lee, E. A. 1995. Converting graphical DSP programs into memory-constrained software prototypes. In Proceedings of the International Workshop on Rapid System Prototyping. 194--200."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/78.600002"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/4.126534"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2010.43"},{"volume-title":"Proceedings of the International Conference on Parallel Architectures and Compilation Techniques. 68--77","author":"Corbal J.","key":"e_1_2_1_8_1","unstructured":"Corbal , J. , Espasa , R. , and Valero , M . 1998. Command vector memory systems: High performance at low cost . In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques. 68--77 . Corbal, J., Espasa, R., and Valero, M. 1998. Command vector memory systems: High performance at low cost. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques. 68--77."},{"volume-title":"Proceedings of the 6th International Workshop on CODES\/CASHE. 97--101","author":"Dick R. P.","key":"e_1_2_1_9_1","unstructured":"Dick , R. P. , Rhodes , D. L. , and Wolf , W . 1998. Tgff: Task graphs for free hardware\/software codesign . In Proceedings of the 6th International Workshop on CODES\/CASHE. 97--101 . Dick, R. 
P., Rhodes, D. L., and Wolf, W. 1998. Tgff: Task graphs for free hardware\/software codesign. In Proceedings of the 6th International Workshop on CODES\/CASHE. 97--101."},{"volume-title":"Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. 191--195","author":"Esteban D.","key":"e_1_2_1_10_1","unstructured":"Esteban , D. and Galand , C . 1977. Application of quadrature mirror filter to split-band voice coding schemes . In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. 191--195 . Esteban, D. and Galand, C. 1977. Application of quadrature mirror filter to split-band voice coding schemes. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. 191--195."},{"key":"e_1_2_1_11_1","unstructured":"Garey M. R. and Johnson D. S. 1999. Computers and Intractability: A Guide to the Theory of NP-completeness. W. H. Freeman and Company New York NY.  Garey M. R. and Johnson D. S. 1999. Computers and Intractability: A Guide to the Theory of NP-completeness. W. H. Freeman and Company New York NY."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACSD.2006.33"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.2307\/3680062"},{"volume-title":"Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation. 136--143","author":"Kee H.","key":"e_1_2_1_14_1","unstructured":"Kee , H. , Bhattacharyya , S. S. , and Kornerup , J . 2010. Efficient static buffering to guarantee throughput-optimal FPGA implementation of synchronous dataflow graphs . In Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation. 136--143 . Kee, H., Bhattacharyya, S. S., and Kornerup, J. 2010. Efficient static buffering to guarantee throughput-optimal FPGA implementation of synchronous dataflow graphs. 
In Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation. 136--143."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2008.136"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/RSP.2009.34"},{"volume-title":"Proceedings of the Global Telecommunications Conference. 1279--1283","author":"Lee E. A.","key":"e_1_2_1_17_1","unstructured":"Lee , E. A. and Ha , S . 1989. Scheduling strategies for multiprocessor real-time DSP . In Proceedings of the Global Telecommunications Conference. 1279--1283 . Lee, E. A. and Ha, S. 1989. Scheduling strategies for multiprocessor real-time DSP. In Proceedings of the Global Telecommunications Conference. 1279--1283."},{"volume-title":"Proceedings of the IEEE. 1235--1245","author":"Lee E. A.","key":"e_1_2_1_18_1","unstructured":"Lee , E. A. and Messerschmitt , D. G . 1987. Synchronous dataflow . Proceedings of the IEEE. 1235--1245 . Lee, E. A. and Messerschmitt, D. G. 1987. Synchronous dataflow. Proceedings of the IEEE. 1235--1245."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/240518.240523"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1105734.1105747"},{"volume-title":"Proceedings of the International Symposium on High-Performance Computer Architecture. 39--48","author":"Mathew B. K.","key":"e_1_2_1_21_1","unstructured":"Mathew , B. K. , McKee , S. A. , Carter , J. B. , and Davis , A . 2000. A design of a parallel vector access unit for sdram memory systems . In Proceedings of the International Symposium on High-Performance Computer Architecture. 39--48 . Mathew, B. K., McKee, S. A., Carter, J. B., and Davis, A. 2000. A design of a parallel vector access unit for sdram memory systems. In Proceedings of the International Symposium on High-Performance Computer Architecture. 
39--48."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/645604.662604"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.917539"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/989995.989999"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/1299042.1299052"},{"volume-title":"Proceedings of Supercomputing. 49--58","author":"Raghavan R.","key":"e_1_2_1_26_1","unstructured":"Raghavan , R. and Hayes , J . 1990. On randomly interleaved memories . In Proceedings of Supercomputing. 49--58 . Raghavan, R. and Hayes, J. 1990. On randomly interleaved memories. In Proceedings of Supercomputing. 49--58."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2004.22"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339668"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/1550904"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1146909.1147138"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.1984.1052168"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.42"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1105734.1105748"},{"volume-title":"Modern VLSI Design: System-on-Chip Design","author":"Wolf W.","key":"e_1_2_1_35_1","unstructured":"Wolf , W. 2002. Modern VLSI Design: System-on-Chip Design . Prentice Hall , Upper Saddle River, NJ. Wolf, W. 2002. Modern VLSI Design: System-on-Chip Design. Prentice Hall, Upper Saddle River, NJ."},{"volume-title":"Proceedings of the Design, Automation and Test in Europe Conference and Exhibition. 1506--1511","author":"Zhu J.","key":"e_1_2_1_36_1","unstructured":"Zhu , J. , Sander , I. , and Jantsch , A . 2009. Buffer minimization of real-time streaming applications scheduling on hybrid CPU\/FPGA architectures . In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition. 
1506--1511 . Zhu, J., Sander, I., and Jantsch, A. 2009. Buffer minimization of real-time streaming applications scheduling on hybrid CPU\/FPGA architectures. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition. 1506--1511."}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2442116.2442132","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2442116.2442132","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T08:19:06Z","timestamp":1750234746000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2442116.2442132"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,3,10]]},"references-count":35,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2013,3,10]]}},"alternative-id":["10.1145\/2442116.2442132"],"URL":"https:\/\/doi.org\/10.1145\/2442116.2442132","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2013,3,10]]},"assertion":[{"value":"2010-04-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-04-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}