{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:28:29Z","timestamp":1750307309369,"version":"3.41.0"},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2011,8,1]],"date-time":"2011-08-01T00:00:00Z","timestamp":1312156800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000149","name":"Division of Engineering Education and Centers","doi-asserted-by":"publisher","award":["EEC-0642422"],"award-info":[{"award-number":["EEC-0642422"]}],"id":[{"id":"10.13039\/100000149","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Reconfigurable Technol. Syst."],"published-print":{"date-parts":[[2011,8]]},"abstract":"<jats:p>Reconfigurable Computing (RC) has the potential to provide substantial performance benefits and yet simultaneously consume less power than traditional microprocessors or GPUs. While experimental performance analysis of RC applications has previously been shown crucial for achieving this potential, existing methods still require application designers to manually locate bottlenecks and determine appropriate optimizations, typically requiring significant designer expertise and effort. Worse, the diversity of platforms employed by RC applications further complicates the process of detecting bottlenecks and formulating optimizations. To address these shortcomings, we first discuss our platform-template system, which enables a performance analysis tool to perform more accurate bottleneck detection and achieve a higher degree of portability across diverse FPGA systems. 
We then provide details for our implementation of these concepts and techniques in the Reconfigurable Computing Application Performance (ReCAP) tool. Next, we present a taxonomy of common RC bottlenecks, providing associated detection and optimization strategies for each bottleneck, which we use to populate ReCAP's knowledge base for bottleneck detection. Finally, we demonstrate the utility of our approach via two application case studies across a total of three platforms.<\/jats:p>","DOI":"10.1145\/2000832.2000842","type":"journal-article","created":{"date-parts":[[2011,8,30]],"date-time":"2011-08-30T13:30:18Z","timestamp":1314711018000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Platform-aware bottleneck detection for reconfigurable computing applications"],"prefix":"10.1145","volume":"4","author":[{"given":"Seth","family":"Koehler","sequence":"first","affiliation":[{"name":"NSF Center for High Performance Reconfigurable Computing, University of Florida, Gainesville, FL"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Greg","family":"Stitt","sequence":"additional","affiliation":[{"name":"NSF Center for High Performance Reconfigurable Computing, University of Florida, Gainesville, FL"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alan D.","family":"George","sequence":"additional","affiliation":[{"name":"NSF Center for High Performance Reconfigurable Computing, University of Florida, Gainesville, FL"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2011,8,22]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1646461.1646464"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/215399.215427"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1095408.1095420"},{"key":"e_1_2_1_4_1","unstructured":"Bodenner R. 2010. 
Creating platform support packages. http:\/\/www.impulseaccelerated.com\/AppNotes\/APP109_PSP\/IATAPP109_PSP.pdf."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2010.62"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/SASP.2008.4570793"},{"volume-title":"Proceedings of the IEEE International Symposium on Parallel and Distributed Processing (IPDPS'08)","author":"Chung I.-H.","key":"e_1_2_1_7_1"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1366230.1366234"},{"key":"e_1_2_1_9_1","unstructured":"Cray. 2010. Cray XD1 datasheet. http:\/\/www.hpc.unm.edu\/%7Etlthomas\/buildout\/Cray_XD1_Datasheet.pdf."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1661438.1661443"},{"volume-title":"Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines. 13--23","author":"DeHon A.","key":"e_1_2_1_11_1"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1155\/ES\/2006\/56320"},{"key":"e_1_2_1_13_1","unstructured":"GiDEL. 2010. GiDEL PROCStar III PCIe x8\u2122 computation accelerator. http:\/\/www.gidel.com\/pdf\/PROCStarIII%20Product%20Brief.pdf."},{"volume-title":"Proceedings of the 9th Annual High-Performance Embedded Computing Workshop (HPEC'05)","author":"Haney R.","key":"e_1_2_1_14_1"},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Jorba J. Margalef T. and Luque E. 2008. Applied Parallel Computing. State of the Art in Scientific Computing. 
Springer (Chapter Search of Performance Inefficiencies in Message Passing Applications with KappaPI 2 Tool) 409--419.","DOI":"10.1007\/978-3-540-75755-9_50"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2008.01.008"},{"volume-title":"Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA).","author":"Koehler S.","key":"e_1_2_1_17_1"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1105734.1105737"},{"volume-title":"Proceedings of the 11th Annual High-Performance Embedded Computing Workshop (HPEC'07)","author":"McGraw-Herdeg M. P.","key":"e_1_2_1_19_1"},{"key":"e_1_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Mohr B. and Wolf F. 2003. Euro-Par 2003 Parallel Processing. Springer (Chapter KOJAK A Tool Set for Automatic Performance Analysis of Parallel Programs.) 1301--1304.","DOI":"10.1007\/978-3-540-45209-6_177"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2008.47"},{"key":"e_1_2_1_22_1","unstructured":"Nallatech. 2010. H101-PCIXM PCI-X FPGA accelerator card. http:\/\/www.nallatech.com\/PCI-Express-Cards\/h101-pcixm.html."},{"key":"e_1_2_1_23_1","unstructured":"OpenFPGA. 2010. OpenFPGA GenAPI version 0.4 draft for comment. 
http:\/\/www.openfpga.org\/Standards%20Documents\/OpenFPGA-GenAPIv0.4.pdf."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342010370953"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2009.5160938"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1008155020711"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2005.46"},{"volume-title":"Proceedings of the 8th International Europar Conference(EuroPar02)","author":"Truong H.-L.","key":"e_1_2_1_28_1"},{"key":"e_1_2_1_29_1","unstructured":"University of California at Riverside. 2010. ROCCC 2.0 user's manual\u2014Revision 0.5.1. http:\/\/roccc.cs.ucr.edu\/documentation\/files\/UserManual-0.5.1.pdf."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1862648.1862649"},{"volume-title":"Proceedings of the Reconfigurable Systems Summer Institute (RSSI).","author":"Williams J.","key":"e_1_2_1_31_1"},{"key":"e_1_2_1_32_1","unstructured":"XtremeData Inc. 2010. XD1000\u2122 development system. 
http:\/\/old.xtremedatainc.com\/index.php?option= com_content&view=article& id=109&Itemid=170."}],"container-title":["ACM Transactions on Reconfigurable Technology and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2000832.2000842","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2000832.2000842","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T11:00:03Z","timestamp":1750244403000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2000832.2000842"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,8]]},"references-count":32,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2011,8]]}},"alternative-id":["10.1145\/2000832.2000842"],"URL":"https:\/\/doi.org\/10.1145\/2000832.2000842","relation":{},"ISSN":["1936-7406","1936-7414"],"issn-type":[{"type":"print","value":"1936-7406"},{"type":"electronic","value":"1936-7414"}],"subject":[],"published":{"date-parts":[[2011,8]]},"assertion":[{"value":"2010-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-08-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}