{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T23:01:40Z","timestamp":1777676500644,"version":"3.51.4"},"reference-count":41,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2012,12,28]],"date-time":"2012-12-28T00:00:00Z","timestamp":1356652800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2013,5]]},"abstract":"<jats:p>Hybrid-core systems speedup applications by offloading certain compute operations that can run faster on hardware accelerators. However, such systems require significant programming and porting effort to gain a performance benefit from the accelerators. Therefore, prior to porting it is prudent to investigate the predicted performance benefit of accelerators for a given workload. To address this problem we present a performance-modeling framework that predicts the application performance rapidly and accurately for hybrid-core systems. We present predictions for two full-scale HPC applications\u2014HYCOM and Milc. Our results for two accelerators (GPU and FPGA) show that gather\/scatter and stream operations can speedup by as much as a factor of 15 and overall compute time of Milc and HYCOM improve by 3.4% and 20%, respectively. 
We also show that in order to benefit from the accelerators, 70% of the latency of data transfer time between the CPU and the accelerators needs to be overcome.<\/jats:p>","DOI":"10.1177\/1094342012468180","type":"journal-article","created":{"date-parts":[[2012,12,28]],"date-time":"2012-12-28T21:00:12Z","timestamp":1356728412000},"page":"89-108","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":29,"title":["Modeling and predicting performance of high performance computing applications on hardware accelerators"],"prefix":"10.1177","volume":"27","author":[{"given":"Mitesh R.","family":"Meswani","sequence":"first","affiliation":[{"name":"San Diego Supercomputer Center, La Jolla, CA, USA"}]},{"given":"Laura","family":"Carrington","sequence":"additional","affiliation":[{"name":"San Diego Supercomputer Center, La Jolla, CA, USA"}]},{"given":"Didem","family":"Unat","sequence":"additional","affiliation":[{"name":"University of California at San Diego, San Diego, CA, USA"}]},{"given":"Allan","family":"Snavely","sequence":"additional","affiliation":[{"name":"San Diego Supercomputer Center, La Jolla, CA, USA"}]},{"given":"Scott","family":"Baden","sequence":"additional","affiliation":[{"name":"University of California at San Diego, San Diego, CA, USA"}]},{"given":"Stephen","family":"Poole","sequence":"additional","affiliation":[{"name":"Oak Ridge National Laboratory, Oak Ridge, TN, USA"}]}],"member":"179","published-online":{"date-parts":[[2012,12,28]]},"reference":[{"key":"bibr1-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-75444-2_64"},{"key":"bibr2-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1997.1346"},{"key":"bibr3-1094342012468180","unstructured":"Asanovic K, Bodik R, Catanzaro BC, (2006) The landscape of parallel computing research: a view from Berkeley. 
Technical Report UCB\/EECS-2006-183, EECS Department, University of California, Berkeley."},{"issue":"3","key":"bibr4-1094342012468180","first-page":"66","volume":"5","author":"Bailey DH","year":"1991","journal-title":"International Journal of Supercomputer Applications"},{"key":"bibr5-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2009.4919648"},{"key":"bibr6-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2010.135"},{"key":"bibr7-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2010.36"},{"key":"bibr8-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"bibr9-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1145\/268806.268810"},{"key":"bibr10-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2005.33"},{"key":"bibr11-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1145\/1995896.1995928"},{"key":"bibr12-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1145\/240455.240477"},{"key":"bibr13-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1137\/S009753979427491"},{"key":"bibr14-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1145\/1188455.1188549"},{"key":"bibr15-1094342012468180","unstructured":"Graph500 (2012) Brief introduction to Graph500. Available at: www.graph500.org."},{"key":"bibr16-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2011.5762730"},{"key":"bibr17-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2011.21"},{"key":"bibr18-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555775"},{"key":"bibr19-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/2.982915"},{"key":"bibr20-1094342012468180","unstructured":"HYCOM. (2012) HYCOM. 
Available at: www.hycom.org."},{"key":"bibr21-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1049\/ip-sen:20030808"},{"key":"bibr22-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2010.5452024"},{"key":"bibr23-1094342012468180","doi-asserted-by":"crossref","unstructured":"Luszczek P, Dongarra J, Koester D, (2005) Introduction to the HPC Challenge Benchmark Suite. Available at: http:\/\/icl.cs.utk.edu\/hpcc\/pubs.","DOI":"10.21236\/ADA439315"},{"key":"bibr24-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/2.982916"},{"key":"bibr25-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.1998.727287"},{"key":"bibr26-1094342012468180","unstructured":"Milc\u2014The MIMD Lattice Computation (MILC) Collaboration (2012) Available at: www.physics.utah.edu\/~detar\/milc\/."},{"key":"bibr28-1094342012468180","unstructured":"NVIDIA (2009) NVIDIA\u2019s next generation CUDA compute architecture: Fermi. 
Available at: www.nvidia.com\/object\/fermi_architecture.html."},{"key":"bibr29-1094342012468180","volume-title":"Proceedings of the first international workshop on parallel software tools and tool infrastructures (PSTI)","author":"Olschanowsky C","year":"2010"},{"key":"bibr30-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1145\/781027.781076"},{"key":"bibr31-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/CISIS.2007.49"},{"key":"bibr32-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/12.467697"},{"key":"bibr33-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1145\/235543.235545"},{"key":"bibr34-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2002.10004"},{"key":"bibr35-1094342012468180","volume-title":"Computer System Performance Measurement and Evaluation Methods: Analysis and Applications","author":"Svobodova L","year":"1976"},{"key":"bibr36-1094342012468180","volume-title":"Proceeding of the 2007 ACM\/IEEE Conference on High Performance Networking and Computing (SC\u201907)","author":"Tikir MM","year":"2007"},{"key":"bibr37-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-03869-3_16"},{"key":"bibr38-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2002.1176256"},{"key":"bibr39-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/AINA.2006.68"},{"key":"bibr40-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1145\/1498765.1498785"},{"key":"bibr41-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1996.0151"},{"key":"bibr42-1094342012468180","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2006.44"}],"container-title":["The International Journal of High Performance Computing 
Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342012468180","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342012468180","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342012468180","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:19:10Z","timestamp":1777450750000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342012468180"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,12,28]]},"references-count":41,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2013,5]]}},"alternative-id":["10.1177\/1094342012468180"],"URL":"https:\/\/doi.org\/10.1177\/1094342012468180","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,12,28]]}}}