{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,11]],"date-time":"2025-02-11T20:40:21Z","timestamp":1739306421214,"version":"3.37.0"},"publisher-location":"Berlin, Heidelberg","reference-count":21,"publisher":"Springer Berlin Heidelberg","isbn-type":[{"type":"print","value":"9783642036439"},{"type":"electronic","value":"9783642036446"}],"license":[{"start":{"date-parts":[[2009,1,1]],"date-time":"2009-01-01T00:00:00Z","timestamp":1230768000000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009]]},"DOI":"10.1007\/978-3-642-03644-6_12","type":"book-chapter","created":{"date-parts":[[2009,8,21]],"date-time":"2009-08-21T09:16:51Z","timestamp":1250846211000},"page":"150-164","source":"Crossref","is-referenced-by-count":1,"title":["Performance Optimization Strategies of High Performance Computing on GPU"],"prefix":"10.1007","author":[{"given":"Anguo","family":"Ma","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jing","family":"Cai","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yu","family":"Cheng","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoqiang","family":"Ni","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuxing","family":"Tang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zuocheng","family":"Xing","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","reference":[{"key":"12_CR1","doi-asserted-by":"crossref","unstructured":"Ghuloum, A., Sprangle, E., Fang, J., Wu, G., Zhou, X.: Ct: A Flexible Parallel Programming 
Model for Tera-scale Architectures. Technical report, Intel Research (2007)","DOI":"10.1145\/1362702.1362707"},{"key":"12_CR2","unstructured":"Gutowitz, H.: A tutorial introduction to Swarm. Technical report, The Santa Fe Institute (1993)"},{"key":"12_CR3","unstructured":"Monteyne, M.: RapidMind: Multi-Core Development Platform, RapidMind Official Page (2007), http:\/\/www.rapidmind.net\/"},{"key":"12_CR4","doi-asserted-by":"publisher","first-page":"803","DOI":"10.1002\/cpe.728","volume":"15","author":"J.J. Dongarra","year":"2003","unstructured":"Dongarra, J.J., Luszczek, P., Petitet, A.: The LINPACK Benchmark: Past, Present, and Future. Concurrency and Computation: Practice and Experience\u00a015, 803\u2013820 (2003)","journal-title":"Concurrency and Computation: Practice and Experience"},{"key":"12_CR5","unstructured":"http:\/\/www.netlib.org\/benchmark\/hpl\/index.html"},{"key":"12_CR6","unstructured":"Halfhill, T.R.: Parallel Processing With CUDA. Microprocessor Report (January 2008)"},{"key":"12_CR7","unstructured":"Stone, J.: Accelerating Computational Biology by 100x with CUDA. In: NVISION (2008) (presentation)"},{"key":"12_CR8","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1145\/1375527.1375533","volume-title":"ICS 2008: Proceedings of the 22nd annual international conference on Supercomputing","author":"T.D.R. Hartley","year":"2008","unstructured":"Hartley, T.D.R., Catalyurek, U., Ruiz, A., Igual, F., Mayo, R., Ujaldon, M.: Biomedical image analysis on a cooperative cluster of gpus and multicores. In: ICS 2008: Proceedings of the 22nd annual international conference on Supercomputing, pp. 15\u201325. ACM, New York (2008)"},{"key":"12_CR9","unstructured":"Bond, A.: Havok FX: GPU-accelerated physics for PC games. 
In: Proceedings of Game Developers Conference 2006 (2006)"},{"key":"12_CR10","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"220","DOI":"10.1007\/11758549_34","volume-title":"Computational Science \u2013 ICCS 2006","author":"T.R. Hagen","year":"2006","unstructured":"Hagen, T.R., Lie, K.-A., Natvig, J.R.: Solving the Euler equations on graphics processing units. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2006. LNCS, vol.\u00a03994, pp. 220\u2013227. Springer, Heidelberg (2006)"},{"key":"12_CR11","doi-asserted-by":"crossref","unstructured":"Zeller, C.: Cloth simulation on the GPU. In: ACM SIGGRAPH 2005 Conference Abstracts and Applications (2005)","DOI":"10.1145\/1187112.1187158"},{"key":"12_CR12","doi-asserted-by":"crossref","unstructured":"Elsen, E., Houston, M., Vishal, V., Darve, E., Hanrahan, P., Pande, V.S.: N-Body simulation on GPUs. In: Proc. 2006 ACM\/IEEE Conf. on Supercomputing, p. 188 (2006)","DOI":"10.1145\/1188455.1188649"},{"key":"12_CR13","doi-asserted-by":"publisher","first-page":"1781","DOI":"10.1002\/jcc.20289","volume":"26","author":"J.C. Phillips","year":"2005","unstructured":"Phillips, J.C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., Chipot, C., Skeel, R.D., Kale, L., Schulten, K.: Scalable molecular dynamics with NAMD. J. Comp. Chem.\u00a026, 1781\u20131802 (2005)","journal-title":"J. Comp. Chem."},{"key":"12_CR14","doi-asserted-by":"publisher","first-page":"2618","DOI":"10.1002\/jcc.20829","volume":"28","author":"J.E. Stone","year":"2007","unstructured":"Stone, J.E., Phillips, J.C., Freddolino, P.L., Hardy, D.J., Trabuco, L.G., Schulten, K.: Accelerating molecular modeling applications with graphics processors. J. Comp. Chem.\u00a028, 2618\u20132640 (2007)","journal-title":"J. Comp. 
Chem."},{"key":"12_CR15","doi-asserted-by":"crossref","unstructured":"Stone, S.S., Haldar, J.P., Tsao, S.C., Hwu, W.W., Liang, Z., Sutton, B.P.: Accelerating advanced MRI reconstructions on GPUs. In: ACM Computing Frontier Conference (2008)","DOI":"10.1145\/1366230.1366276"},{"key":"12_CR16","unstructured":"openVIDIA, http:\/\/openvidia.sourceforge.net\/"},{"key":"12_CR17","first-page":"1","volume-title":"SC 2008: Proceedings of the 2008 ACM\/IEEE conference on Supercomputing","author":"V. Volkov","year":"2008","unstructured":"Volkov, V., Demmel, J.W.: Benchmarking GPUs to tune dense linear algebra. In: SC 2008: Proceedings of the 2008 ACM\/IEEE conference on Supercomputing, pp. 1\u201311. IEEE Press, Los Alamitos (2008)"},{"key":"12_CR18","doi-asserted-by":"crossref","unstructured":"Fatica, M.: Accelerating Linpack with CUDA on heterogenous clusters. In: GPGPU 2009. ACM, New York (2009)","DOI":"10.1145\/1513895.1513901"},{"key":"12_CR19","unstructured":"Castillo, M., Chan, E., Igual, F.D., Mayo, R., Quintana-Orti, E.S., Quintana-Orti, G., Van De Geijn, R., Van Zee, F.G.: Making Programming Synonymous with Programming for Linear Algebra Libraries, FLAME Working Note #31. The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-08-20 (April 17, 2008)"},{"key":"12_CR20","doi-asserted-by":"crossref","unstructured":"Quintana-Orti, G., Igual, F.D., Quintana-Orti, E.S., van de Geijn, R.: Solving Dense Linear Systems on Platforms with Multiple Hardware Accelerators. In: PPoPP, pp. 
121\u2013129 (2009)","DOI":"10.1145\/1594835.1504196"},{"key":"12_CR21","unstructured":"decuda, http:\/\/www.cs.rug.nl\/~wladimir\/decuda\/"}],"container-title":["Lecture Notes in Computer Science","Advanced Parallel Processing Technologies"],"original-title":[],"link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-642-03644-6_12","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,11]],"date-time":"2025-02-11T20:25:43Z","timestamp":1739305543000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/978-3-642-03644-6_12"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009]]},"ISBN":["9783642036439","9783642036446"],"references-count":21,"URL":"https:\/\/doi.org\/10.1007\/978-3-642-03644-6_12","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2009]]}}}