{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,2]],"date-time":"2025-03-02T05:48:11Z","timestamp":1740894491885,"version":"3.38.0"},"reference-count":25,"publisher":"SAGE Publications","issue":"3","license":[{"start":{"date-parts":[[2016,5,10]],"date-time":"2016-05-10T00:00:00Z","timestamp":1462838400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"DOI":"10.13039\/501100004003","name":"iMinds","doi-asserted-by":"publisher","award":["ICON MMIQQA project"],"award-info":[{"award-number":["ICON MMIQQA project"]}],"id":[{"id":"10.13039\/501100004003","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2017,5]]},"abstract":"<jats:p> This paper discusses an OpenCL version of a volumetric JPEG 2000 codec that runs on GPUs, multi-core processors or a combination of both. Since the performance critical part consists of a fine-grained (discrete wavelet transform) and coarse-grained algorithm (Tier-1), the best performance is obtained with a hybrid execution in which the discrete wavelet transform is executed on a GPU and Tier-1 on a multi-core. Using an Intel i7 multi-core in combination with a modest NVIDIA Quadro K620 GPU yields speedups greater than 10 compared with the original sequential code. The performance bottlenecks that arise on GPUs when parallelizing algorithms that are coarse-grained by nature are discussed and also the optimizations that are possible. A performance analysis reveals the inefficiencies and explains the deviations from the GPU peak performance. 
<\/jats:p>","DOI":"10.1177\/1094342016646438","type":"journal-article","created":{"date-parts":[[2016,5,12]],"date-time":"2016-05-12T00:43:41Z","timestamp":1463013821000},"page":"229-245","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["Heterogeneous acceleration of volumetric JPEG 2000 using OpenCL"],"prefix":"10.1177","volume":"31","author":[{"given":"Jan G.","family":"Cornelis","sequence":"first","affiliation":[{"name":"Vrije Universiteit Brussel (VUB), Electronics and Informatics (ETRO) Dept., Belgium"},{"name":"iMinds, Multimedia Technologies Dept., Belgium"}]},{"given":"Jan","family":"Lemeire","sequence":"additional","affiliation":[{"name":"Vrije Universiteit Brussel (VUB), Electronics and Informatics (ETRO) Dept., Belgium"},{"name":"Vrije Universiteit Brussel (VUB), Dept. Of Industrial Sciences (INDI), Belgium"},{"name":"iMinds, Multimedia Technologies Dept., Belgium"}]},{"given":"Tim","family":"Bruylants","sequence":"additional","affiliation":[{"name":"Vrije Universiteit Brussel (VUB), Electronics and Informatics (ETRO) Dept., Belgium"},{"name":"iMinds, Multimedia Technologies Dept., Belgium"}]},{"given":"Peter","family":"Schelkens","sequence":"additional","affiliation":[{"name":"Vrije Universiteit Brussel (VUB), Electronics and Informatics (ETRO) Dept., Belgium"},{"name":"iMinds, Multimedia Technologies Dept., Belgium"}]}],"member":"179","published-online":{"date-parts":[[2016,5,10]]},"reference":[{"key":"bibr1-1094342016646438","first-page":"682","volume-title":"The international conference on parallel and distributed processing techniques and applications (PDPTA)","author":"Ahmadvand M","year":"2012"},{"key":"bibr2-1094342016646438","unstructured":"AMD Radeon Graphics Technology (2012) AMD Graphics cores next (GCN) architecture white paper. 
Available at: www.amd.com\/Documents\/GCN_Architecture_whitepaper.pdf (accessed 26 April 2016)."},{"key":"bibr3-1094342016646438","unstructured":"Balevic A, Fuerst N, Heide M, (2009) CUJ2K: JPEG 2000 encoder in CUDA. Technical Report, Institute for Parallel and Distributed Systems, University of Stuttgart."},{"key":"bibr4-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1117\/2.1200706.0779"},{"key":"bibr5-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1016\/j.image.2014.12.007"},{"key":"bibr6-1094342016646438","doi-asserted-by":"crossref","unstructured":"Ciznicki M, Kurowski K, Plaza A (2011) GPU implementation of JPEG 2000 for hyperspectral image compression. In: SPIE remote sensing, Prague, Czech Republic, 19\u201322 September 2011, pp.81830H\u201381830H. Cardiff: SPIE.","DOI":"10.1117\/12.897386"},{"key":"bibr7-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1016\/j.jocs.2013.04.002"},{"key":"bibr8-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2010.04.122"},{"key":"bibr9-1094342016646438","unstructured":"Galiano V, L\u00f3pez O, Malumbres MP, (2012) GPU-based 3D Wavelet Transform. Proceedings of the 12th international conference on computational and mathematical methods in science and engineering (CMMSE), La Manga, Spain, 2\u20137 July 2012, pp.580\u2013590. Available at: http:\/\/cmmse.usal.es\/cmmse2015\/images\/stories\/congreso\/2-cmmse-2012.pdf (accessed 26 April 2016)."},{"key":"bibr10-1094342016646438","volume-title":"Introduction to Parallel Computing","author":"Grama A","year":"2003","edition":"2"},{"key":"bibr11-1094342016646438","unstructured":"Khronos (2012) OpenCL 1.2. Reference pages. 
Available at: www.khronos.org\/registry\/cl\/sdk\/1.2\/docs\/man\/xhtml\/ (accessed 26 April 2016)."},{"key":"bibr12-1094342016646438","first-page":"129","volume-title":"IEEE 9th symposium on application specific processors (SASP)","author":"Le R","year":"2011"},{"key":"bibr13-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1155\/2015\/859491"},{"key":"bibr14-1094342016646438","first-page":"136","volume-title":"Annual doctoral workshop on mathematical and engineering methods in computer science (MEMCS)","author":"Matela J","year":"2009"},{"key":"bibr15-1094342016646438","first-page":"423","volume-title":"Data compression conference (DCC)","author":"Matela J","year":"2011"},{"key":"bibr16-1094342016646438","first-page":"136","volume-title":"Mathematical and engineering methods in computer science (MEMCS)","author":"Matela J","year":"2011"},{"key":"bibr17-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1145\/1365490.1365500"},{"key":"bibr18-1094342016646438","unstructured":"NVIDIA corporation (2009) NVIDIA. NVIDIA\u2019s next-generation CUDA compute architecture. 
Available at: http:\/\/www.nvidia.com\/content\/pdf\/fermi_white_papers\/nvidia_fermi_compute_architecture_whitepaper.pdf (accessed 26 April 2016)."},{"key":"bibr19-1094342016646438","volume-title":"Computer Organization and Design: The Hardware\/Software Interface","author":"Patterson D","year":"2012","edition":"4"},{"key":"bibr20-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1002\/9780470744635"},{"key":"bibr21-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2011.6114174"},{"key":"bibr22-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2010.69"},{"key":"bibr23-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1006\/acha.1996.0015"},{"key":"bibr24-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4615-0799-4"},{"key":"bibr25-1094342016646438","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2012.115"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342016646438","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342016646438","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342016646438","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,1]],"date-time":"2025-03-01T15:40:52Z","timestamp":1740843652000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342016646438"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,5,10]]},"references-count":25,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2017,5]]}},"alternative-id":["10.1177\/1094342016646438"],"URL":"https:\/\/doi.org\/10.1177\/1094342016646438","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"type":"print","value":"1094-3420"},{"type":"electronic","value":"1741-2846"}],"subject":[],"published":{"date-parts":[[2016,5,10]]}}}