{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,21]],"date-time":"2025-09-21T17:01:44Z","timestamp":1758474104703,"version":"3.41.0"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,11,20]],"date-time":"2024-11-20T00:00:00Z","timestamp":1732060800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"JSPS KAKENHI","award":["21K17868"],"award-info":[{"award-number":["21K17868"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2024,12,31]]},"abstract":"<jats:p>The GIST descriptor is a classic feature descriptor primarily used for scene categorization and recognition tasks. It drives a bank of Gabor filters, which respond to edges and textures at various scales and orientations to capture the spatial structures in an image. Compared to other scene recognition algorithms that rely on detailed object detection, GIST has lower computational complexity, allowing it to be widely applied. However, its internal multi-scale and multi-orientation Gabor filters also mean that systems based on it cannot be executed fast enough. This article proposes an optimized GPU kernel for the GIST descriptor. It fully takes advantage of the symmetry of Gabor filters and proposes different optimization strategies for both oblique and orthogonal orientations. Extensive experiments demonstrate that the proposed kernel is adaptable to images of various scales and different GPUs. 
Compared to the cuFFT library, our kernel achieves 12.09\u00d7 and 3.86\u00d7 acceleration on an RTX 3080 GPU and a Jetson AGX Xavier GPU, respectively.<\/jats:p>","DOI":"10.1145\/3689339","type":"journal-article","created":{"date-parts":[[2024,8,23]],"date-time":"2024-08-23T12:24:59Z","timestamp":1724415899000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["An Optimized GPU Implementation for GIST Descriptor"],"prefix":"10.1145","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6933-6491","authenticated-orcid":false,"given":"Xiang","family":"Li","sequence":"first","affiliation":[{"name":"School of Electronic Science & Engineering, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4447-0480","authenticated-orcid":false,"given":"Qiong","family":"Chang","sequence":"additional","affiliation":[{"name":"School of Computing, Tokyo Institute of Technology, Meguro-ku, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2480-6597","authenticated-orcid":false,"given":"Aolong","family":"Zha","sequence":"additional","affiliation":[{"name":"Research Center for Advanced Science and Technology, The University of Tokyo, Meguro-ku, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2877-684X","authenticated-orcid":false,"given":"Shijie","family":"Chang","sequence":"additional","affiliation":[{"name":"Division of Biomedical Engineering, China Medical University, Shenyang, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1753-7317","authenticated-orcid":false,"given":"Yun","family":"Li","sequence":"additional","affiliation":[{"name":"School of Electronic Science & Engineering, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3038-7678","authenticated-orcid":false,"given":"Jun","family":"Miyazaki","sequence":"additional","affiliation":[{"name":"School of Computing, Tokyo Institute of Technology, Meguro-ku, 
Japan"}]}],"member":"320","published-online":{"date-parts":[[2024,11,20]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"26","volume-title":"Proceedings of the 2021 IEEE\/ACM Symposium on Edge Computing (SEC\u201921)","author":"Abdelhafez Hazem A.","year":"2021","unstructured":"Hazem A. Abdelhafez, Hassan Halawa, Mohamed Osama Ahmed, Karthik Pattabiraman, and Matei Ripeanu. 2021. Mirage: Machine learning-based modeling of identical replicas of the Jetson AGX embedded platform. In Proceedings of the 2021 IEEE\/ACM Symposium on Edge Computing (SEC\u201921). IEEE, 26\u201340."},{"key":"e_1_3_1_3_2","volume-title":"Towards Novel Inter-Prediction Methods for Image and Video Compression","author":"B\u00e9gaint Jean","year":"2018","unstructured":"Jean B\u00e9gaint. 2018. Towards Novel Inter-Prediction Methods for Image and Video Compression. Ph.D. Dissertation. Rennes 1."},{"issue":"11","key":"e_1_3_1_4_2","first-page":"815","article-title":"On an improved FPGA implementation of CNN-based Gabor-type filters","volume":"59","author":"Cesur Evren","year":"2012","unstructured":"Evren Cesur, Nerhun Yildiz, and Vedat Tavsanoglu. 2012. On an improved FPGA implementation of CNN-based Gabor-type filters. 
IEEE Transactions on Circuits and Systems II: Express Briefs 59, 11 (2012), 815\u2013819.","journal-title":"IEEE Transactions on Circuits and Systems II: Express Briefs"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2023.03.004"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48891.2023.10160441"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.5555\/3213069.3213072"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.sysarc.2021.102366"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compind.2021.103551"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11265-014-0873-4"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2019.2958404"},{"key":"e_1_3_1_12_2","unstructured":"NVIDIA Corporation. 2020. cuFFT v.11.2 Official Documentation. Retrieved August 28 2024 from https:\/\/docs.nvidia.com\/cuda\/archive\/11.2.0\/cufft\/index.html"},{"key":"e_1_3_1_13_2","unstructured":"NVIDIA Corporation. 2021. CUDA C Programming Guide. Retrieved August 28 2024 from https:\/\/docs.nvidia.com\/cuda\/archive\/11.2.0\/cuda-c-programming-guide\/index.html"},{"key":"e_1_3_1_14_2","first-page":"64","volume-title":"Medical Imaging 2017: Imaging Informatics for Healthcare, Research, and Applications","author":"Ding Meng","year":"2017","unstructured":"Meng Ding, Sameer Antani, Stefan Jaeger, Zhiyun Xue, Sema Candemir, Marc Kohli, and George Thoma. 2017. Local-global classifier fusion for screening chest radiographs. In Medical Imaging 2017: Imaging Informatics for Healthcare, Research, and Applications, Vol. 10138. 
SPIE, 64\u201369."},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/DICTA.2010.73"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1523\/JNEUROSCI.2088-19.2020"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2022.116743"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dt.2019.12.002"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13246-022-01153-z"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2018.2805811"},{"key":"e_1_3_1_21_2","first-page":"1","article-title":"3-D Gabor convolutional neural network for hyperspectral image classification","volume":"60","author":"Jia Sen","year":"2021","unstructured":"Sen Jia, Jianhui Liao, Meng Xu, Yan Li, Jiasong Zhu, Weiwei Sun, Xiuping Jia, and Qingquan Li. 2021. 3-D Gabor convolutional neural network for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing 60 (2021), 1\u201316.","journal-title":"IEEE Transactions on Geoscience and Remote Sensing"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2019.2926475"},{"key":"e_1_3_1_23_2","unstructured":"Aditya Khosla. 2024. Computer Vision Feature Extraction Toolbox. 
Retrieved August 28 2024 from https:\/\/github.com\/adikhosla\/feature-extraction"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASAP57973.2023.00031"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSI.2022.3176966"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCPMT.2018.2818947"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2021.3084813"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/34.531803"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.18178\/joig.9.4.146-151"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2017.144"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3429981"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1097\/RLI.0000000000000779"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1011139631724"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11554-013-0373-y"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1002\/ima.22865"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1117\/12.2587876"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/CSCI51800.2020.00289"},{"key":"e_1_3_1_38_2","article-title":"A 2D Gabor-wavelet baseline model out-performs a 3D surface model in scene-responsive cortex","author":"Shafer-Skelton Anna","year":"2024","unstructured":"Anna Shafer-Skelton, Timothy F. Brady, and John T. Serences. 2024. A 2D Gabor-wavelet baseline model out-performs a 3D surface model in scene-responsive cortex. 
bioRxiv (2024), 2024\u201302.","journal-title":"bioRxiv"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2019.2891995"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TTE.2022.3141780"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2022.3201011"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477497"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS.2010.5537757"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.5220\/0006685805530561"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD57390.2023.10323705"},{"issue":"3","key":"e_1_3_1_46_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3605148","article-title":"MFFT: A GPU accelerated highly efficient mixed-precision large-scale FFT framework","volume":"20","author":"Zhao Yuwen","year":"2023","unstructured":"Yuwen Zhao, Fangfang Liu, Wenjing Ma, Huiyuan Li, Yuanchi Peng, and Cui Wang. 2023. MFFT: A GPU accelerated highly efficient mixed-precision large-scale FFT framework. 
ACM Transactions on Architecture and Code Optimization 20, 3 (2023), 1\u201323.","journal-title":"ACM Transactions on Architecture and Code Optimization"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3689339","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3689339","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:05:45Z","timestamp":1750291545000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3689339"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,20]]},"references-count":45,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,12,31]]}},"alternative-id":["10.1145\/3689339"],"URL":"https:\/\/doi.org\/10.1145\/3689339","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2024,11,20]]},"assertion":[{"value":"2024-04-11","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-08-11","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-11-20","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}