{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,1]],"date-time":"2026-01-01T10:05:02Z","timestamp":1767261902399,"version":"3.41.0"},"reference-count":30,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2025,4,4]],"date-time":"2025-04-04T00:00:00Z","timestamp":1743724800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62306059"],"award-info":[{"award-number":["62306059"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2025,5,31]]},"abstract":"<jats:p>The visual-based SLAM (Simultaneous Localization and Mapping) is a technology widely used in applications such as robotic navigation and virtual reality, which primarily focuses on detecting feature points from visual images to construct an unknown environmental map and simultaneously determines its own location. It usually imposes stringent requirements on hardware power consumption, processing speed, and accuracy. Currently, the ORB (Oriented FAST and Rotated BRIEF)-based SLAM systems have exhibited superior performance in terms of processing speed and robustness. However, they still fall short of meeting the demands for real-time processing on mobile platforms. This limitation is primarily due to the time-consuming Oriented FAST calculations accounting for approximately half of the entire SLAM system. This article presents two methods to accelerate the Oriented FAST feature detection on low-end embedded GPUs. 
These methods optimize the most time-consuming steps in Oriented FAST feature detection: FAST feature point detection and Harris corner detection, which is achieved by implementing a binary-level encoding strategy to determine candidate points quickly and a separable Harris detection strategy with efficient low-level GPU hardware-specific instructions. Extensive experiments on a Jetson TX2 embedded GPU demonstrate an average speedup of over 7.3 times compared to widely used OpenCV with GPU support. This significant improvement highlights its effectiveness and potential for real-time applications in mobile and resource-constrained environments.<\/jats:p>","DOI":"10.1145\/3725217","type":"journal-article","created":{"date-parts":[[2025,3,18]],"date-time":"2025-03-18T10:51:44Z","timestamp":1742295104000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Faster than Fast: Accelerating Oriented FAST Feature Detection on Low-end Embedded GPUs"],"prefix":"10.1145","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4447-0480","authenticated-orcid":false,"given":"Qiong","family":"Chang","sequence":"first","affiliation":[{"name":"School of Computing, Institute of Science Tokyo, Meguro, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-9369-747X","authenticated-orcid":false,"given":"Xinyuan","family":"Chen","sequence":"additional","affiliation":[{"name":"International School of Information Science and Engineering, Dalian University of Technology, Dalian, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6933-6491","authenticated-orcid":false,"given":"Xiang","family":"Li","sequence":"additional","affiliation":[{"name":"School of Electronic Science & Engineering, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6557-7175","authenticated-orcid":false,"given":"Weimin","family":"Wang","sequence":"additional","affiliation":[{"name":"International 
School of Information Science and Engineering, Dalian University of Technology, Dalian, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3038-7678","authenticated-orcid":false,"given":"Jun","family":"Miyazaki","sequence":"additional","affiliation":[{"name":"School of Computing, Institute of Science Tokyo, Meguro, Japan"}]}],"member":"320","published-online":{"date-parts":[[2025,4,4]]},"reference":[{"issue":"2","key":"e_1_3_1_2_2","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1109\/MRA.2006.1638022","article-title":"Simultaneous localization and mapping: Part I","volume":"13","year":"2006","unstructured":"H. Durrant-Whyte and T. Bailey. 2006. Simultaneous localization and mapping: Part I. IEEE Robotics & Automation Magazine 13, 2 (2006), 99\u2013110.","journal-title":"IEEE Robotics & Automation Magazine"},{"key":"e_1_3_1_3_2","article-title":"CUDA-ORB","year":"2023","unstructured":"Accustomer. 2023. CUDA-ORB. Retrieved November 1, 2023 from https:\/\/github.com\/Accustomer\/CUDA-ORB","journal-title":"https:\/\/github.com\/Accustomer\/CUDA-ORB"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1007\/s42452-020-2001-3"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2007.09.014"},{"issue":"1","key":"e_1_3_1_6_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3561972","article-title":"Edge-SLAM: Edge-assisted visual simultaneous localization and mapping","volume":"22","author":"Ali Ali J. Ben","year":"2022","unstructured":"Ali J. Ben Ali, Marziye Kouroshli, Sofiya Semenova, Zakieh Sadat Hashemifar, Steven Y. Ko, and Karthik Dantu. 2022. Edge-SLAM: Edge-assisted visual simultaneous localization and mapping. 
ACM Transactions on Embedded Computing Systems 22, 1 (2022), 1\u201331.","journal-title":"ACM Transactions on Embedded Computing Systems"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2021.3075644"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2023.03.004"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.2024.3395464"},{"key":"e_1_3_1_10_2","unstructured":"NVIDIA Corporation. 2021. CUDA C Programming Guide. Retrieved March 19 2025 from https:\/\/docs.nvidia.com\/cuda\/archive\/11.2.0\/cuda-c-programming-guide\/index.html"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00828"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2018.01.048"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-93701-4_34"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3626098"},{"key":"e_1_3_1_15_2","unstructured":"Intel IoT Libraries. 2018. Intel-IOT-Devkit. Retrieved March 19 2025 from https:\/\/github.com\/intel-iot-devkit\/sample-videos"},{"key":"e_1_3_1_16_2","first-page":"1","volume-title":"Proceedings of the 56th Annual Design Automation Conference","author":"Liu Runze","year":"2019","unstructured":"Runze Liu, Jianlei Yang, Yiran Chen, and Weisheng Zhao. 2019. eSLAM: An energy-efficient accelerator for real-time ORB-SLAM on FPGA platform. In Proceedings of the 56th Annual Design Automation Conference. 1\u20136."},{"key":"e_1_3_1_17_2","volume-title":"Optimizing Harris Corner Detection on GPGPUs using CUDA","author":"Loundagin Justin","year":"2015","unstructured":"Justin Loundagin. 2015. Optimizing Harris Corner Detection on GPGPUs using CUDA. M.Sc. Thesis. 
California Polytechnic State University."},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000029664.99615.94"},{"key":"e_1_3_1_19_2","first-page":"52","volume-title":"Proceedings of the 2022 IEEE 33rd International Conference on Application-Specific Systems, Architectures, and Processors (ASAP \u201922)","author":"Mimura Yuzuki","year":"2022","unstructured":"Yuzuki Mimura, Chang Qiong, and Tsutomu Maruyama. 2022. Acceleration of video stabilization using embedded GPU. In Proceedings of the 2022 IEEE 33rd International Conference on Application-Specific Systems, Architectures, and Processors (ASAP \u201922). IEEE, 52\u201359."},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3558481.3591310"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/IROS45743.2020.9340851"},{"key":"e_1_3_1_22_2","unstructured":"NVIDIA. 2025. PTX: Parallel Thread Execution ISA Version 8.7. Retrieved March 19 2025 from https:\/\/docs.nvidia.com\/cuda\/parallel-thread-execution\/"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2012.2223873"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126544"},{"issue":"2","key":"e_1_3_1_25_2","first-page":"565","article-title":"A flexible and efficient real-time ORB-based full-HD image feature extraction accelerator","volume":"28","author":"Sun Rongdi","year":"2019","unstructured":"Rongdi Sun, Jiuchao Qian, Romero Hung Jose, Zheng Gong, Ruihang Miao, Wuyang Xue, and Peilin Liu. 2019. A flexible and efficient real-time ORB-based full-HD image feature extraction accelerator. 
IEEE Transactions on Very Large Scale Integration (VLSI) Systems 28, 2 (2019), 565\u2013575.","journal-title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2022.12.005"},{"key":"e_1_3_1_27_2","article-title":"OpenCV: Open Source Computer Vision Library","author":"Team OpenCV Development","unstructured":"OpenCV Development Team. n.d. OpenCV: Open Source Computer Vision Library. Version 4.1. Retrieved March 19, 2025 from https:\/\/opencv.org\/","journal-title":"https:\/\/opencv.org\/"},{"key":"e_1_3_1_28_2","first-page":"1","volume-title":"Proceedings of the 2022 International Conference on Field-Programmable Technology (ICFPT \u201922)","author":"Vemulapati Vibhakar","year":"2022","unstructured":"Vibhakar Vemulapati and Deming Chen. 2022. FSLAM: An efficient and accurate SLAM accelerator on SoC FPGAs. In Proceedings of the 2022 International Conference on Field-Programmable Technology (ICFPT \u201922). IEEE, 1\u20139."},{"key":"e_1_3_1_29_2","first-page":"6","volume-title":"Proceedings of the 10th Workshop on Image Analysis for Multimedia Interactive Services","author":"Viswanathan Deepak Geetha","year":"2009","unstructured":"Deepak Geetha Viswanathan. 2009. Features from accelerated segment test (FAST). In Proceedings of the 10th Workshop on Image Analysis for Multimedia Interactive Services. 6\u20138."},{"key":"e_1_3_1_30_2","first-page":"33","volume-title":"Proceedings of the 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM \u201920)","author":"Xu Zhilin","year":"2020","unstructured":"Zhilin Xu, Jincheng Yu, Chao Yu, Hao Shen, Yu Wang, and Huazhong Yang. 2020. CNN-based feature-point extraction for real-time visual SLAM on embedded FPGA. In Proceedings of the 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM \u201920). 
IEEE, 33\u201337."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11554-016-0594-y"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3725217","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3725217","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:57:03Z","timestamp":1750298223000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3725217"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,4]]},"references-count":30,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,5,31]]}},"alternative-id":["10.1145\/3725217"],"URL":"https:\/\/doi.org\/10.1145\/3725217","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2025,4,4]]},"assertion":[{"value":"2024-09-07","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-14","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}