{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T07:02:49Z","timestamp":1773903769459,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":50,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T00:00:00Z","timestamp":1634428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2018YFE0183900"],"award-info":[{"award-number":["2018YFE0183900"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,17]]},"DOI":"10.1145\/3474085.3475314","type":"proceedings-article","created":{"date-parts":[[2021,10,18]],"date-time":"2021-10-18T04:59:18Z","timestamp":1634533158000},"page":"4622-4631","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":46,"title":["From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder"],"prefix":"10.1145","author":[{"given":"Jiale","family":"Li","sequence":"first","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}]},{"given":"Hang","family":"Dai","sequence":"additional","affiliation":[{"name":"Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE"}]},{"given":"Ling","family":"Shao","sequence":"additional","affiliation":[{"name":"Inception Institute of Artificial Intelligence, Abu Dhabi, UAE"}]},{"given":"Yong","family":"Ding","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, 
China"}]}],"member":"320","published-online":{"date-parts":[[2021,10,17]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ITSC45102.2020.9294293"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58589-1_5"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.691"},{"key":"e_1_3_2_1_4_1","volume-title":"Fast Point R-CNN. In 2019 IEEE\/CVF International Conference on Computer Vision (ICCV). 9774--9783","author":"Chen Yilun","year":"2019","unstructured":"Yilun Chen, Shu Liu, Xiaoyong Shen, and Jiaya Jia. 2019. Fast Point R-CNN. In 2019 IEEE\/CVF International Conference on Computer Vision (ICCV). 9774--9783."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-66096-3_2"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01334"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/MFI49285.2020.9235240"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/2354409.2354978"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5244\/C.29.150"},{"key":"e_1_3_2_1_10_1","volume-title":"2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 9224--9232","author":"Graham Benjamin","unstructured":"Benjamin Graham, Martin Engelcke, and Laurens van der Maaten. 2018. 3D Semantic Segmentation With Submanifold Sparse Convolutional Networks. In 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 
9224--9232."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01189"},{"key":"e_1_3_2_1_12_1","volume-title":"Mask R-CNN. In 2017 IEEE International Conference on Computer Vision (ICCV). 2980--2988","author":"He Kaiming","unstructured":"Kaiming He, Georgia Gkioxari, Piotr Doll\u00e1r, and Ross B. Girshick. 2017. Mask R-CNN. In 2017 IEEE International Conference on Computer Vision (ICCV). 2980--2988."},{"key":"e_1_3_2_1_13_1","volume-title":"Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770--778","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770--778."},{"key":"e_1_3_2_1_14_1","volume-title":"Acquisition of Localization Confidence for Accurate Object Detection. In European Conference on Computer Vision (ECCV). 784--799","author":"Jiang Borui","year":"2018","unstructured":"Borui Jiang, Ruixuan Luo, Jiayuan Mao, Tete Xiao, and Yuning Jiang. 2018. Acquisition of Localization Confidence for Accurate Object Detection. In European Conference on Computer Vision (ECCV). 784--799."},{"key":"e_1_3_2_1_15_1","volume-title":"2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS). 
1--8.","author":"Ku Jason","unstructured":"Jason Ku, Melissa Mozifian, Jungwook Lee, Ali Harakeh, and Steven L. Waslander. 2018. Joint 3D Proposal Generation and Object Detection from View Aggregation. In 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS). 1--8."},{"key":"e_1_3_2_1_16_1","volume-title":"PointPillars: Fast Encoders for Object Detection From Point Clouds. In 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 12689--12697","author":"Lang Alex H.","year":"2019","unstructured":"Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, and Oscar Beijbom. 2019. PointPillars: Fast Encoders for Object Detection From Point Clouds. In 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 12689--12697."},{"key":"e_1_3_2_1_17_1","volume-title":"3D IoU-Net: IoU guided 3D object detector for point clouds. arXiv preprint arXiv:2004.04962","author":"Li Jiale","year":"2020","unstructured":"Jiale Li, Shujie Luo, Ziqi Zhu, Hang Dai, Andrey S Krylov, Yong Ding, and Ling Shao. 2020. 3D IoU-Net: IoU guided 3D object detector for point clouds. 
arXiv preprint arXiv:2004.04962 (2020)."},{"key":"e_1_3_2_1_18_1","volume-title":"P2V-RCNN: Point to Voxel Feature Learning for 3D Object Detection from Point Clouds","author":"Li Jiale","year":"2021","unstructured":"Jiale Li, Yu Sun, Shujie Luo, Ziqi Zhu, Hang Dai, Andrey S Krylov, Yong Ding, and Ling Shao. 2021. P2V-RCNN: Point to Voxel Feature Learning for 3D Object Detection from Point Clouds. IEEE Access (2021)."},{"key":"e_1_3_2_1_19_1","volume-title":"2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 7636--7644","author":"Li P.","unstructured":"P. Li, X. Chen, and S. Shen. 2019. Stereo R-CNN Based 3D Object Detection for Autonomous Driving. In 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 7636--7644."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00752"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01270-0_39"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2858826"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350960"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i07.6837"},{"key":"e_1_3_2_1_25_1","volume-title":"Fixing Weight Decay Regularization in Adam. CoRR abs\/1711.05101","author":"Loshchilov Ilya","year":"2017","unstructured":"Ilya Loshchilov and Frank Hutter. 2017. 
Fixing Weight Decay Regularization in Adam. CoRR abs\/1711.05101 (2017). arXiv:1711.05101"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-66096-3_3"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00608"},{"key":"e_1_3_2_1_28_1","volume-title":"2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 918--927","author":"Qi Charles R.","unstructured":"Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, and Leonidas J. Guibas. 2018. Frustum PointNets for 3D Object Detection from RGB-D Data. In 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 918--927."},{"key":"e_1_3_2_1_29_1","volume-title":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 77--85","author":"Qi Charles Ruizhongtai","unstructured":"Charles Ruizhongtai Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. 2017. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 
77--85."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295263"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2577031"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01054"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00086"},{"key":"e_1_3_2_1_35_1","volume-title":"From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network","author":"Shi Shaoshuai","year":"2020","unstructured":"Shaoshuai Shi, Zhe Wang, Jianping Shi, Xiaogang Wang, and Hongsheng Li. 2020. From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network. IEEE Transactions on Pattern Analysis and Machine Intelligence (2020), 1--1."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00178"},{"key":"e_1_3_2_1_37_1","first-page":"1100612","article-title":"Super-convergence: Very fast training of neural networks using large learning rates","volume":"11006","author":"Smith Leslie N.","year":"2019","unstructured":"Leslie N. Smith and Nicholay Topin. 2019. Super-convergence: Very fast training of neural networks using large learning rates. In Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications, Vol. 11006. 
1100612.","journal-title":"Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications"},{"key":"e_1_3_2_1_38_1","volume-title":"Scalability in Perception for Autonomous Driving: Waymo Open Dataset. In 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2443--2451","author":"Sun Pei","year":"2020","unstructured":"Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Yu Zhang, Jonathon Shlens, Zhifeng Chen, and Dragomir Anguelov. 2020. Scalability in Perception for Autonomous Driving: Waymo Open Dataset. In 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2443--2451."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00466"},{"key":"e_1_3_2_1_40_1","volume-title":"Pillar-Based Object Detection for Autonomous Driving. In European Conference on Computer Vision (ECCV)","volume":"12367","author":"Wang Yue","year":"2020","unstructured":"
Yue Wang, Alireza Fathi, Abhijit Kundu, David A. Ross, Caroline Pantofaru, Thomas A. Funkhouser, and Justin Solomon. 2020. Pillar-Based Object Detection for Autonomous Driving. In European Conference on Computer Vision (ECCV), Vol. 12367. Springer, 18--34."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350924"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2937676"},{"key":"e_1_3_2_1_43_1","volume-title":"A Comprehensive Survey on Graph Neural Networks","author":"Wu Zonghan","year":"2020","unstructured":"Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. 2020. A Comprehensive Survey on Graph Neural Networks. IEEE Transactions on Neural Networks and Learning Systems (2020), 1--21."},{"key":"e_1_3_2_1_44_1","volume-title":"2020 AAAI Conference on Artificial Intelligence (AAAI). 12460--12467","author":"Xie L.","unstructured":"L. Xie, C. Xiang, Z. Yu, G. Xu, Z. Yang, D. Cai, and X. He. 2020. PI-RCNN: An Efficient Multi-Sensor 3D Object Detector with Point-Based Attentive Cont-Conv Fusion Module. In 2020 AAAI Conference on Artificial Intelligence (AAAI). 12460--12467."},{"key":"e_1_3_2_1_45_1","first-page":"3337","article-title":"SECOND: Sparsely Embedded Convolutional Detection","volume":"18","author":"Yan Yan","year":"2018","unstructured":"Yan Yan, Yuxing Mao, and Bo Li. 2018. SECOND: Sparsely Embedded Convolutional Detection. 
Sensors 18, 10 (2018), 3337--3354.","journal-title":"Sensors"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01105"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00204"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00191"},{"key":"e_1_3_2_1_49_1","volume-title":"2019 Annual Conference on Robot Learning (CoRL)","volume":"100","author":"Zhou Yin","year":"2019","unstructured":"Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, and Vijay Vasudevan. 2019. End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds. In 2019 Annual Conference on Robot Learning (CoRL), Vol. 100. 
923--932."},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00472"}],"event":{"name":"MM '21: ACM Multimedia Conference","location":"Virtual Event China","acronym":"MM '21","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 29th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3474085.3475314","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3474085.3475314","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:49:18Z","timestamp":1750193358000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3474085.3475314"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,17]]},"references-count":50,"alternative-id":["10.1145\/3474085.3475314","10.1145\/3474085"],"URL":"https:\/\/doi.org\/10.1145\/3474085.3475314","relation":{},"subject":[],"published":{"date-parts":[[2021,10,17]]},"assertion":[{"value":"2021-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}