{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T19:13:59Z","timestamp":1771960439036,"version":"3.50.1"},"reference-count":47,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2024,12,20]],"date-time":"2024-12-20T00:00:00Z","timestamp":1734652800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62172032"],"award-info":[{"award-number":["62172032"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2025,1,31]]},"abstract":"<jats:p>\n            Fisheye object detection is challenging due to the fisheye distortion, which inclines objects to different extents and pushes extensive irrelevant pixels into the predicted horizontal bounding box (HBB). To address the problems above, we establish a new fisheye object detection dataset (named FishOBB) with compact oriented bounding box (OBB) annotations, as well as an OBB-customized mosaic augmentation technology. To our knowledge, there are very few fisheye datasets labeled by OBB, especially the open source forward view dataset like ours. Besides, we provide a fisheye object detection baseline (named FDA-YOLO) with two fisheye adaption units. Concretely, we first design a distortion orientation aggregation (DOA) unit guided by polar sampling to capture distortion-aware fisheye features. On the other hand, to transfer HBB-based detection models to OBB-based counterparts, we propose an oriented anchor attention unit. 
It automatically weights the unbalanced positive\/negative samples and facilitates convergence for multi-anchor models. Finally, we demonstrate that the two adaption units can be easily integrated into various anchor-based YOLO methods, e.g., ScaledYOLOv4 and YOLOv7, contributing to superior performance to existing state-of-the-art (SoTA) solutions in the proposed dataset. Meanwhile, our method has also achieved SoTA performance on other popular datasets like WEPDTOF. The dataset and code are released at\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/lukanightfever\/FishOBB\">https:\/\/github.com\/lukanightfever\/FishOBB<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3702640","type":"journal-article","created":{"date-parts":[[2024,11,2]],"date-time":"2024-11-02T15:46:04Z","timestamp":1730562364000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Toward Oriented Fisheye Object Detection: Dataset and Baseline"],"prefix":"10.1145","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-4345-976X","authenticated-orcid":false,"given":"Jialin","family":"Yang","sequence":"first","affiliation":[{"name":"Institute of Information Science, Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing Jiaotong University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2847-0349","authenticated-orcid":false,"given":"Chunyu","family":"Lin","sequence":"additional","affiliation":[{"name":"Institute of Information Science, Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing Jiaotong University, Beijing, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7776-889X","authenticated-orcid":false,"given":"Lang","family":"Nie","sequence":"additional","affiliation":[{"name":"Institute of Information Science, Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing Jiaotong University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0954-3330","authenticated-orcid":false,"given":"Zisen","family":"Kong","sequence":"additional","affiliation":[{"name":"Institute of Information Science, Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing Jiaotong University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-8314-9737","authenticated-orcid":false,"given":"Jiapeng","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Information Science, Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing Jiaotong University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8581-9554","authenticated-orcid":false,"given":"Yao","family":"Zhao","sequence":"additional","affiliation":[{"name":"Institute of Information Science, Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing Jiaotong University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,12,20]]},"reference":[{"issue":"8","key":"e_1_3_1_2_2","doi-asserted-by":"crossref","first-page":"2802","DOI":"10.1109\/TITS.2018.2872502","article-title":"Rain removal in traffic surveillance: Does it matter?","volume":"20","author":"Bahnsen Chris H.","year":"2018","unstructured":"Chris H. Bahnsen and Thomas B. Moeslund. 2018. Rain removal in traffic surveillance: Does it matter? 
IEEE Transactions on Intelligent Transportation Systems 20, 8 (2018), 2802\u20132819.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_1_3_2","unstructured":"Alexey Bochkovskiy Chien-Yao Wang and Hong-Yuan Mark Liao. 2020. YOLOV4: Optimal speed and accuracy of object detection. arXiv:2004.10934. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.2004.10934"},{"key":"e_1_3_1_4_2","doi-asserted-by":"crossref","unstructured":"Nicolas Carion Francisco Massa Gabriel Synnaeve Nicolas Usunier Alexander Kirillov and Sergey Zagoruyko. 2020. End-to-end object detection with transformers. arXiv:2005.12872. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.2005.12872","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2021.3105551"},{"key":"e_1_3_1_6_2","unstructured":"Terrance DeVries and Graham W. Taylor. 2017. Improved regularization of convolutional neural networks with cutout. arXiv:1708.04552. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.1708.04552"},{"key":"e_1_3_1_7_2","first-page":"370","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV \u201918)","author":"Du Dawei","year":"2018","unstructured":"Dawei Du, Yuankai Qi, Hongyang Yu, Yifan Yang, Kaiwen Duan, Guorong Li, Weigang Zhang, Qingming Huang, and Qi Tian. 2018. The unmanned aerial vehicle benchmark: Object detection and tracking. In Proceedings of the European Conference on Computer Vision (ECCV \u201918), 370\u2013386."},{"key":"e_1_3_1_8_2","doi-asserted-by":"crossref","unstructured":"Kaiwen Duan Song Bai Lingxi Xie Honggang Qi Qingming Huang and Qi Tian. 2019. CenterNet: Keypoint triplets for object detection. arXiv:1904.08189. 
Retrieved from https:\/\/doi.org\/10.48550\/arXiv.1904.08189","DOI":"10.1109\/ICCV.2019.00667"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-014-0733-5"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3524617"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3474085.3475707"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.169"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.81"},{"key":"e_1_3_1_15_2","first-page":"5304","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR \u201923) Workshops","author":"Gochoo Munkhjargal","year":"2023","unstructured":"Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Erkhembayar Ganbold, Jun-Wei Hsieh, Ming-Ching Chang, Ping-Yang Chen, Byambaa Dorj, Hamad Al Jassmi, Ganzorig Batnasan, Fady Alnajjar, Mohammed Abduljabbar, and Fang-Pang Lin. 2023. FishEye8K: A benchmark and dataset for fisheye camera object detection. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR \u201923) Workshops, 5304\u20135312."},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00868"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.446"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3570507"},{"key":"e_1_3_1_19_2","unstructured":"Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv:1412.6980. 
Retrieved from https:\/\/doi.org\/10.48550\/arXiv.1412.6980"},{"key":"e_1_3_1_20_2","first-page":"250","volume-title":"Photonics Applications in Astronomy, Communications, Industry, and High Energy Physics Experiments 2017","author":"Kvyetnyy Roman","year":"2017","unstructured":"Roman Kvyetnyy, Roman Maslii, Volodymyr Harmash, Ilona Bogach, Andrzej Kotyra, \u017baklin Gr\u0105dz, Aizhan Zhanpeisova, and Nursanat Askarova. 2017. Object detection in images with low light condition. In Photonics Applications in Astronomy, Communications, Industry, and High Energy Physics Experiments 2017, Vol. 10445. SPIE, 250\u2013259."},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01392"},{"key":"e_1_3_1_22_2","unstructured":"Feng Li Hao Zhang Shilong Liu Jian Guo Lionel M. Ni and Lei Zhang. 2022. DN-DETR: Accelerate DETR training by introducing query DeNoising. arXiv:2203.01305. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.2203.01305"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3579998"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3545609"},{"key":"e_1_3_1_25_2","unstructured":"Shilong Liu Feng Li Hao Zhang Xiao Yang Xianbiao Qi Hang Su Jun Zhu and Lei Zhang. 2022. DAB-DETR: Dynamic anchor boxes are better queries for DETR. arXiv:2201.12329. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.2201.12329"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/LGRS.2016.2565705"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2018.2848705"},{"key":"e_1_3_1_28_2","unstructured":"Chengqi Lyu Wenwei Zhang Haian Huang Yue Zhou Yudong Wang Yanyi Liu Shilong Zhang and Kai Chen. 2022. RTMDet: An empirical study of designing real-time object detectors. arXiv:2212.07784. 
https:\/\/doi.org\/10.48550\/arXiv.2212.07784"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i3.16336"},{"key":"e_1_3_1_30_2","first-page":"2272","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision","author":"Rashed Hazem","year":"2021","unstructured":"Hazem Rashed, Eslam Mohamed, Ganesh Sistu, Varun Ravi Kumar, Ciaran Eising, Ahmad El-Sallab, and Senthil Yogamani. 2021. Generalized object detection on fisheye cameras for autonomous driving: Dataset, representations and baseline. In Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, 2272\u20132280."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_3_1_32_2","doi-asserted-by":"crossref","unstructured":"Joseph Redmon Santosh Divvala Ross Girshick and Ali Farhadi. 2016. You only look once: Unified real-time object detection. arXiv:1506.02640. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.1506.02640","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_3_1_33_2","unstructured":"Joseph Redmon and Ali Farhadi. 2018. YOLOv3: An incremental improvement. arXiv:1804.02767. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.1804.02767"},{"key":"e_1_3_1_34_2","unstructured":"Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2016. Faster R-CNN: Towards real-time object detection with region proposal networks. arXiv:1506.01497. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.1506.01497"},{"key":"e_1_3_1_35_2","first-page":"503","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision","author":"Tezcan Ozan","year":"2022","unstructured":"Ozan Tezcan, Zhihao Duan, Mertcan Cokbas, Prakash Ishwar, and Janusz Konrad. 2022. WEPDTOF: A dataset and benchmark algorithms for in-the-wild people detection and tracking from overhead fisheye cameras. 
In Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, 503\u2013512."},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3278229.3278243"},{"issue":"4","key":"e_1_3_1_37_2","first-page":"1922","article-title":"FCOS: A simple and strong anchor-free object detector","volume":"44","author":"Tian Zhi","year":"2020","unstructured":"Zhi Tian, Chunhua Shen, Hao Chen, and Tong He. 2020. FCOS: A simple and strong anchor-free object detector. IEEE Transactions on Pattern Analysis and Machine Intelligence 44, 4 (2020), 1922\u20131933.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3573942.3574025"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01283"},{"key":"e_1_3_1_40_2","unstructured":"Chien-Yao Wang Alexey Bochkovskiy and Hong-Yuan Mark Liao. 2022. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv:2207.02696. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.2207.02696"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0259283"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00350"},{"key":"e_1_3_1_43_2","first-page":"19961","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Yang Lu","year":"2023","unstructured":"Lu Yang, Liulei Li, Xueshi Xin, Yifan Sun, Qing Song, and Wenguan Wang. 2023. Large-scale person detection and localization using overhead fisheye cameras. In Proceedings of the IEEE\/CVF International Conference on Computer Vision, 19961\u201319971."},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00940"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00612"},{"key":"e_1_3_1_46_2","unstructured":"Hongyi Zhang Moustapha Cisse Yann N. 
Dauphin and David Lopez-Paz. 2018. mixup: Beyond empirical risk minimization. arXiv:1710.09412. Retrieved from https:\/\/doi.org\/10.48550\/arXiv.1710.09412"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i07.6999"},{"key":"e_1_3_1_48_2","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1109\/3DV.2019.00019","volume-title":"Proceedings of the International Conference on 3D Vision (3DV \u201919)","author":"Zhou Dingfu","year":"2019","unstructured":"Dingfu Zhou, Jin Fang, Xibin Song, Chenye Guan, Junbo Yin, Yuchao Dai, and Ruigang Yang. 2019. IoU loss for 2d\/3d object detection. In Proceedings of the International Conference on 3D Vision (3DV \u201919). IEEE, 85\u201394."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3702640","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3702640","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:18:03Z","timestamp":1750295883000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3702640"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,20]]},"references-count":47,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,1,31]]}},"alternative-id":["10.1145\/3702640"],"URL":"https:\/\/doi.org\/10.1145\/3702640","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,20]]},"assertion":[{"value":"2023-10-15","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2024-10-23","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-12-20","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}