{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T05:34:18Z","timestamp":1774935258698,"version":"3.50.1"},"reference-count":53,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2022,7,11]],"date-time":"2022-07-11T00:00:00Z","timestamp":1657497600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada (NSERC)","doi-asserted-by":"publisher","award":["RGPIN-2021-04244"],"award-info":[{"award-number":["RGPIN-2021-04244"]}],"id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada (NSERC)","doi-asserted-by":"publisher","award":["2019-67021-28996"],"award-info":[{"award-number":["2019-67021-28996"]}],"id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000199","name":"United States Department of Agriculture (USDA)","doi-asserted-by":"publisher","award":["RGPIN-2021-04244"],"award-info":[{"award-number":["RGPIN-2021-04244"]}],"id":[{"id":"10.13039\/100000199","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000199","name":"United States Department of Agriculture (USDA)","doi-asserted-by":"publisher","award":["2019-67021-28996"],"award-info":[{"award-number":["2019-67021-28996"]}],"id":[{"id":"10.13039\/100000199","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Label assignment plays a significant role in modern object detection models. Detection models may yield totally different performances with different label assignment strategies. 
For anchor-based detection models, the IoU (Intersection over Union) threshold between the anchors and their corresponding ground truth bounding boxes is the key element, since the positive and negative samples are divided by the IoU threshold. Early object detectors simply use a fixed threshold for all training samples, while recent detection algorithms focus on adaptive thresholds based on the distribution of the IoUs to the ground truth boxes. In this paper, we introduce a simple yet effective approach to perform label assignment dynamically based on the training status with predictions. By introducing the predictions in label assignment, more high-quality samples with higher IoUs to the ground truth objects are selected as the positive samples, which could reduce the discrepancy between the classification scores and the IoU scores, and generate more high-quality bounding boxes. Our approach shows improvements in the performance of the detection models with the adaptive label assignment algorithm and lower bounding box losses for those positive samples, indicating more samples with higher-quality predicted boxes are selected as positives.<\/jats:p>","DOI":"10.3390\/jimaging8070193","type":"journal-article","created":{"date-parts":[[2022,7,11]],"date-time":"2022-07-11T21:57:53Z","timestamp":1657576673000},"page":"193","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":29,"title":["Dynamic Label Assignment for Object Detection by Combining Predicted IoUs and Anchor IoUs"],"prefix":"10.3390","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6171-3176","authenticated-orcid":false,"given":"Tianxiao","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering 
and Computer Science, University of Kansas, Lawrence, KS 66045, USA"}]},{"given":"Ajay","family":"Sharda","sequence":"additional","affiliation":[{"name":"Department of Biological and Agricultural Engineering, Kansas State University, Manhattan, KS 66506, USA"}]},{"given":"Guanghui","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada"}]}],"member":"1968","published-online":{"date-parts":[[2022,7,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"154740","DOI":"10.1109\/ACCESS.2021.3128942","article-title":"Pulmonary Nodule Detection Based on Faster R-CNN With Adaptive Anchor Box","volume":"9","author":"Nguyen","year":"2021","journal-title":"IEEE Access"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"3014","DOI":"10.1109\/TSMC.2019.2917034","article-title":"A real-time robotic grasping approach with oriented anchor box","volume":"51","author":"Zhang","year":"2019","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Dewi, C., Chen, R.C., Liu, Y.T., Liu, Y.S., and Jiang, L.Q. (2020, January 22\u201324). Taiwan stop sign recognition with customize anchor. Proceedings of the 12th International Conference on Computer Modeling and Simulation, Brisbane, Australia.","DOI":"10.1145\/3408066.3408078"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1109\/TIV.2018.2804166","article-title":"Real-time obstacle detection and tracking for sense-and-avoid mechanism in UAVs","volume":"3","author":"Bharati","year":"2018","journal-title":"IEEE Trans. Intell. Veh."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Zhang, T., Zhang, X., Yang, Y., Wang, Z., and Wang, G. (2020). Efficient Golf Ball Detection and Tracking Based on Convolutional Neural Networks and Kalman Filter. 
arXiv.","DOI":"10.1109\/SMC42975.2020.9283312"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"107737","DOI":"10.1016\/j.patcog.2020.107737","article-title":"Deep feature augmentation for occluded image classification","volume":"111","author":"Cen","year":"2021","journal-title":"Pattern Recognit."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1016\/j.patrec.2021.12.004","article-title":"A discriminative channel diversification network for image classification","volume":"153","author":"Patel","year":"2022","journal-title":"Pattern Recognit. Lett."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"108440","DOI":"10.1016\/j.patcog.2021.108440","article-title":"Semantic clustering based deduction learning for image recognition and classification","volume":"124","author":"Ma","year":"2022","journal-title":"Pattern Recognit."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201319). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1016\/j.neucom.2021.01.126","article-title":"SOSD-Net: Joint semantic object segmentation and depth estimation from monocular images","volume":"440","author":"He","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Hemmati, M., Biglari-Abhari, M., and Niar, S. (2022). Adaptive real-time object detection for autonomous driving systems. J. Imaging, 8.","DOI":"10.3390\/jimaging8040106"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Li, K., Fathan, M.I., Patel, K., Zhang, T., Zhong, C., Bansal, A., Rastogi, A., Wang, J.S., and Wang, G. (2021). Colonoscopy Polyp Detection and Classification: Dataset Creation and Comparative Evaluations. 
arXiv.","DOI":"10.1371\/journal.pone.0255809"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Gosavi, D., Cheatham, B., and Sztuba-Solinska, J. (2022). Label-Free Detection of Human Coronaviruses in Infected Cells Using Enhanced Darkfield Hyperspectral Microscopy (EDHM). J. Imaging, 8.","DOI":"10.3390\/jimaging8020024"},{"key":"ref_14","first-page":"91","article-title":"Faster r-cnn: Towards real-time object detection with region proposal networks","volume":"28","author":"Ren","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201319). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 8\u201316). Ssd: Single shot multibox detector. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhang, S., Chi, C., Yao, Y., Lei, Z., and Li, S.Z. (2020, January 14\u201319). Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00978"},{"key":"ref_19","unstructured":"Zhu, B., Wang, J., Jiang, Z., Zong, F., Liu, S., Li, Z., and Sun, J. (2020). Autoassign: Differentiable label assignment for dense object detection. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ge, Z., Wang, J., Huang, X., Liu, S., and Yoshie, O. (2021). Lla: Loss-aware label assignment for dense pedestrian detection. arXiv.","DOI":"10.1016\/j.neucom.2021.07.094"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Ke, W., Zhang, T., Huang, Z., Ye, Q., Liu, J., and Huang, D. (2020, January 14\u201319). Multiple anchor learning for visual object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01022"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Li, X., Wang, W., Wu, L., Chen, S., Hu, X., Li, J., Tang, J., and Yang, J. (2020). Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. arXiv.","DOI":"10.1109\/CVPR46437.2021.01146"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhang, H., Wang, Y., Dayoub, F., and Sunderhauf, N. (2021, January 20\u201325). Varifocalnet: An iou-aware dense object detector. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00841"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. 
Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Li, K., Ma, W., Sajid, U., Wu, Y., and Wang, G. (2020). Object detection with convolutional neural networks. Deep Learning in Computer Vision, CRC Press.","DOI":"10.1201\/9781351003827-2"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"107149","DOI":"10.1016\/j.patcog.2019.107149","article-title":"Mdfn: Multi-scale deep feature learning network for object detection","volume":"100","author":"Ma","year":"2020","journal-title":"Pattern Recognit."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"993","DOI":"10.1007\/s11063-019-10124-7","article-title":"Adaptively denoising proposal collection for weakly supervised object localization","volume":"51","author":"Xu","year":"2020","journal-title":"Neural Process. Lett."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10846-020-01287-w","article-title":"Stereo frustums: A siamese pipeline for 3d object detection","volume":"101","author":"Mo","year":"2021","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Cai, Z., Fan, Q., Feris, R.S., and Vasconcelos, N. (2016). A unified multi-scale deep convolutional neural network for fast object detection. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46493-0_22"},{"key":"ref_30","unstructured":"Dai, J., Li, Y., He, K., and Sun, J. (2016, January 5\u201310). R-fcn: Object detection via region-based fully convolutional networks. Proceedings of the 2016 Advances in Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, faster, stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_33","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Law, H., and Deng, J. (2018, January 8\u201314). Cornernet: Detecting objects as paired keypoints. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_45"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., and Tian, Q. (2019, January 27\u201328). Centernet: Keypoint triplets for object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00667"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Zhou, X., Zhuo, J., and Krahenbuhl, P. (2019, January 27\u201328). Bottom-up object detection by grouping extreme and center points. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seoul, Korea.","DOI":"10.1109\/CVPR.2019.00094"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Zhu, C., He, Y., and Savvides, M. (2019, January 27\u201328). Feature selective anchor-free module for single-shot object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seoul, Korea.","DOI":"10.1109\/CVPR.2019.00093"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Tian, Z., Shen, C., Chen, H., and He, T. (2019, January 27\u201328). Fcos: Fully convolutional one-stage object detection. 
Proceedings of the IEEE\/CVF international conference on computer vision, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00972"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Patel, K., Bur, A.M., Li, F., and Wang, G. (2022). Aggregating Global Features into Local Vision Transformer. arXiv.","DOI":"10.1109\/ICPR56361.2022.9956379"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-end object detection with transformers. Proceedings of the European Conference on Computer Vision, Online.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_41","unstructured":"Zheng, M., Gao, P., Zhang, R., Li, K., Wang, X., Li, H., and Dong, H. (2020). End-to-end object detection with adaptive clustering transformer. arXiv."},{"key":"ref_42","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). Deformable detr: Deformable transformers for end-to-end object detection. arXiv."},{"key":"ref_43","unstructured":"Ma, W., Zhang, T., and Wang, G. (2021). Miti-DETR: Object Detection based on Transformers with Mitigatory Self-Attention Convergence. arXiv."},{"key":"ref_44","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the 2017 Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Meng, D., Chen, X., Fan, Z., Zeng, G., Li, H., Yuan, Y., Sun, L., and Wang, J. (2021, January 11\u201317). Conditional detr for fast training convergence. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00363"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Gao, P., Zheng, M., Wang, X., Dai, J., and Li, H. (2021). 
Fast convergence of detr with spatially modulated co-attention. arXiv.","DOI":"10.1109\/ICCV48922.2021.00360"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Kim, K., and Lee, H.S. (2020, January 23\u201328). Probabilistic anchor assignment with iou prediction for object detection. Proceedings of the Computer Vision\u2014ECCV 2020: 16th European Conference, Glasgow, UK.","DOI":"10.1007\/978-3-030-58595-2_22"},{"key":"ref_48","unstructured":"Zhang, X., Wan, F., Liu, C., Ji, R., and Ye, Q. (2019). Freeanchor: Learning to match anchors for visual object detection. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Li, X., Wang, W., Hu, X., Li, J., Tang, J., and Yang, J. (2021, January 19\u201324). Generalized focal loss v2: Learning reliable localization quality estimation for dense object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01146"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Li, Y., Chen, Y., Wang, N., and Zhang, Z. (2019, January 15\u201320). Scale-aware trident networks for object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Long Beach, CA, USA.","DOI":"10.1109\/ICCV.2019.00615"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Zhu, C., Chen, F., Shen, Z., and Savvides, M. (2020, January 23\u201328). Soft anchor-point object detection. 
Proceedings of the Computer Vision\u2014ECCV 2020: 16th European Conference, Glasgow, UK.","DOI":"10.1007\/978-3-030-58545-7_6"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Yang, Z., Liu, S., Hu, H., Wang, L., and Lin, S. (2019, January 15\u201320). Reppoints: Point set representation for object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Long Beach, CA, USA.","DOI":"10.1109\/ICCV.2019.00975"}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/8\/7\/193\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:47:59Z","timestamp":1760140079000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/8\/7\/193"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,11]]},"references-count":53,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2022,7]]}},"alternative-id":["jimaging8070193"],"URL":"https:\/\/doi.org\/10.3390\/jimaging8070193","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,11]]}}}