{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T07:20:55Z","timestamp":1772176855427,"version":"3.50.1"},"reference-count":74,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2021,9,3]],"date-time":"2021-09-03T00:00:00Z","timestamp":1630627200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Mobile robotics in forests is currently a hugely important topic due to the recurring appearance of forest wildfires. Thus, in-site management of forest inventory and biomass is required. To tackle this issue, this work presents a study on detection at the ground level of forest tree trunks in visible and thermal images using deep learning-based object detection methods. For this purpose, a forestry dataset composed of 2895 images was built and made publicly available. Using this dataset, five models were trained and benchmarked to detect the tree trunks. The selected models were SSD MobileNetV2, SSD Inception-v2, SSD ResNet50, SSDLite MobileDet and YOLOv4 Tiny. Promising results were obtained; for instance, YOLOv4 Tiny was the best model that achieved the highest AP (90%) and F1 score (89%). The inference time was also evaluated, for these models, on CPU and GPU. The results showed that YOLOv4 Tiny was the fastest detector running on GPU (8 ms). This work will enhance the development of vision perception systems for smarter forestry robots.<\/jats:p>","DOI":"10.3390\/jimaging7090176","type":"journal-article","created":{"date-parts":[[2021,9,6]],"date-time":"2021-09-06T13:15:56Z","timestamp":1630934156000},"page":"176","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":35,"title":["Visible and Thermal Image-Based Trunk Detection with Deep Learning for Forestry Mobile Robotics"],"prefix":"10.3390","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9999-1550","authenticated-orcid":false,"given":"Daniel Queir\u00f3s","family":"da Silva","sequence":"first","affiliation":[{"name":"INESC Technology and Science (INESC TEC), 4200-465 Porto, Portugal"},{"name":"School of Science and Technology, University of Tr\u00e1s-os-Montes e Alto Douro (UTAD), 5000-801 Vila Real, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8486-6113","authenticated-orcid":false,"given":"Filipe Neves","family":"dos Santos","sequence":"additional","affiliation":[{"name":"INESC Technology and Science (INESC TEC), 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0317-4714","authenticated-orcid":false,"given":"Armando Jorge","family":"Sousa","sequence":"additional","affiliation":[{"name":"INESC Technology and Science (INESC TEC), 4200-465 Porto, Portugal"},{"name":"Faculty of Engineering, University of Porto (FEUP), 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3747-6577","authenticated-orcid":false,"given":"V\u00edtor","family":"Filipe","sequence":"additional","affiliation":[{"name":"INESC Technology and Science (INESC TEC), 4200-465 Porto, Portugal"},{"name":"School of Science and Technology, University of Tr\u00e1s-os-Montes e Alto Douro (UTAD), 5000-801 Vila Real, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2021,9,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"10822","DOI":"10.3182\/20080706-5-KR-1001.01833","article-title":"BigDog, the Rough-Terrain Quadruped Robot","volume":"41","author":"Raibert","year":"2008","journal-title":"IFAC Proc. Vol."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Wooden, D., Malchano, M., Blankespoor, K., Howardy, A., Rizzi, A.A., and Raibert, M. (2010, January 3\u20137). Autonomous navigation for BigDog. Proceedings of the 2010 IEEE International Conference on Robotics and Automation, Anchorage, AK, USA.","DOI":"10.1109\/ROBOT.2010.5509226"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Alberts, J., Edwards, D., Soule, T., Anderson, M., and O\u2019Rourke, M. (2008, January 6\u20138). Autonomous Navigation of an Unmanned Ground Vehicle in Unstructured Forest Terrain. Proceedings of the 2008 ECSIS Symposium on Learning and Adaptive Behaviors for Robotic Systems (LAB-RS), Edinburgh, UK.","DOI":"10.1109\/LAB-RS.2008.25"},{"key":"ref_4","unstructured":"Teoh, C., Tan, C., Tan, Y.C., and Wang, X. (2010, January 28\u201330). Preliminary study on visual guidance for autonomous vehicle in rain forest terrain. Proceedings of the 2010 IEEE Conference on Robotics, Automation and Mechatronics, Singapore."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"350","DOI":"10.1080\/02827581.2011.566889","article-title":"Path tracking in forest terrain by an autonomous forwarder","volume":"26","author":"Ringdahl","year":"2011","journal-title":"Scand. J. For. Res."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ali, W., Georgsson, F., and Hellstrom, T. (2008, January 4\u20136). Visual tree detection for autonomous navigation in forest environment. Proceedings of the 2008 IEEE Intelligent Vehicles Symposium, Eindhoven, The Netherlands.","DOI":"10.1109\/IVS.2008.4621315"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1016\/j.ifacol.2019.12.517","article-title":"The development of autonomous navigation and obstacle avoidance for a robotic mower using machine vision technique","volume":"52","author":"Inoue","year":"2019","journal-title":"IFAC-PapersOnLine"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1007\/s10846-015-0292-1","article-title":"Autonomous Navigation of UAV in Foliage Environment","volume":"84","author":"Cui","year":"2016","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhilenkov, A.A., and Epifantsev, I.R. (February, January 29). System of autonomous navigation of the drone in difficult conditions of the forest trails. Proceedings of the 2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), Moscow and St. Petersburg, Russia.","DOI":"10.1109\/EIConRus.2018.8317266"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Michels, J., Saxena, A., and Ng, A.Y. (2005, January 7\u201311). High Speed Obstacle Avoidance Using Monocular Vision and Reinforcement Learning. Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany.","DOI":"10.1145\/1102351.1102426"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"480","DOI":"10.1016\/j.ifacol.2018.05.081","article-title":"Vision-based Control for Aerial Obstacle Avoidance in Forest Environments","volume":"51","author":"Mannar","year":"2018","journal-title":"IFAC-PapersOnLine"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Dionisio-Ortega, S., Rojas-Perez, L.O., Martinez-Carranza, J., and Cruz-Vega, I. (2018, January 21\u201323). A deep learning approach towards autonomous flight in forest environments. Proceedings of the 2018 International Conference on Electronics, Communications and Computers (CONIELECOMP), Cholula, Mexico.","DOI":"10.1109\/CONIELECOMP.2018.8327189"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Sampaio, G.S., Silva, L.A., and Marengoni, M. (2021). 3D Reconstruction of Non-Rigid Plants and Sensor Data Fusion for Agriculture Phenotyping. Sensors, 21.","DOI":"10.3390\/s21124115"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Bietresato, M., Carabin, G., D\u2019Auria, D., Gallo, R., Ristorto, G., Mazzetto, F., Vidoni, R., Gasparetto, A., and Scalera, L. (2016, January 29\u201331). A tracked mobile robotic lab for monitoring the plants volume and health. Proceedings of the 2016 12th IEEE\/ASME International Conference on Mechatronic and Embedded Systems and Applications (MESA), Auckland, New Zealand.","DOI":"10.1109\/MESA.2016.7587134"},{"key":"ref_15","first-page":"661","article-title":"A mobile laboratory for orchard health status monitoring in precision farming","volume":"58","author":"Ristorto","year":"2017","journal-title":"Chem. Eng. Trans."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Wang, L., Xiang, L., Tang, L., and Jiang, H. (2021). A Convolutional Neural Network-Based Method for Corn Stand Counting in the Field. Sensors, 21.","DOI":"10.3390\/s21020507"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Mendes, J., Neves dos Santos, F., Ferraz, N., Couto, P., and Morais, R. (2016, January 4\u20136). Vine Trunk Detector for a Reliable Robot Localization System. Proceedings of the 2016 International Conference on Autonomous Robot Systems and Competitions (ICARSC), Bragan\u00e7a, Portugal.","DOI":"10.1109\/ICARSC.2016.68"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"77308","DOI":"10.1109\/ACCESS.2020.2989052","article-title":"Visual Trunk Detection Using Transfer Learning and a Deep Learning-Based Coprocessor","volume":"8","author":"Aguiar","year":"2020","journal-title":"IEEE Access"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"105535","DOI":"10.1016\/j.compag.2020.105535","article-title":"Vineyard trunk detection using deep learning\u2014An experimental device benchmark","volume":"175","year":"2020","journal-title":"Comput. Electron. Agric."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Aguiar, A.S., Monteiro, N.N., Santos, F.N.D., Solteiro Pires, E.J., Silva, D., Sousa, A.J., and Boaventura-Cunha, J. (2021). Bringing Semantics to the Vineyard: An Approach on Deep Learning-Based Vine Trunk Detection. Agriculture, 11.","DOI":"10.3390\/agriculture11020131"},{"key":"ref_21","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). ImageNet Classification with Deep Convolutional Neural Networks. Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Itakura, K., and Hosoi, F. (2020). Automatic Tree Detection from Three-Dimensional Images Reconstructed from 360\u00b0 Spherical Camera Using YOLO v2. Remote Sens., 12.","DOI":"10.3390\/rs12060988"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"5395","DOI":"10.1109\/TIM.2019.2958580","article-title":"Detecting Trees in Street Images via Deep Learning With Attention Module","volume":"69","author":"Xie","year":"2020","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_24","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Proceedings of the Neural Information Processing Systems (NIPS), Montreal, QC, Canada."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"540","DOI":"10.2174\/1874110X01408010540","article-title":"A region-based image fusion algorithm for detecting trees in forests","volume":"8","author":"Yu","year":"2014","journal-title":"Open Cybern. Syst. J."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Wan Mohd Jaafar, W.S., Woodhouse, I.H., Silva, C.A., Omar, H., Abdul Maulud, K.N., Hudak, A.T., Klauberg, C., Cardil, A., and Mohan, M. (2018). Improving Individual Tree Crown Delineation and Attributes Estimation of Tropical Forests Using Airborne LiDAR Data. Forests, 9.","DOI":"10.3390\/f9120759"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1016\/j.isprsjprs.2020.11.016","article-title":"Combining graph-cut clustering with object-based stem detection for tree segmentation in highly dense airborne lidar point clouds","volume":"172","author":"Dersch","year":"2021","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1721","DOI":"10.3390\/f6051721","article-title":"A Benchmark of Lidar-Based Single Tree Detection Methods Using Heterogeneous Forest Data from the Alpine Space","volume":"6","author":"Eysn","year":"2015","journal-title":"Forests"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Dong, T., Zhou, Q., Gao, S., and Shen, Y. (2018). Automatic Detection of Single Trees in Airborne Laser Scanning Data through Gradient Orientation Clustering. Forests, 9.","DOI":"10.3390\/f9060291"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.3390\/f5051011","article-title":"Assessment of Low Density Full-Waveform Airborne Laser Scanning for Individual Tree Detection and Tree Species Classification","volume":"5","author":"Yu","year":"2014","journal-title":"Forests"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"118986","DOI":"10.1016\/j.foreco.2021.118986","article-title":"Application of conventional UAV-based high-throughput object detection to the early diagnosis of pine wilt disease by deep learning","volume":"486","author":"Wu","year":"2021","journal-title":"For. Ecol. Manag."},{"key":"ref_32","first-page":"1","article-title":"Measuring loblolly pine crowns with drone imagery through deep learning","volume":"32","author":"Lou","year":"2021","journal-title":"J. For. Res."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Li, W., Fu, H., and Yu, L. (2017, January 23\u201328). Deep convolutional neural network based large-scale oil palm tree detection for high-resolution remote sensing images. Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA.","DOI":"10.1109\/IGARSS.2017.8127085"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1016\/j.rse.2007.02.029","article-title":"Single tree detection in very high resolution remote sensing data","volume":"110","author":"Hirschmugl","year":"2007","journal-title":"Remote Sens. Environ."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"e005","DOI":"10.5424\/fs\/2018272-11713","article-title":"Estimating forest uniformity in Eucalyptus spp. and Pinus taeda L. stands using field measurements and structure from motion point clouds generated from unmanned aerial vehicle (UAV) data collection","volume":"27","author":"Silva","year":"2018","journal-title":"For. Syst."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Fujimoto, A., Haga, C., Matsui, T., Machimura, T., Hayashi, K., Sugita, S., and Takagi, H. (2019). An End to End Process Development for UAV-SfM Based Forest Monitoring: Individual Tree Detection, Species Classification and Carbon Dynamics Simulation. Forests, 10.","DOI":"10.3390\/f10080680"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"012041","DOI":"10.1088\/1755-1315\/37\/1\/012041","article-title":"Development of young oil palm tree recognition using Haar- based rectangular windows","volume":"37","author":"Daliman","year":"2016","journal-title":"IOP Conf. Ser. Earth Environ. Sci."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Tianyang, D., Jian, Z., Sibin, G., Ying, S., and Jing, F. (2018). Single-Tree Detection in High-Resolution Remote-Sensing Images Based on a Cascade Neural Network. ISPRS Int. J. Geo-Inf., 7.","DOI":"10.3390\/ijgi7090367"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"118397","DOI":"10.1016\/j.foreco.2020.118397","article-title":"Individual tree detection and species classification of Amazonian palms using UAV images and deep learning","volume":"475","author":"Ferreira","year":"2020","journal-title":"For. Ecol. Manag."},{"key":"ref_40","first-page":"1","article-title":"Detection of Diseased Pine Trees in Unmanned Aerial Vehicle Images by using Deep Convolutional Neural Networks","volume":"35","author":"Hu","year":"2020","journal-title":"Geocarto Int."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Wang, K., Wang, T., and Liu, X. (2019). A Review: Individual Tree Species Classification Using Integrated Airborne LiDAR and Optical Imagery with a Focus on the Urban Environment. Forests, 10.","DOI":"10.3390\/f10010001"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Surov\u00fd, P., and Ku\u017eelka, K. (2019). Acquisition of Forest Attributes for Decision Support at the Forest Enterprise Level Using Remote-Sensing Techniques\u2014A Review. Forests, 10.","DOI":"10.3390\/f10030273"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Deng, S., Katoh, M., Yu, X., Hyypp\u00e4, J., and Gao, T. (2016). Comparison of Tree Species Classifications at the Individual Tree Level by Combining ALS Data and RGB Images Using Different Algorithms. Remote Sens., 8.","DOI":"10.3390\/rs8121034"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Liu, J., Feng, Z., Yang, L., Mannan, A., Khan, T.U., Zhao, Z., and Cheng, Z. (2018). Extraction of Sample Plot Parameters from 3D Point Cloud Reconstruction Based on Combined RTK and CCD Continuous Photography. Remote Sens., 10.","DOI":"10.3390\/rs10081299"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"4415","DOI":"10.1109\/JSTARS.2019.2950721","article-title":"Characterizing Tree Species of a Tropical Wetland in Southern China at the Individual Tree Level Based on Convolutional Neural Network","volume":"12","author":"Sun","year":"2019","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016). SSD: Single Shot MultiBox Detector. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018, January 18\u201323). MobileNetV2: Inverted Residuals and Linear Bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_49","unstructured":"Ioffe, S., and Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Xiong, Y., Liu, H., Gupta, S., Akin, B., Bender, G., Wang, Y., Kindermans, P.J., Tan, M., Singh, V., and Chen, B. (2021). MobileDets: Searching for Object Detection Architectures for Mobile Accelerators. arXiv.","DOI":"10.1109\/CVPR46437.2021.00382"},{"key":"ref_51","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv."},{"key":"ref_52","unstructured":"Simonyan, K., and Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv."},{"key":"ref_53","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv."},{"key":"ref_54","unstructured":"Howard, A., Sandler, M., Chu, G., Chen, L.C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., and Vasudevan, V. (November, January 27). Searching for MobileNetV3. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2014). Going Deeper with Convolutions. arXiv.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2015). Rethinking the Inception Architecture for Computer Vision. arXiv.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 22\u201325). YOLO9000: Better, faster, stronger. Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_59","unstructured":"Redmon, J., and Farhadi, A. (2020, June 09). YOLO v.3. Technical Report, University of Washington. Available online: https:\/\/pjreddie.com\/media\/files\/papers\/YOLOv3.pdf."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Liao, H.Y.M., Yeh, I.H., Wu, Y.H., Chen, P.Y., and Hsieh, J.W. (2020, January 16\u201318). CSPNet: A New Backbone that can Enhance Learning Capability of CNN. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Virtual.","DOI":"10.1109\/CVPRW50498.2020.00203"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2014). Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-10578-9_23"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201323). Path Aggregation Network for Instance Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1007\/s13735-020-00195-x","article-title":"A survey on instance segmentation: State of the art","volume":"9","author":"Hafiz","year":"2020","journal-title":"Int. J. Multimed. Inf. Retr."},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast R-CNN. Proceedings of the International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask R-CNN. Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Hermans, A., Papandreou, G., Schroff, F., Wang, P., and Adam, H. (2018, January 18\u201323). MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00422"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Huang, Z., Huang, L., Gong, Y., Huang, C., and Wang, X. (2019). Mask Scoring R-CNN. arXiv.","DOI":"10.1109\/CVPR.2019.00657"},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017). Feature Pyramid Networks for Object Detection. arXiv.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_70","doi-asserted-by":"crossref","unstructured":"Bolya, D., Zhou, C., Xiao, F., and Lee, Y.J. (2019). YOLACT: Real-time Instance Segmentation. arXiv.","DOI":"10.1109\/ICCV.2019.00925"},{"key":"ref_71","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The Pascal Visual Object Classes (VOC) Challenge","volume":"8","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Bourdev, L., Girshick, R., Hays, J., Perona, P., Ramanan, D., Zitnick, C.L., and Doll\u00e1r, P. (2015). Microsoft COCO: Common Objects in Context. arXiv.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_73","doi-asserted-by":"crossref","unstructured":"Padilla, R., Netto, S.L., and da Silva, E.A.B. (2020, January 1\u20133). A Survey on Performance Metrics for Object-Detection Algorithms. Proceedings of the 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), Niteroi, Brazil.","DOI":"10.1109\/IWSSIP48289.2020.9145130"},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"Padilla, R., Passos, W.L., Dias, T.L.B., Netto, S.L., and da Silva, E.A.B. (2021). A Comparative Analysis of Object Detection Metrics with a Companion Open-Source Toolkit. Electronics, 10.","DOI":"10.3390\/electronics10030279"}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/7\/9\/176\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:56:10Z","timestamp":1760165770000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/7\/9\/176"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,3]]},"references-count":74,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2021,9]]}},"alternative-id":["jimaging7090176"],"URL":"https:\/\/doi.org\/10.3390\/jimaging7090176","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,3]]}}}