{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T21:09:43Z","timestamp":1770844183727,"version":"3.50.1"},"reference-count":45,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2022,10,22]],"date-time":"2022-10-22T00:00:00Z","timestamp":1666396800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Portuguese national project VOAMAIS","award":["Lisboa-01-0145-FEDER-031172"],"award-info":[{"award-number":["Lisboa-01-0145-FEDER-031172"]}]},{"name":"Portuguese national project VOAMAIS","award":["PTDC\/EEI-AUT\/31172\/2017"],"award-info":[{"award-number":["PTDC\/EEI-AUT\/31172\/2017"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>This work proposes a new system capable of real-time ship instance segmentation during maritime surveillance missions by unmanned aerial vehicles using an onboard standard RGB camera. The implementation requires two stages: an instance segmentation network able to produce fast and reliable preliminary segmentation results and a post-processing 3D fully connected Conditional Random Field, which significantly improves segmentation results by exploring temporal correlations between nearby frames in video sequences. Moreover, due to the absence of maritime datasets consisting of properly labeled video sequences, we create a new dataset comprising synthetic video sequences of maritime surveillance scenarios (MarSyn). The main advantages of this approach are the possibility of generating a vast set of images and videos, being able to represent real-world scenarios without the necessity of deploying the real vehicle, and automatic labels, which eliminate human labeling errors. We train the system with the MarSyn dataset and with aerial footage from publicly available annotated maritime datasets to validate the proposed approach. We present some experimental results and compare them to other approaches, and we also illustrate the temporal stability provided by the second stage in missing frames and wrong segmentation scenarios.<\/jats:p>","DOI":"10.3390\/s22218090","type":"journal-article","created":{"date-parts":[[2022,10,24]],"date-time":"2022-10-24T10:09:23Z","timestamp":1666606163000},"page":"8090","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Real-Time Ship Segmentation in Maritime Surveillance Videos Using Automatically Annotated Synthetic Datasets"],"prefix":"10.3390","volume":"22","author":[{"given":"Miguel","family":"Ribeiro","sequence":"first","affiliation":[{"name":"ISR\u2014Institute for Systems and Robotics, 1049-001 Lisboa, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8356-2962","authenticated-orcid":false,"given":"Bruno","family":"Damas","sequence":"additional","affiliation":[{"name":"ISR\u2014Institute for Systems and Robotics, 1049-001 Lisboa, Portugal"},{"name":"CINAV\u2014Centro de Investiga\u00e7\u00e3o Naval, 2810-001 Almada, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3991-1269","authenticated-orcid":false,"given":"Alexandre","family":"Bernardino","sequence":"additional","affiliation":[{"name":"ISR\u2014Institute for Systems and Robotics, 1049-001 Lisboa, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2022,10,22]]},"reference":[{"key":"ref_1","unstructured":"United Nations Review of Maritime Transport 2018, UN. Available online: https:\/\/unctad.org\/system\/files\/official-document\/rmt2018_en.pdf."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Gallego, A.J., Pertusa, A., and Gil, P. (2018). Automatic Ship Classification from Optical Aerial Images with Convolutional Neural Networks. Remote Sens., 10.","DOI":"10.3390\/rs10040511"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Huo, W., Huang, Y., Pei, J., Zhang, Q., Gu, Q., and Yang, J. (2018). Ship Detection from Ocean SAR Image Based on Local Contrast Variance Weighted Information Entropy. Sensors, 18.","DOI":"10.3390\/s18041196"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2720","DOI":"10.1109\/TCSVT.2017.2775524","article-title":"A Data Set for Airborne Maritime Surveillance Environments","volume":"29","author":"Ribeiro","year":"2019","journal-title":"IEEE Trans. Circ. Syst. Video Technol."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Galdelli, A., Mancini, A., Ferr\u00e0, C., and Tassetti, A.N. (2021). A Synergic Integration of AIS Data and SAR Imagery to Monitor Fisheries and Detect Suspicious Activities. Sensors, 21.","DOI":"10.3390\/s21082756"},{"key":"ref_6","unstructured":"Airbus (2020, January 06). Airbus Ship Detection Challenge. Available online: https:\/\/www.kaggle.com\/c\/airbus-ship-detection\/overview."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Teixeira, E., Araujo, B., Costa, V., Mafra, S., and Figueiredo, F. (2022). Literature Review on Ship Localization, Classification, and Detection Methods Based on Optical Sensors and Neural Networks. Sensors, 22.","DOI":"10.3390\/s22186879"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Cruz, G., and Bernardino, A. (2016, January 24\u201327). Aerial detection in maritime scenarios using convolutional neural networks. Proceedings of the International Conference on Advanced Concepts for Intelligent Vision Systems, Lecce, Italy.","DOI":"10.1007\/978-3-319-48680-2_33"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Kang, M., Leng, X., Lin, Z., and Ji, K. (2017, January 19\u201321). A modified faster R-CNN based on CFAR algorithm for SAR ship detection. Proceedings of the 2017 International Workshop on Remote Sensing with Intelligent Processing (RSIP), Shanghai, China.","DOI":"10.1109\/RSIP.2017.7958815"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Yang, X., Sun, H., Fu, K., Yang, J., Sun, X., Yan, M., and Guo, Z. (2018). Automatic Ship Detection in Remote Sensing Images from Google Earth of Complex Scenes Based on Multiscale Rotation Dense Feature Pyramid Networks. Remote Sens., 10.","DOI":"10.3390\/rs10010132"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Matos, J., Bernardino, A., and Ribeiro, R. (2016, January 19\u201323). Robust tracking of vessels in oceanographic airborne images. Proceedings of the OCEANS 2016 MTS\/IEEE Monterey, Monterey, CA, USA.","DOI":"10.1109\/OCEANS.2016.7761468"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1109\/TSMC.1979.4310076","article-title":"A Threshold Selection Method from Gray-Level Histograms","volume":"9","author":"Otsu","year":"1979","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_13","unstructured":"Cruz, G., and Bernardino, A. (2017, January 18\u201321). Evaluating aerial vessel detector in multiple maritime surveillance scenarios. Proceedings of the OCEANS 2017, Anchorage, AL, USA."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"6565","DOI":"10.1109\/TGRS.2019.2907277","article-title":"Learning Temporal Features for Detection on Maritime Airborne Video Sequences Using Convolutional LSTM","volume":"57","author":"Cruz","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully Convolutional Networks for Semantic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Dollar, P., and Girshick, R. (2017, January 22\u201329). Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_19","unstructured":"Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., and Garnett, R. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Advances in Neural Information Processing Systems 28, Curran Associates, Inc."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs","volume":"40","author":"Chen","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1108","DOI":"10.1109\/TPAMI.2020.3014297","article-title":"YOLACT++ Better Real-Time Instance Segmentation","volume":"44","author":"Bolya","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_22","unstructured":"Venkatesh, R., and M, A. (2019). Segmenting Ships in Satellite Imagery with Squeeze and Excitation U-Net. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"9325","DOI":"10.1109\/ACCESS.2020.2964540","article-title":"Attention Mask R-CNN for Ship Detection and Segmentation From Remote Sensing Images","volume":"8","author":"Nie","year":"2020","journal-title":"IEEE Access"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"31942","DOI":"10.1109\/ACCESS.2022.3159667","article-title":"An Efficient Cascaded Model for Ship Segmentation in Aerial Images","volume":"10","author":"Pires","year":"2022","journal-title":"IEEE Access"},{"key":"ref_25","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Milletari, F., Navab, N., and Ahmadi, S.A. (2016, January 25\u201328). V-net: Fully convolutional neural networks for volumetric medical image segmentation. Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA.","DOI":"10.1109\/3DV.2016.79"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Liu, Y., Shen, C., Yu, C., and Wang, J. (2020, January 23\u201328). Efficient semantic video segmentation with per-frame inference. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58607-2_21"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Bloisi, D.D., Iocchi, L., Pennisi, A., and Tombolini, L. (2015, January 25\u201328). ARGOS-Venice Boat Classification. Proceedings of the 2015 12th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Karlsruhe, Germany.","DOI":"10.1109\/AVSS.2015.7301727"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Gundogdu, E., Solmaz, B., Y\u00fccesoy, V., and Koc, A. (2016, January 20\u201324). Marvel: A large-scale image dataset for maritime vessels. Proceedings of the Asian Conference on Computer Vision, Taipei, Taiwan.","DOI":"10.1007\/978-3-319-54193-8_11"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1993","DOI":"10.1109\/TITS.2016.2634580","article-title":"Video Processing From Electro-Optical Sensors for Object Detection and Tracking in a Maritime Environment: A Survey","volume":"18","author":"Prasad","year":"2017","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"2593","DOI":"10.1109\/TMM.2018.2865686","article-title":"SeaShips: A Large-Scale Precisely Annotated Dataset for Ship Detection","volume":"20","author":"Shao","year":"2018","journal-title":"IEEE Trans. Multimed."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Iancu, B., Soloviev, V., Zelioli, L., and Lilius, J. (2021). Aboships-an inshore and offshore maritime vessel detection dataset with precise annotations. Remote Sens., 13.","DOI":"10.3390\/rs13050988"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Di, Y., Jiang, Z., and Zhang, H. (2021). A public dataset for fine-grained ship classification in optical remote sensing images. Remote Sens., 13.","DOI":"10.3390\/rs13040747"},{"key":"ref_34","unstructured":"Kr\u00e4henb\u00fchl, P., and Koltun, V. (2011, January 12\u201315). Efficient inference in fully connected crfs with gaussian edge potentials. Proceedings of the 24th International Conference on Neural Information Processing Systems, Granada, Spain."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1186\/s40537-019-0197-0","article-title":"A survey on image data augmentation for deep learning","volume":"6","author":"Shorten","year":"2019","journal-title":"J. Big Data"},{"key":"ref_36","unstructured":"Armstrong, W., Draktontaidis, S., and Lui, N. (2021). Semantic Image Segmentation of Imagery of Unmanned Spacecraft Using Synthetic Data, Stanford University. Technical Report."},{"key":"ref_37","unstructured":"Community, B.O. (2018). Blender\u2014A 3D Modelling and Rendering Package, Blender Foundation, Stichting Blender Foundation."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Liu, D., Xie, S., Li, Y., Zhao, D., and El-Alfy, E.S.M. (2017). Training Deep Neural Networks for Detecting Drinking Glasses Using Synthetic Images. Neural Information Processing, Springer International Publishing.","DOI":"10.1007\/978-3-319-70139-4"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhao, K., Zhang, R., and Ji, J. (2021). A Cascaded Model Based on EfficientDet and YOLACT++ for Instance Segmentation of Cow Collar ID Tag in an Image. Sensors, 21.","DOI":"10.3390\/s21206734"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Huang, M., Xu, G., Li, J., and Huang, J. (2021). A Method for Segmenting Disease Lesions of Maize Leaves in Real Time Using Attention YOLACT++. Agriculture, 11.","DOI":"10.3390\/agriculture11121216"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Dollar, P. (2017, January 22\u201329). Focal Loss for Dense Object Detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Dollar, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016). SSD: Single Shot MultiBox Detector. Computer Vision\u2014ECCV 2016, Springer International Publishing.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/j.media.2016.10.004","article-title":"Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation","volume":"36","author":"Kamnitsas","year":"2017","journal-title":"Med. Image Anal."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/21\/8090\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:00:48Z","timestamp":1760144448000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/21\/8090"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,22]]},"references-count":45,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["s22218090"],"URL":"https:\/\/doi.org\/10.3390\/s22218090","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,22]]}}}