{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T01:32:06Z","timestamp":1760059926585,"version":"build-2065373602"},"reference-count":179,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2025,7,18]],"date-time":"2025-07-18T00:00:00Z","timestamp":1752796800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia (FCT)","doi-asserted-by":"publisher","award":["UIDB\/04111\/2020","UIDB\/00066\/2020","UID\/00408\/2025","CEECINST\/00002\/2021\/CP2788\/CT0001","COFAC\/ILIND\/COPELABS\/1\/2024"],"award-info":[{"award-number":["UIDB\/04111\/2020","UIDB\/00066\/2020","UID\/00408\/2025","CEECINST\/00002\/2021\/CP2788\/CT0001","COFAC\/ILIND\/COPELABS\/1\/2024"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"Centro de Tecnologias e Sistemas (CTS)","doi-asserted-by":"publisher","award":["UIDB\/04111\/2020","UIDB\/00066\/2020","UID\/00408\/2025","CEECINST\/00002\/2021\/CP2788\/CT0001","COFAC\/ILIND\/COPELABS\/1\/2024"],"award-info":[{"award-number":["UIDB\/04111\/2020","UIDB\/00066\/2020","UID\/00408\/2025","CEECINST\/00002\/2021\/CP2788\/CT0001","COFAC\/ILIND\/COPELABS\/1\/2024"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"name":"LASIGE Research 
Unit","award":["UIDB\/04111\/2020","UIDB\/00066\/2020","UID\/00408\/2025","CEECINST\/00002\/2021\/CP2788\/CT0001","COFAC\/ILIND\/COPELABS\/1\/2024"],"award-info":[{"award-number":["UIDB\/04111\/2020","UIDB\/00066\/2020","UID\/00408\/2025","CEECINST\/00002\/2021\/CP2788\/CT0001","COFAC\/ILIND\/COPELABS\/1\/2024"]}]},{"DOI":"10.13039\/501100001871","name":"COFAC","doi-asserted-by":"publisher","award":["UIDB\/04111\/2020","UIDB\/00066\/2020","UID\/00408\/2025","CEECINST\/00002\/2021\/CP2788\/CT0001","COFAC\/ILIND\/COPELABS\/1\/2024"],"award-info":[{"award-number":["UIDB\/04111\/2020","UIDB\/00066\/2020","UID\/00408\/2025","CEECINST\/00002\/2021\/CP2788\/CT0001","COFAC\/ILIND\/COPELABS\/1\/2024"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Instituto Lus\u00f3fono de Investiga\u00e7\u00e3o e Desenvolvimento (ILIND)","award":["UIDB\/04111\/2020","UIDB\/00066\/2020","UID\/00408\/2025","CEECINST\/00002\/2021\/CP2788\/CT0001","COFAC\/ILIND\/COPELABS\/1\/2024"],"award-info":[{"award-number":["UIDB\/04111\/2020","UIDB\/00066\/2020","UID\/00408\/2025","CEECINST\/00002\/2021\/CP2788\/CT0001","COFAC\/ILIND\/COPELABS\/1\/2024"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Eng"],"abstract":"<jats:p>Semantic segmentation is a vast field with many contributions, which can be difficult to organize and comprehend due to the amount of research available. Advancements in technology and processing power over the past decade have led to a significant increase in the number of developed models and architectures. This paper provides a brief perspective on 2D segmentation by summarizing the mechanisms of various neural network models and the tools and datasets used for their training, testing, and evaluation. 
Additionally, this paper discusses methods for identifying new architectures, such as Neural Architecture Search, and explores the emerging research field of continuous learning, which aims to develop models capable of learning continuously from new data.<\/jats:p>","DOI":"10.3390\/eng6070165","type":"journal-article","created":{"date-parts":[[2025,7,18]],"date-time":"2025-07-18T14:03:51Z","timestamp":1752847431000},"page":"165","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A Brief Perspective on Deep Learning Approaches for 2D Semantic Segmentation"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3922-2124","authenticated-orcid":false,"given":"Shazia","family":"Sulemane","sequence":"first","affiliation":[{"name":"Escola de Comunica\u00e7\u00e3o, Arquitectura, Artes e Tecnologias da Informa\u00e7\u00e3o (ECATI), Lus\u00f3fona University, 1749-024 Lisboa, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8487-5837","authenticated-orcid":false,"given":"Nuno","family":"Fachada","sequence":"additional","affiliation":[{"name":"Escola de Comunica\u00e7\u00e3o, Arquitectura, Artes e Tecnologias da Informa\u00e7\u00e3o (ECATI), Lus\u00f3fona University, 1749-024 Lisboa, Portugal"},{"name":"Center of Technology and Systems (UNINOVA-CTS) and Associated Lab of Intelligent Systems (LASI), 2829-516 Caparica, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9409-7736","authenticated-orcid":false,"given":"Jo\u00e3o P.","family":"Matos-Carvalho","sequence":"additional","affiliation":[{"name":"Center of Technology and Systems (UNINOVA-CTS) and Associated Lab of Intelligent Systems (LASI), 2829-516 Caparica, Portugal"},{"name":"LASIGE, Faculdade de Ci\u00eancias, Universidade de Lisboa, 1749-016 Lisboa, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2025,7,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Long, J., 
Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Xian, M., Zhang, Y., Cheng, H., Xu, F., Zhang, B., and Ding, J. (2017). Automatic Breast Ultrasound Image Segmentation: A Survey. arXiv.","DOI":"10.1016\/j.patcog.2018.02.012"},{"key":"ref_3","first-page":"167","article-title":"Real-time object detection and semantic segmentation for autonomous driving","volume":"Volume 10608","author":"Liu","year":"2018","journal-title":"Proceedings of the MIPPR 2017: Automatic Target Recognition and Navigation"},{"key":"ref_4","unstructured":"Yasuno, M., Yasuda, N., and Aoki, M. (July, January 27). Pedestrian detection and tracking in far infrared images. Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop, Washington, DC, USA."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Jain, L.C., Peng, S.L., Alhadidi, B., and Pal, S. (2019, January 30\u201331). Pedestrian Detection\u2014A Survey. Proceedings of the First International Conference on Innovative Computing and Cutting-Edge Technologies (ICICCT 2019), Istanbul, Turkey.","DOI":"10.1007\/978-3-030-38501-9"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Deepak, G.D., and Bhat, S.K. (2024). A comparative study of breast tumour detection using a semantic segmentation network coupled with different pretrained CNNs. Comput. Methods Biomech. Biomed. Eng. Imaging Vis., 12.","DOI":"10.1080\/21681163.2024.2373996"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1016\/S0146-664X(81)80015-9","article-title":"A segmentation system based on thresholding","volume":"15","author":"Kohler","year":"1981","journal-title":"Comput. Graph. 
Image Process."},{"key":"ref_8","unstructured":"Rueda, L., Mery, D., and Kittler, J. (2007). Image Segmentation Using Automatic Seeded Region Growing and Instance-Based Learning. Progress in Pattern Recognition, Image Analysis and Applications, Springer Nature."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"108027","DOI":"10.1016\/j.mineng.2023.108027","article-title":"Processing of micro-CT images of granodiorite rock samples using convolutional neural networks (CNN), Part II: Semantic segmentation using a 2.5D CNN","volume":"195","author":"Roslin","year":"2023","journal-title":"Miner. Eng."},{"key":"ref_10","unstructured":"Lapa, P.A.F. (2019). Conditional Random Fields Improve the CNN-Based Prostate Cancer Classification Performance. [Master\u2019s Thesis, NOVA Information Management School]."},{"key":"ref_11","unstructured":"Zhang, M., Dong, B., and Li, Q. (2020, January 4\u20138). Deep active contour network for medical image segmentation. Proceedings of the Medical Image Computing and Computer Assisted Intervention\u2013MICCAI 2020: 23rd International Conference, Lima, Peru. Proceedings, Part IV 23."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Li, P., Xia, H., Zhou, B., Yan, F., and Guo, R. (2022). A Method to Improve the Accuracy of Pavement Crack Identification by Combining a Semantic Segmentation and Edge Detection Model. Appl. Sci., 12.","DOI":"10.3390\/app12094714"},{"key":"ref_13","unstructured":"Yuheng, S., and Hao, Y. (2017). Image Segmentation Algorithms Overview. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"041205","DOI":"10.1117\/1.JEI.31.4.041205","article-title":"Review of object instance segmentation based on deep learning","volume":"31","author":"Tian","year":"2021","journal-title":"J. Electron. Imaging"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Kim, D., Woo, S., Lee, J., and Kweon, I.S. (2020). Video Panoptic Segmentation. 
arXiv.","DOI":"10.1109\/CVPR42600.2020.00988"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Kirillov, A., He, K., Girshick, R., Rother, C., and Dollar, P. (2019, January 15\u201320). Panoptic Segmentation. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00963"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (2016, January 27\u201330). The Cityscapes Dataset for Semantic Urban Scene Understanding. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.350"},{"key":"ref_18","unstructured":"Prakash, V.J., and Nithya, L.M. (2014). A Survey on Semi-Supervised Learning Techniques. arXiv."},{"key":"ref_19","first-page":"22243","article-title":"Big Self-Supervised Models are Strong Semi-Supervised Learners","volume":"Volume 33","author":"Larochelle","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"3054","DOI":"10.1109\/TCSVT.2024.3508768","article-title":"Pseudo Labeling Methods for Semi-Supervised Semantic Segmentation: A Review and Future Perspectives","volume":"35","author":"Ran","year":"2025","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"5782","DOI":"10.1109\/JSTARS.2022.3203750","article-title":"Semi-supervised deep learning via transformation consistency regularization for remote sensing image semantic segmentation","volume":"16","author":"Zhang","year":"2023","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_22","unstructured":"Xie, J., Shuai, B., Hu, J.F., Lin, J., and Zheng, W.S. (2018). Improving fast segmentation with teacher-student learning. 
arXiv."},{"key":"ref_23","unstructured":"Wang, W., Zhou, T., Porikli, F., Crandall, D.J., and Gool, L.V. (2021). A Survey on Deep Learning Technique for Video Segmentation. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Jung, S., Heo, H., Park, S., Jung, S.U., and Lee, K. (2022). Benchmarking Deep Learning Models for Instance Segmentation. Appl. Sci., 12.","DOI":"10.3390\/app12178856"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Portillo-Portillo, J., Sanchez-Perez, G., Toscano-Medina, L.K., Hernandez-Suarez, A., Olivares-Mercado, J., Perez-Meana, H., Velarde-Alvarado, P., Orozco, A.L.S., and Garc\u00eda Villalba, L.J. (2022). FASSVid: Fast and Accurate Semantic Segmentation for Video Sequences. Entropy, 24.","DOI":"10.3390\/e24070942"},{"key":"ref_26","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_28","unstructured":"Dumoulin, V., and Visin, F. (2018). A guide to convolution arithmetic for deep learning. arXiv."},{"key":"ref_29","first-page":"3523","article-title":"Image Segmentation Using Deep Learning: A Survey","volume":"44","author":"Minaee","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF00344251","article-title":"Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position","volume":"36","author":"Fukushima","year":"2004","journal-title":"Biol. 
Cybern."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Albawi, S., Mohammed, T.A., and Al-Zawi, S. (2017, January 21\u201323). Understanding of a convolutional neural network. Proceedings of the 2017 International Conference on Engineering and Technology (ICET), Antalya, Turkey.","DOI":"10.1109\/ICEngTechnol.2017.8308186"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Diamantaras, K., Duch, W., and Iliadis, L.S. (2010, January 15\u201318). Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition. Proceedings of the Artificial Neural Networks\u2014ICANN 2010, Thessaloniki, Greece.","DOI":"10.1007\/978-3-642-15825-4"},{"key":"ref_33","unstructured":"O\u2019Shea, K., and Nash, R. (2015). An Introduction to Convolutional Neural Networks. arXiv."},{"key":"ref_34","unstructured":"Xing, E.P., and Jebara, T. (2014, January 22\u201324). Signal recovery from Pooling Representations. Proceedings of the 31st International Conference on Machine Learning, Beijing, China. Proceedings of Machine Learning Research."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1007\/s10916-017-0845-x","article-title":"Alcoholism detection by data augmentation and Convolutional Neural Network with stochastic pooling","volume":"42","author":"Wang","year":"2017","journal-title":"J. Med. Syst."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Kassani, S.H., Kassani, P.H., Wesolowski, M.J., Schneider, K.A., and Deters, R. (2019, January 16\u201318). Breast Cancer Diagnosis with Transfer Learning and Global Pooling. 
Proceedings of the 2019 International Conference on Information and Communication Technology Convergence (ICTC), Jeju, Republic of Korea.","DOI":"10.1109\/ICTC46691.2019.8939878"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"32826","DOI":"10.1109\/ACCESS.2020.2974027","article-title":"Attention guided u-net with atrous convolution for accurate retinal vessels segmentation","volume":"8","author":"Lv","year":"2020","journal-title":"IEEE Access"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Qiao, S., Chen, L.C., and Yuille, A. (2021, January 20\u201325). DetectoRS: Detecting objects with recursive feature pyramid and switchable atrous convolution. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01008"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1687","DOI":"10.1109\/JSTARS.2020.2969809","article-title":"EmergencyNet: Efficient aerial image classification for drone-based emergency monitoring using atrous convolutional feature fusion","volume":"13","author":"Kyrkou","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Zhou, Y., Chang, H., Lu, Y., and Lu, X. (2022). CDTNet: Improved image classification method using standard, Dilated and Transposed Convolutions. Appl. Sci., 12.","DOI":"10.3390\/app12125984"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Odena, A., Dumoulin, V., and Olah, C. (2016). Deconvolution and Checkerboard Artifacts. Distill.","DOI":"10.23915\/distill.00003"},{"key":"ref_42","first-page":"1218","article-title":"Pixel transposed convolutional networks","volume":"42","author":"Gao","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/s40649-019-0069-y","article-title":"Graph convolutional networks: A comprehensive review","volume":"6","author":"Zhang","year":"2019","journal-title":"Comput. Soc. Netw."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"2319","DOI":"10.1126\/science.290.5500.2319","article-title":"A Global Geometric Framework for Nonlinear Dimensionality Reduction","volume":"290","author":"Tenenbaum","year":"2000","journal-title":"Science"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"2323","DOI":"10.1126\/science.290.5500.2323","article-title":"Nonlinear Dimensionality Reduction by Locally Linear Embedding","volume":"290","author":"Roweis","year":"2000","journal-title":"Science"},{"key":"ref_46","unstructured":"Dietterich, T., Becker, S., and Ghahramani, Z. (2001). Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering. Advances in Neural Information Processing Systems, MIT Press."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Perozzi, B., Al-Rfou, R., and Skiena, S. (2014). DeepWalk: Online Learning of Social Representations. arXiv.","DOI":"10.1145\/2623330.2623732"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Grover, A., and Leskovec, J. (2016). node2vec: Scalable Feature Learning for Networks. arXiv.","DOI":"10.1145\/2939672.2939754"},{"key":"ref_49","unstructured":"Chaudhuri, K., and Salakhutdinov, R. (2019, January 9\u201315). Simplifying Graph Convolutional Networks. Proceedings of the 36th International Conference on Machine Learning, PMLR, Long Beach, CA, USA. Proceedings of Machine Learning Research."},{"key":"ref_50","unstructured":"Daum\u00e9, H., and Singh, A. (2020, January 13\u201318). Simple and Deep Graph Convolutional Networks. Proceedings of the 37th International Conference on Machine Learning, PMLR, Online. 
Proceedings of Machine Learning Research."},{"key":"ref_51","unstructured":"Zhang, L., Li, X., Arnab, A., Yang, K., Tong, Y., and Torr, P.H.S. (2019). Dual Graph Convolutional Network for Semantic Segmentation. arXiv."},{"key":"ref_52","first-page":"549","article-title":"Rumor Detection on Social Media with Bi-Directional Graph Convolutional Networks","volume":"34","author":"Bian","year":"2020","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Tetko, I.V., K\u016frkov\u00e1, V., Karpov, P., and Theis, F. (2019, January 17\u201319). Graph Convolutional Networks Improve the Prediction of Cancer Driver Genes. Proceedings of the Artificial Neural Networks and Machine Learning\u2014ICANN 2019: Workshop and Special Sessions, Munich, Germany.","DOI":"10.1007\/978-3-030-30493-5"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Wang, H., Zhao, M., Xie, X., Li, W., and Guo, M. (2019, January 13\u201317). Knowledge Graph Convolutional Networks for Recommender Systems. Proceedings of The World Wide Web Conference (WWW \u201919), San Francisco, CA, USA.","DOI":"10.1145\/3308558.3313417"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"919","DOI":"10.1093\/bib\/bbz042","article-title":"Graph convolutional networks for computational drug development and discovery","volume":"21","author":"Sun","year":"2019","journal-title":"Briefings Bioinform."},{"key":"ref_56","first-page":"7370","article-title":"Graph Convolutional Networks for Text Classification","volume":"33","author":"Yao","year":"2019","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Ghosh, S., Das, N., Das, I., and Maulik, U. (2019). Understanding Deep Learning Techniques for Image Segmentation. 
arXiv.","DOI":"10.1145\/3329784"},{"key":"ref_58","unstructured":"Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M.C.H., Heinrich, M.P., Misawa, K., Mori, K., McDonagh, S.G., Hammerla, N.Y., and Kainz, B. (2018). Attention U-Net: Learning Where to Look for the Pancreas. arXiv."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"014006","DOI":"10.1117\/1.JMI.6.1.014006","article-title":"Recurrent residual U-Net for medical image segmentation","volume":"6","author":"Alom","year":"2019","journal-title":"J. Med. Imaging"},{"key":"ref_60","unstructured":"Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A.L., and Zhou, Y. (2021). TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation. arXiv."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Alom, M.Z., Hasan, M., Yakopcic, C., Taha, T.M., and Asari, V.K. (2018). Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image Segmentation. arXiv.","DOI":"10.1109\/NAECON.2018.8556686"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017). Feature Pyramid Networks for Object Detection. arXiv.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Hu, M., Li, Y., Fang, L., and Wang, S. (2021, January 20\u201325). A2-FPN: Attention Aggregation Based Feature Pyramid Network for Instance Segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01509"},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Girshick, R., He, K., and Dollar, P. (2019, January 15\u201320). Panoptic Feature Pyramid Networks. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00656"},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"ref_66","unstructured":"Chiappa, S., and Calandra, R. (2020, January 26\u201328). Beyond exploding and vanishing gradients: Analysing RNN training using attractors and smoothness. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR, Online. Proceedings of Machine Learning Research."},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-Term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Cho, K., van Merrienboer, B., G\u00fcl\u00e7ehre, \u00c7., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv.","DOI":"10.3115\/v1\/D14-1179"},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Miao, Y., Gowayyed, M., and Metze, F. (2015, January 13\u201317). EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding. Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, USA.","DOI":"10.1109\/ASRU.2015.7404790"},{"key":"ref_70","doi-asserted-by":"crossref","unstructured":"Mikolov, T., Kombrink, S., Burget, L., \u010cernock\u00fd, J., and Khudanpur, S. (2011, January 22\u201327). Extensions of recurrent neural network language model. 
Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic.","DOI":"10.1109\/ICASSP.2011.5947611"},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Li, D., and Qian, J. (2016, January 13\u201315). Text sentiment analysis based on long short-term memory. Proceedings of the 2016 First IEEE International Conference on Computer Communication and the Internet (ICCCI), Wuhan, China.","DOI":"10.1109\/CCI.2016.7778967"},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Visin, F., Romero, A., Cho, K., Matteucci, M., Ciccone, M., Kastner, K., Bengio, Y., and Courville, A. (July, January 26). ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, USA.","DOI":"10.1109\/CVPRW.2016.60"},{"key":"ref_73","unstructured":"Bengio, Y., and LeCun, Y. (2015, January 7\u20139). Very Deep Convolutional Networks for Large-Scale Image Recognition. Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA."},{"key":"ref_74","unstructured":"Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N., and Weinberger, K. (2014). Generative Adversarial Nets. Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_75","unstructured":"Zhang, X., Jian, W., Chen, Y., and Yang, S. (2020). Deform-GAN: An Unsupervised Learning Model for Deformable Registration. arXiv."},{"key":"ref_76","unstructured":"Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (2017). Good Semi-supervised Learning That Requires a Bad GAN. 
Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_77","first-page":"100004","article-title":"Generative adversarial network: An overview of theory and applications","volume":"1","author":"Aggarwal","year":"2021","journal-title":"Int. J. Inf. Manag. Data Insights"},{"key":"ref_78","doi-asserted-by":"crossref","unstructured":"Zhan, F., Zhu, H., and Lu, S. (2019, January 15\u201320). Spatial Fusion GAN for Image Synthesis. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00377"},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"1882","DOI":"10.1109\/TIP.2021.3049346","article-title":"On Data Augmentation for GAN Training","volume":"30","author":"Tran","year":"2021","journal-title":"IEEE Trans. Image Process."},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Liang, X., Lee, L., Dai, W., and Xing, E.P. (2017, January 22\u201329). Dual Motion GAN for Future-Flow Embedded Video Prediction. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.194"},{"key":"ref_81","unstructured":"Neyshabur, B., Bhojanapalli, S., and Chakrabarti, A. (2017). Stabilizing GAN Training with Multiple Random Projections. arXiv."},{"key":"ref_82","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Li, M., and Yu, J. (2018, January 4\u20137). On the Convergence and Mode Collapse of GAN. Proceedings of the SIGGRAPH Asia 2018 Technical Briefs, Tokyo, Japan.","DOI":"10.1145\/3283254.3283282"},{"key":"ref_83","doi-asserted-by":"crossref","unstructured":"Oeldorf, C., and Spanakis, G. (2019, January 16\u201319). LoGANv2: Conditional Style-Based Logo Generation with Generative Adversarial Networks. 
Proceedings of the 2019 18th IEEE International Conference on Machine Learning And Applications (ICMLA), Boca Raton, FL, USA.","DOI":"10.1109\/ICMLA.2019.00086"},{"key":"ref_84","doi-asserted-by":"crossref","unstructured":"Andreini, P., Bonechi, S., Bianchini, M., Mecocci, A., and Scarselli, F. (2020). Image generation by GAN and style transfer for agar plate image segmentation. Comput. Methods Programs Biomed., 184.","DOI":"10.1016\/j.cmpb.2019.105268"},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Majurski, M., Manescu, P., Padi, S., Schaub, N., Hotaling, N., Simon, C., and Bajcsy, P. (2019, January 16\u201317). Cell Image Segmentation Using Generative Adversarial Networks, Transfer Learning, and Augmentations. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA.","DOI":"10.1109\/CVPRW.2019.00145"},{"key":"ref_86","unstructured":"Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., and Garnett, R. (2015). Spatial Transformer Networks. Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_87","doi-asserted-by":"crossref","unstructured":"Chowdhary, K.R. (2020). Natural Language Processing. Fundamentals of Artificial Intelligence, Springer.","DOI":"10.1007\/978-81-322-3972-7"},{"key":"ref_88","doi-asserted-by":"crossref","first-page":"106126","DOI":"10.1016\/j.engappai.2023.106126","article-title":"Vision Transformers in medical computer vision\u2014A contemplative retrospection","volume":"122","author":"Parvaiz","year":"2023","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_89","doi-asserted-by":"crossref","unstructured":"Strudel, R., Garcia, R., Laptev, I., and Schmid, C. (2021, January 10\u201317). Segmenter: Transformer for Semantic Segmentation. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00717"},{"key":"ref_90","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3505244","article-title":"Transformers in Vision: A Survey","volume":"54","author":"Khan","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_91","unstructured":"Xu, M., Dai, W., Liu, C., Gao, X., Lin, W., Qi, G.J., and Xiong, H. (2021). Spatial-Temporal Transformer Networks for Traffic Flow Forecasting. arXiv."},{"key":"ref_92","doi-asserted-by":"crossref","unstructured":"Giuliari, F., Hasan, I., Cristani, M., and Galasso, F. (2021, January 10\u201315). Transformer Networks for Trajectory Forecasting. Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy.","DOI":"10.1109\/ICPR48806.2021.9412190"},{"key":"ref_93","unstructured":"Dwivedi, V.P., and Bresson, X. (2020). A Generalization of Transformer Networks to Graphs. arXiv."},{"key":"ref_94","unstructured":"Han, K., Wang, Y., Chen, H., Chen, X., Guo, J., Liu, Z., Tang, Y., Xiao, A., Xu, C., and Xu, Y. (2020). A Survey on Visual Transformer. arXiv."},{"key":"ref_95","doi-asserted-by":"crossref","unstructured":"Ferrari, V., Hebert, M., Sminchisescu, C., and Weiss, Y. (2018, January 8\u201314). Progressive Neural Architecture Search. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01249-6"},{"key":"ref_96","first-page":"1","article-title":"A comprehensive survey of neural architecture search: Challenges and solutions","volume":"54","author":"Ren","year":"2021","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_97","unstructured":"White, C., Safari, M., Sukthanker, R., Ru, B., Elsken, T., Zela, A., Dey, D., and Hutter, F. (2023). Neural Architecture Search: Insights from 1000 Papers. arXiv."},{"key":"ref_98","unstructured":"Zoph, B., and Le, Q.V. (2016). 
Neural Architecture Search with Reinforcement Learning. arXiv."},{"key":"ref_99","unstructured":"Meila, M., and Zhang, T. (2021, January 18\u201324). Neural Architecture Search without Training. Proceedings of the 38th International Conference on Machine Learning, PMLR, Online. Proceedings of Machine Learning Research."},{"key":"ref_100","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3473330","article-title":"Weight-sharing neural architecture search: A battle to shrink the optimization gap","volume":"54","author":"Xie","year":"2021","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_101","unstructured":"Real, E., Aggarwal, A., Huang, Y., and Le, Q.V. (2018). Regularized Evolution for Image Classifier Architecture Search. arXiv."},{"key":"ref_102","unstructured":"Real, E., Moore, S., Selle, A., Saxena, S., Leon-Suematsu, Y.I., Le, Q.V., and Kurakin, A. (2017). Large-Scale Evolution of Image Classifiers. arXiv."},{"key":"ref_103","doi-asserted-by":"crossref","unstructured":"White, C., Neiswanger, W., and Savani, Y. (2021, January 2\u20139). Bananas: Bayesian optimization with neural architectures for neural architecture search. Proceedings of the AAAI Conference on Artificial Intelligence, Online.","DOI":"10.1609\/aaai.v35i12.17233"},{"key":"ref_104","doi-asserted-by":"crossref","first-page":"44247","DOI":"10.1109\/ACCESS.2019.2908991","article-title":"NAS-Unet: Neural Architecture Search for Medical Image Segmentation","volume":"7","author":"Weng","year":"2019","journal-title":"IEEE Access"},{"key":"ref_105","doi-asserted-by":"crossref","unstructured":"Liu, C., Chen, L.C., Schroff, F., Adam, H., Hua, W., Yuille, A.L., and Fei-Fei, L. (2019, January 15\u201320). Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00017"},{"key":"ref_106","doi-asserted-by":"crossref","first-page":"107968","DOI":"10.1016\/j.knosys.2021.107968","article-title":"Self-attention neural architecture search for semantic image segmentation","volume":"239","author":"Fan","year":"2022","journal-title":"Knowl.-Based Syst."},{"key":"ref_107","doi-asserted-by":"crossref","unstructured":"Zhang, X., Xu, H., Mo, H., Tan, J., Yang, C., Wang, L., and Ren, W. (2021, January 20\u201325). DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01374"},{"key":"ref_108","unstructured":"Liu, H., Simonyan, K., and Yang, Y. (2018). DARTS: Differentiable Architecture Search. arXiv."},{"key":"ref_109","doi-asserted-by":"crossref","unstructured":"Shaw, A., Hunter, D., Iandola, F., and Sidhu, S. (2019, January 27\u201328). SqueezeNAS: Fast Neural Architecture Search for Faster Semantic Segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV) Workshops, Seoul, Republic of Korea.","DOI":"10.1109\/ICCVW.2019.00251"},{"key":"ref_110","unstructured":"Yuan, B., and Zhao, D. (2023). A Survey on Continual Semantic Segmentation: Theory, Challenge, Method and Application. arXiv."},{"key":"ref_111","unstructured":"Wu, W., Zhao, Y., Li, Z., Shan, L., Zhou, H., and Shou, M.Z. (2023). Continual Learning for Image Segmentation with Dynamic Query. arXiv."},{"key":"ref_112","unstructured":"Gonz\u00e1lez, C., Sakas, G., and Mukhopadhyay, A. (2020). What is Wrong with Continual Learning in Medical Image Segmentation?. 
arXiv."},{"key":"ref_113","doi-asserted-by":"crossref","first-page":"103795","DOI":"10.1016\/j.nicl.2025.103795","article-title":"Mitigating catastrophic forgetting in Multiple sclerosis lesion segmentation using elastic weight consolidation","volume":"46","author":"Valverde","year":"2025","journal-title":"NeuroImage Clin."},{"key":"ref_114","doi-asserted-by":"crossref","unstructured":"Douillard, A., Chen, Y., Dapogny, A., and Cord, M. (2021, January 20\u201325). PLOP: Learning without Forgetting for Continual Semantic Segmentation. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00403"},{"key":"ref_115","doi-asserted-by":"crossref","first-page":"4557","DOI":"10.1007\/s10462-022-10294-2","article-title":"Incremental learning with neural networks for computer vision: A survey","volume":"56","author":"Liu","year":"2022","journal-title":"Artif. Intell. Rev."},{"key":"ref_116","doi-asserted-by":"crossref","unstructured":"Tian, M., Yang, Q., and Gao, Y. (2022, January 23\u201327). Multi-scale Multi-task Distillation for Incremental 3D Medical Image Segmentation. Proceedings of the Computer Vision\u2014ECCV 2022 Workshops, Tel Aviv, Israel. Proceedings, Part III.","DOI":"10.1007\/978-3-031-25066-8_20"},{"key":"ref_117","doi-asserted-by":"crossref","unstructured":"Michieli, U., and Zanuttigh, P. (2021). Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations. arXiv.","DOI":"10.1109\/CVPR46437.2021.00117"},{"key":"ref_118","doi-asserted-by":"crossref","unstructured":"Maracani, A., Michieli, U., Toldo, M., and Zanuttigh, P. (2021). RECALL: Replay-based Continual Learning in Semantic Segmentation. 
arXiv.","DOI":"10.1109\/ICCV48922.2021.00694"},{"key":"ref_119","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The PASCAL Visual Object Classes (VOC) Challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_120","doi-asserted-by":"crossref","unstructured":"Mottaghi, R., Chen, X., Liu, X., Cho, N.G., Lee, S.W., Fidler, S., Urtasun, R., and Yuille, A. (2014, January 23\u201328). The Role of Context for Object Detection and Semantic Segmentation in the Wild. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.119"},{"key":"ref_121","unstructured":"Lepelaars, C. (2025, May 14). CamVid (Cambridge-Driving Labeled Video Database). Available online: https:\/\/www.kaggle.com\/datasets\/carlolepelaars\/camvid."},{"key":"ref_122","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, January 16\u201321). Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"ref_123","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1177\/0278364913491297","article-title":"Vision meets Robotics: The KITTI Dataset","volume":"32","author":"Geiger","year":"2013","journal-title":"Int. J. Robot. Res. (IJRR)"},{"key":"ref_124","doi-asserted-by":"crossref","unstructured":"Fritsch, J., Kuehnl, T., and Geiger, A. (2013, January 6\u20139). A New Performance Measure and Evaluation Benchmark for Road Detection Algorithms. Proceedings of the International Conference on Intelligent Transportation Systems (ITSC), The Hague, The Netherlands.","DOI":"10.1109\/ITSC.2013.6728473"},{"key":"ref_125","doi-asserted-by":"crossref","unstructured":"Menze, M., and Geiger, A. (2015, January 7\u201312). 
Object Scene Flow for Autonomous Vehicles. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298925"},{"key":"ref_126","doi-asserted-by":"crossref","unstructured":"Daniilidis, K., Maragos, P., and Paragios, N. (2010, January 5\u201311). Object Segmentation by Long Term Analysis of Point Trajectories. Proceedings of the Computer Vision\u2014ECCV 2010, Heraklion, Crete, Greece.","DOI":"10.1007\/978-3-642-15561-1"},{"key":"ref_127","doi-asserted-by":"crossref","unstructured":"Chen, X., Mottaghi, R., Liu, X., Fidler, S., Urtasun, R., and Yuille, A. (2014, January 23\u201328). Detect What You Can: Detecting and Representing Objects using Holistic Models and Body Parts. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.254"},{"key":"ref_128","unstructured":"Fleet, D., Pajdla, T., Schiele, B., and Tuytelaars, T. (2014, January 6\u201312). Microsoft COCO: Common Objects in Context. Proceedings of the 13th European Conference, Zurich, Switzerland."},{"key":"ref_129","doi-asserted-by":"crossref","unstructured":"Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., and Torralba, A. (2017, January 21\u201326). Scene Parsing through ADE20K Dataset. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.544"},{"key":"ref_130","unstructured":"Martin, D., Fowlkes, C., Tal, D., and Malik, J. (2001, January 7\u201314). A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics. Proceedings of the Eighth IEEE International Conference on Computer Vision, ICCV 2001, Vancouver, BC, Canada."},{"key":"ref_131","doi-asserted-by":"crossref","unstructured":"Hariharan, B., Arbelaez, P., Bourdev, L., Maji, S., and Malik, J. (2011, January 6\u201313). 
Semantic Contours from Inverse Detectors. Proceedings of the International Conference on Computer Vision (ICCV), Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126343"},{"key":"ref_132","doi-asserted-by":"crossref","unstructured":"Xia, J., Yokoya, N., Adriano, B., and Broni-Bediako, C. (2022). OpenEarthMap: A Benchmark Dataset for Global High-Resolution Land Cover Mapping. arXiv.","DOI":"10.1109\/WACV56688.2023.00619"},{"key":"ref_133","doi-asserted-by":"crossref","unstructured":"Hassner, T., and Liu, C. (2016). SIFT Flow: Dense Correspondence Across Scenes and Its Applications. Dense Image Correspondences for Computer Vision, Springer International Publishing.","DOI":"10.1007\/978-3-319-23048-1"},{"key":"ref_134","doi-asserted-by":"crossref","unstructured":"Gould, S., Fulton, R., and Koller, D. (October, January 29). Decomposing a Scene into Geometric and Semantically Consistent Regions. Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan.","DOI":"10.1109\/ICCV.2009.5459211"},{"key":"ref_135","doi-asserted-by":"crossref","first-page":"1993","DOI":"10.1109\/TMI.2014.2377694","article-title":"The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS)","volume":"34","author":"Menze","year":"2015","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_136","doi-asserted-by":"crossref","first-page":"101950","DOI":"10.1016\/j.media.2020.101950","article-title":"CHAOS Challenge-combined (CT-MR) healthy abdominal organ segmentation","volume":"69","author":"Kavur","year":"2021","journal-title":"Med. Image Anal."},{"key":"ref_137","unstructured":"Codella, N., Rotemberg, V., Tschandl, P., Celebi, M.E., Dusza, S., Gutman, D., Helba, B., Kalloo, A., Liopyris, K., and Marchetti, M. (2019). Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC). 
arXiv."},{"key":"ref_138","doi-asserted-by":"crossref","unstructured":"Demir, I., Koperski, K., Lindenbaum, D., Pang, G., Huang, J., Basu, S., Hughes, F., Tuia, D., and Raskar, R. (2018, January 18\u201322). DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00031"},{"key":"ref_139","unstructured":"Etten, A.V., Lindenbaum, D., and Bacastow, T.M. (2019). SpaceNet: A Remote Sensing Dataset and Challenge Series. arXiv."},{"key":"ref_140","doi-asserted-by":"crossref","unstructured":"Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., and Schmid, C. (2012, January 7\u201313). Indoor Segmentation and Support Inference from RGBD Images. Proceedings of the 12th European Conference on Computer Vision (ECCV 2012), Florence, Italy.","DOI":"10.1007\/978-3-642-33709-3"},{"key":"ref_141","doi-asserted-by":"crossref","unstructured":"Song, S., Lichtenberg, S.P., and Xiao, J. (2015, January 7\u201312). SUN RGB-D: A RGB-D scene understanding benchmark suite. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298655"},{"key":"ref_142","doi-asserted-by":"crossref","unstructured":"Dai, A., Chang, A.X., Savva, M., Halber, M., Funkhouser, T.A., and Nie\u00dfner, M. (2017). ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. arXiv.","DOI":"10.1109\/CVPR.2017.261"},{"key":"ref_143","unstructured":"Armeni, I., Sax, A., Zamir, A.R., and Savarese, S. (2017). Joint 2D-3D-Semantic Data for Indoor Scene Understanding. arXiv."},{"key":"ref_144","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., and Antiga, L. (2019, January 8\u201314). PyTorch: An imperative style, high-performance deep learning library. 
Proceedings of the NIPS\u201919: 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_145","unstructured":"Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., and Devin, M. (2025, May 14). TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Available online: https:\/\/www.tensorflow.org."},{"key":"ref_146","unstructured":"Micikevicius, P., Narang, S., Alben, J., Diamos, G., Elsen, E., Garcia, D., Ginsburg, B., Houston, M., Kuchaiev, O., and Venkatesh, G. (2018). Mixed precision training. arXiv."},{"key":"ref_147","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_148","doi-asserted-by":"crossref","unstructured":"Matos-Carvalho, J.P., Correia, S.D., and Tomic, S. (2023, January 20\u201322). Sensitivity Analysis of LSTM Networks for Fall Detection Wearable Sensors. Proceedings of the 2023 6th Conference on Cloud and Internet of Things (CIoT), Lisbon, Portugal.","DOI":"10.1109\/CIoT57267.2023.10084906"},{"key":"ref_149","doi-asserted-by":"crossref","unstructured":"Correia, S.D., Matos-Carvalho, J.P., and Tomic, S. (2024, January 4\u20136). Quantization with Gate Disclosure for Embedded Artificial Intelligence Applied to Fall Detection. Proceedings of the GoodIT \u201924 2024 International Conference on Information Technology for Social Good, Bremen, Germany.","DOI":"10.1145\/3677525.3678644"},{"key":"ref_150","unstructured":"Chatterjee, P., and Esposito, M. (2023). Chapter 9\u2014Artificial intelligence for chest imaging against COVID-19: An insight into image segmentation methods. Artificial Intelligence in Healthcare and COVID-19, Academic Press. 
Intelligent Data-Centric Systems."},{"key":"ref_151","doi-asserted-by":"crossref","first-page":"2032924","DOI":"10.1080\/08839514.2022.2032924","article-title":"A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D Images","volume":"36","author":"Ulku","year":"2022","journal-title":"Appl. Artif. Intell."},{"key":"ref_152","doi-asserted-by":"crossref","unstructured":"Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., and Savarese, S. (2019, January 15\u201320). Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_153","doi-asserted-by":"crossref","unstructured":"Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., and Sang, N. (2018, January 18\u201323). Learning a discriminative feature network for semantic segmentation. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00199"},{"key":"ref_154","doi-asserted-by":"crossref","unstructured":"Setiawan, A.W. (2020, January 17\u201318). Image Segmentation Metrics in Skin Lesion: Accuracy, Sensitivity, Specificity, Dice Coefficient, Jaccard Index, and Matthews Correlation Coefficient. Proceedings of the 2020 International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM), Surabaya, Indonesia.","DOI":"10.1109\/CENIM51130.2020.9297970"},{"key":"ref_155","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2014). Fully Convolutional Networks for Semantic Segmentation. arXiv.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_156","doi-asserted-by":"crossref","unstructured":"Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., and Torr, P.H.S. (2015). Conditional Random Fields as Recurrent Neural Networks. 
arXiv.","DOI":"10.1109\/ICCV.2015.179"},{"key":"ref_157","doi-asserted-by":"crossref","unstructured":"Dai, J., He, K., and Sun, J. (2015). BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation. arXiv.","DOI":"10.1109\/ICCV.2015.191"},{"key":"ref_158","doi-asserted-by":"crossref","unstructured":"Lin, G., Shen, C., Reid, I.D., and van den Hengel, A. (2015). Efficient piecewise training of deep structured models for semantic segmentation. arXiv.","DOI":"10.1109\/CVPR.2016.348"},{"key":"ref_159","doi-asserted-by":"crossref","unstructured":"Liu, Z., Li, X., Luo, P., Loy, C.C., and Tang, X. (2015). Semantic Image Segmentation via Deep Parsing Network. arXiv.","DOI":"10.1109\/ICCV.2015.162"},{"key":"ref_160","unstructured":"Chen, L., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (2016). DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. arXiv."},{"key":"ref_161","unstructured":"Wu, Z., Shen, C., and van den Hengel, A. (2016). Wider or Deeper: Revisiting the ResNet Model for Visual Recognition. arXiv."},{"key":"ref_162","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2016). Pyramid Scene Parsing Network. arXiv.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_163","doi-asserted-by":"crossref","unstructured":"Peng, C., Zhang, X., Yu, G., Luo, G., and Sun, J. (2017, January 21\u201326). Large Kernel Matters\u2014Improve Semantic Segmentation by Global Convolutional Network. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.189"},{"key":"ref_164","doi-asserted-by":"crossref","unstructured":"Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., and Sang, N. (2018). BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation. 
arXiv.","DOI":"10.1007\/978-3-030-01261-8_20"},{"key":"ref_165","doi-asserted-by":"crossref","unstructured":"Li, Y., Song, L., Chen, Y., Li, Z., Zhang, X., Wang, X., and Sun, J. (2020). Learning Dynamic Routing for Semantic Segmentation. arXiv.","DOI":"10.1109\/CVPR42600.2020.00858"},{"key":"ref_166","unstructured":"Wang, W., and Howard, A. (2021). MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context. arXiv."},{"key":"ref_167","unstructured":"Jeevan, P., Viswanathan, K., and Sethi, A. (2022). WaveMix-Lite: A Resource-efficient Neural Network for Image Analysis. arXiv."},{"key":"ref_168","doi-asserted-by":"crossref","unstructured":"Wu, J., Kuang, H., Lu, Q., Lin, Z., Shi, Q., Liu, X., and Zhu, X. (2022). M-FasterSeg: An Efficient Semantic Segmentation Network Based on Neural Architecture Search. arXiv.","DOI":"10.1016\/j.engappai.2022.104962"},{"key":"ref_169","doi-asserted-by":"crossref","unstructured":"Bhardwaj, K., Cheng, H.P., Priyadarshi, S., and Li, Z. (2023). ZiCo-BC: A Bias Corrected Zero-Shot NAS for Vision Tasks. arXiv.","DOI":"10.1109\/ICCVW60793.2023.00145"},{"key":"ref_170","doi-asserted-by":"crossref","unstructured":"Jeong, J., Yu, J., Park, G., Han, D., and Yoo, Y. (2023). GeNAS: Neural Architecture Search with Better Generalization. arXiv.","DOI":"10.24963\/ijcai.2023\/101"},{"key":"ref_171","doi-asserted-by":"crossref","unstructured":"Xiong, Z., Amein, M., Therrien, O., Gross, W.J., and Meyer, B.H. (2023). FMAS: Fast Multi-Objective SuperNet Architecture Search for Semantic Segmentation. arXiv.","DOI":"10.1109\/CASES55004.2022.00024"},{"key":"ref_172","unstructured":"Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., and Clark, J. (2021). Learning Transferable Visual Models From Natural Language Supervision. arXiv."},{"key":"ref_173","unstructured":"Liu, H., Li, C., Wu, Q., and Lee, Y.J. (2023). Visual Instruction Tuning. 
arXiv."},{"key":"ref_174","doi-asserted-by":"crossref","unstructured":"Xu, M., Zhang, Z., Wei, F., Lin, Y., Cao, Y., Hu, H., and Bai, X. (2022). A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model. arXiv.","DOI":"10.1007\/978-3-031-19818-2_42"},{"key":"ref_175","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., and Lo, W.Y. (2023). Segment Anything. arXiv.","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"ref_176","doi-asserted-by":"crossref","unstructured":"Dong, X., Bao, J., Zheng, Y., Zhang, T., Chen, D., Yang, H., Zeng, M., Zhang, W., Yuan, L., and Chen, D. (2023). MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining. arXiv.","DOI":"10.1109\/CVPR52729.2023.01058"},{"key":"ref_177","doi-asserted-by":"crossref","unstructured":"Liu, Q., Wen, Y., Han, J., Xu, C., Xu, H., and Liang, X. (2022). Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding. arXiv.","DOI":"10.1007\/978-3-031-20044-1_16"},{"key":"ref_178","unstructured":"Baranchuk, D., Rubachev, I., Voynov, A., Khrulkov, V., and Babenko, A. (2022). Label-Efficient Semantic Segmentation with Diffusion Models. arXiv."},{"key":"ref_179","unstructured":"Zou, X., Yang, J., Zhang, H., Li, F., Li, L., Wang, J., Wang, L., Gao, J., and Lee, Y.J. (2023). Segment Everything Everywhere All at Once. 
arXiv."}],"container-title":["Eng"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2673-4117\/6\/7\/165\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:12:21Z","timestamp":1760033541000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2673-4117\/6\/7\/165"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,18]]},"references-count":179,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2025,7]]}},"alternative-id":["eng6070165"],"URL":"https:\/\/doi.org\/10.3390\/eng6070165","relation":{},"ISSN":["2673-4117"],"issn-type":[{"type":"electronic","value":"2673-4117"}],"subject":[],"published":{"date-parts":[[2025,7,18]]}}}