{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T03:18:56Z","timestamp":1775618336404,"version":"3.50.1"},"reference-count":186,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2019,8,30]],"date-time":"2019-08-30T00:00:00Z","timestamp":1567123200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001843","name":"SERB","doi-asserted-by":"crossref","award":["SB\/S3\/EECE\/054\/2016"],"award-info":[{"award-number":["SB\/S3\/EECE\/054\/2016"]}],"id":[{"id":"10.13039\/501100001843","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Centre for Microprocessor Application for Training Education and Research, CSE Department, Jadavpur University"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2020,7,31]]},"abstract":"<jats:p>The machine learning community has been overwhelmed by a plethora of deep learning-based approaches. Many challenging computer vision tasks, such as detection, localization, recognition, and segmentation of objects in an unconstrained environment, are being efficiently addressed by various types of deep neural networks, such as convolutional neural networks, recurrent networks, adversarial networks, and autoencoders. Although there have been plenty of analytical studies regarding the object detection or recognition domain, many new deep learning techniques have surfaced with respect to image segmentation techniques. This article approaches these various deep learning techniques of image segmentation from an analytical perspective. The main goal of this work is to provide an intuitive understanding of the major techniques that have made a significant contribution to the image segmentation domain. 
Starting from some of the traditional image segmentation approaches, the article progresses by describing the effect that deep learning has had on the image segmentation domain. Thereafter, most of the major segmentation algorithms have been logically categorized with paragraphs dedicated to their unique contribution. With an ample amount of intuitive explanations, the reader is expected to have an improved ability to visualize the internal dynamics of these processes.<\/jats:p>","DOI":"10.1145\/3329784","type":"journal-article","created":{"date-parts":[[2019,9,3]],"date-time":"2019-09-03T12:47:00Z","timestamp":1567514820000},"page":"1-35","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":354,"title":["Understanding Deep Learning Techniques for Image Segmentation"],"prefix":"10.1145","volume":"52","author":[{"given":"Swarnendu","family":"Ghosh","sequence":"first","affiliation":[{"name":"Jadavpur University, Kolkata, WB, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2426-9915","authenticated-orcid":false,"given":"Nibaran","family":"Das","sequence":"additional","affiliation":[{"name":"Jadavpur University, Kolkata, WB, India"}]},{"given":"Ishita","family":"Das","sequence":"additional","affiliation":[{"name":"Jadavpur University, Kolkata, WB, India"}]},{"given":"Ujjwal","family":"Maulik","sequence":"additional","affiliation":[{"name":"Jadavpur University, Kolkata, WB, India"}]}],"member":"320","published-online":{"date-parts":[[2019,8,30]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.120"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015706.1015764"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jvcir.2017.02.010"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/IGARSS.2016.7730798"},{"key":"e_1_2_2_5_1","volume-title":"Proceedings of the 2001 International Conference on Image 
Processing","volume":"2","author":"Albiol Alberto","unstructured":"Alberto Albiol, Luis Torres, and Edward J. Delp. 2001. An unsupervised color image segmentation algorithm for face detection applications. In Proceedings of the 2001 International Conference on Image Processing, Vol. 2. IEEE, Los Alamitos, CA, 681--684."},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0177544"},{"key":"e_1_2_2_7_1","volume-title":"A Quality Analysis of OpenStreetMap Data. Master\u2019s Thesis","author":"Ather Aamer","unstructured":"Aamer Ather. 2009. A Quality Analysis of OpenStreetMap Data. Master\u2019s Thesis. University College London, London, UK."},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2644615"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.14358\/PERS.72.6.687"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/938978.939161"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","unstructured":"Yoshua Bengio Pascal Lamblin Dan Popovici and Hugo Larochelle. 2007. Greedy layer-wise training of deep networks. In Advances in Neural Information Processing Systems. 153--160.","DOI":"10.5555\/2976456.2976476"},{"key":"e_1_2_2_12_1","first-page":"1996","volume-title":"Simp\u00f3sio Brasileiro de Sensoriamento Remoto 8","author":"Bins L. Sant\u2019Anna","year":"1996","unstructured":"L. Sant\u2019Anna Bins, L. M. Garcia Fonseca, G. J. Erthal, and F. Mitsuo Ii. 1996. Satellite imagery segmentation: A region growing approach. Simp\u00f3sio Brasileiro de Sensoriamento Remoto 8, 1996 (1996), 677--680."},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2014.2383320"},{"key":"e_1_2_2_14_1","unstructured":"Ali Borji Ming-Ming Cheng Qibin Hou Huaizu Jiang and Jia Li. 2014. Salient object detection: A survey. 
arXiv:1411.5878."},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2015.2487833"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2008.04.005"},{"key":"e_1_2_2_17_1","first-page":"162","article-title":"Method and system for compositing images to produce a cropped image","volume":"7","author":"Cahill Nathan D.","year":"2007","unstructured":"Nathan D. Cahill and Lawrence A. Ray. 2007. Method and system for compositing images to produce a cropped image. US Patent 7,162,102.","journal-title":"US Patent"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.dib.2017.04.004"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.477"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CBMS.2007.48"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/3157382.3157438"},{"key":"e_1_2_2_22_1","volume-title":"Yuille","author":"Chen Liang-Chieh","year":"2014","unstructured":"Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. 2014. Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv:1412.7062."},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2699184"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.179"},{"key":"e_1_2_2_25_1","unstructured":"Liang-Chieh Chen George Papandreou Florian Schroff and Hartwig Adam. 2017. Rethinking atrous convolution for semantic image segmentation. arXiv:1706.05587."},{"key":"e_1_2_2_26_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3640--3649","author":"Chen Liang-Chieh","unstructured":"Liang-Chieh Chen, Yi Yang, Jiang Wang, Wei Xu, and Alan L. Yuille. 2016. Attention to scale: Scale-aware semantic image segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
3640--3649."},{"key":"e_1_2_2_27_1","doi-asserted-by":"crossref","unstructured":"Liang-Chieh Chen Yukun Zhu George Papandreou Florian Schroff and Hartwig Adam. 2018. Encoder-decoder with atrous separable convolution for semantic image segmentation. arXiv:1802.02611.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/42.511759"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00371-013-0867-4"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2014.2345401"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.compmedimag.2005.10.001"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/794189.794503"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.350"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.191"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.343"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","unstructured":"Jifeng Dai Yi Li Kaiming He and Jian Sun. 2016. R-FCN: Object detection via region-based fully convolutional networks. In Advances in Neural Information Processing Systems. 379--387.","DOI":"10.5555\/3157096.3157139"},{"key":"e_1_2_2_37_1","volume-title":"Recent Developments in Machine Learning and Data Analytics","author":"Das Aritra","unstructured":"Aritra Das, Swarnendu Ghosh, Ritesh Sarkhel, Sandipan Choudhuri, Nibaran Das, and Mita Nasipuri. 2019. Combining multilevel contexts of superpixel using convolutional neural networks to perform natural scene labeling. In Recent Developments in Machine Learning and Data Analytics. 
Springer, 297--306."},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2004.03.003"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2004.01.001"},{"key":"e_1_2_2_40_1","doi-asserted-by":"crossref","unstructured":"Ilke Demir Krzysztof Koperski David Lindenbaum Guan Pang Jing Huang Saikat Basu Forest Hughes Devis Tuia and Ramesh Raskar. 2018. DeepGlobe 2018: A challenge to parse the earth through satellite images. arXiv:1805.06561.","DOI":"10.1109\/CVPRW.2018.00031"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMCB.2010.2045371"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553411"},{"key":"e_1_2_2_43_1","unstructured":"Vincent Dumoulin and Francesco Visin. 2016. A guide to convolution arithmetic for deep learning. arXiv:1603.07285."},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-009-0275-4"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.231"},{"key":"e_1_2_2_46_1","volume-title":"ISPRS 2D Semantic Labeling Contest. Retrieved","author":"International Society for Photogrammetry and Remote Sensing. {n.d.}.","year":"2019","unstructured":"International Society for Photogrammetry and Remote Sensing. {n.d.}. ISPRS 2D Semantic Labeling Contest. 
Retrieved August 1, 2019 from http:\/\/www2.isprs.org\/commissions\/comm3\/wg4\/semantic-labeling.html"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cmpb.2012.03.009"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.5555\/645317.649319"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.5555\/2074226.2074247"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1016\/0031-3203(81)90028-5"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.438"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2016.7532769"},{"key":"e_1_2_2_53_1","doi-asserted-by":"crossref","unstructured":"Alberto Garcia-Garcia Sergio Orts-Escolano Sergiu Oprea Victor Villena-Martinez and Jose Garcia-Rodriguez. 2017. A review on deep learning techniques applied to semantic segmentation. arXiv:1704.06857.","DOI":"10.1016\/j.asoc.2018.05.018"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.5555\/2354409.2354978"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11432-017-9189-6"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","unstructured":"Ross Girshick. 2015. Fast R-CNN. arXiv:1504.08083. 10.1109\/ICCV.2015.169","DOI":"10.1109\/ICCV.2015.169"},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.81"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2009.5459211"},{"key":"e_1_2_2_59_1","unstructured":"Xiao Han. 2017. Automatic liver lesion segmentation using a deep convolutional neural network method. arXiv:1704.07239. https:\/\/competitions.codalab.org\/competitions\/17094."},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126343"},{"key":"e_1_2_2_61_1","volume-title":"Mask R-CNN. 
In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV\u201917)","author":"He Kaiming","year":"2017","unstructured":"Kaiming He, Georgia Gkioxari, Piotr Doll\u00e1r, and Ross Girshick. 2017. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV\u201917). IEEE, Los Alamitos, CA, 2980--2988."},{"key":"e_1_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2389824"},{"key":"e_1_2_2_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_2_64_1","doi-asserted-by":"publisher","DOI":"10.5555\/3045118.3045183"},{"key":"e_1_2_2_65_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2018.10.009"},{"key":"e_1_2_2_66_1","unstructured":"Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167."},{"key":"e_1_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/RBME.2013.2295804"},{"key":"e_1_2_2_68_1","volume-title":"Efros","author":"Isola Phillip","year":"2017","unstructured":"Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. 2017. Image-to-image translation with conditional adversarial networks. arXiv:1611.07004."},{"key":"e_1_2_2_69_1","volume-title":"Altaani","author":"Jassim Firas Ajil","year":"2013","unstructured":"Firas Ajil Jassim and Fawzi H. Altaani. 2013. Hybridization of Otsu method and median filter for color image segmentation. 
arXiv:1305.1052."},{"key":"e_1_2_2_70_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2017.156"},{"key":"e_1_2_2_71_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24078-7_33"},{"key":"e_1_2_2_72_1","doi-asserted-by":"publisher","DOI":"10.5555\/791222.791951"},{"key":"e_1_2_2_73_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2016.10.004"},{"key":"e_1_2_2_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2018.8462533"},{"key":"e_1_2_2_75_1","volume-title":"Intelligent Computing in Signal Processing and Pattern Recognition","author":"Kang Jiayin","unstructured":"Jiayin Kang, Xiao Li, Qingxian Luan, Jinzhu Liu, and Lequan Min. 2006. Dental plaque quantification using cellular neural network-based image segmentation. In Intelligent Computing in Signal Processing and Pattern Recognition. Springer, 797--802."},{"key":"e_1_2_2_76_1","doi-asserted-by":"publisher","DOI":"10.1109\/CINC.2009.246"},{"key":"e_1_2_2_77_1","unstructured":"Kai Kang and Xiaogang Wang. 2014. Fully convolutional neural networks for crowd segmentation. arXiv:1411.4464."},{"key":"e_1_2_2_78_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2014","unstructured":"Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv:1412.6980."},{"key":"e_1_2_2_79_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.98"},{"key":"e_1_2_2_80_1","doi-asserted-by":"publisher","unstructured":"Philipp Kr\u00e4henb\u00fchl and Vladlen Koltun. 2011. Efficient inference in fully connected CRFs with Gaussian edge potentials. In Advances in Neural Information Processing Systems. 109--117.","DOI":"10.5555\/2986459.2986472"},{"key":"e_1_2_2_81_1","doi-asserted-by":"publisher","unstructured":"Bolei Zhou Hang Zhao Xavier Puig Sanja Fidler Adela Barriuso and Antonio Torralba. 2016. Semantic understanding of scenes through the ADE20K dataset. arXiv:1608.05442. 
10.1007\/s11263-018-1140-0","DOI":"10.1007\/s11263-018-1140-0"},{"key":"e_1_2_2_82_1","doi-asserted-by":"publisher","DOI":"10.5555\/2999134.2999257"},{"key":"e_1_2_2_83_1","volume-title":"Proceedings of the Asian Conference on Intelligent Information and Database Systems. 255--265","author":"Alfonso","unstructured":"Alfonso B. Labao and Prospero C. Naval. 2017. Weakly-labelled semantic segmentation of fish objects in underwater videos using a deep residual network. In Proceedings of the Asian Conference on Intelligent Information and Database Systems. 255--265."},{"key":"e_1_2_2_84_1","article-title":"Colour image segmentation a survey","volume":"14","author":"Skarbek W\u0142adys\u0142aw","year":"1994","unstructured":"W\u0142adys\u0142aw Skarbek and Andreas Koschan. 1994. Colour image segmentation a survey. IEEE Transactions on Circuits and Systems for Video Technology 14, 7 (1994).","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"e_1_2_2_85_1","unstructured":"Rodney LaLonde and Ulas Bagci. 2018. Capsules for object segmentation. arXiv:1804.04241."},{"key":"e_1_2_2_86_1","doi-asserted-by":"publisher","DOI":"10.3390\/rs8040329"},{"key":"e_1_2_2_87_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_2_2_88_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2010.969"},{"key":"e_1_2_2_89_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2005.272"},{"key":"e_1_2_2_90_1","doi-asserted-by":"publisher","DOI":"10.5244\/C.29.109"},{"key":"e_1_2_2_91_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2007.1177"},{"key":"e_1_2_2_92_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.684"},{"key":"e_1_2_2_93_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.43"},{"key":"e_1_2_2_94_1","doi-asserted-by":"crossref","unstructured":"Yi Li Haozhi Qi Jifeng Dai Xiangyang Ji and Yichen Wei. 2016. Fully convolutional instance-aware semantic segmentation. 
arXiv:1611.07709.","DOI":"10.1109\/CVPR.2017.472"},{"key":"e_1_2_2_95_1","doi-asserted-by":"publisher","DOI":"10.1109\/83.392347"},{"key":"e_1_2_2_96_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.549"},{"key":"e_1_2_2_97_1","unstructured":"Min Lin Qiang Chen and Shuicheng Yan. 2013. Network in network. arXiv:1312.4400."},{"key":"e_1_2_2_98_1","volume-title":"Proceedings of the European Conference on Computer Vision. 740--755","author":"Lin Tsung-Yi","unstructured":"Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll\u00e1r, and C. Lawrence Zitnick. 2014. Microsoft COCO: Common objects in context. In Proceedings of the European Conference on Computer Vision. 740--755."},{"key":"e_1_2_2_99_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2017.07.005"},{"key":"e_1_2_2_100_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICB.2016.7550055"},{"key":"e_1_2_2_101_1","doi-asserted-by":"publisher","DOI":"10.5555\/3294771.3294916"},{"key":"e_1_2_2_102_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2006.04.045"},{"key":"e_1_2_2_103_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.162"},{"key":"e_1_2_2_104_1","doi-asserted-by":"publisher","DOI":"10.1109\/TITB.2010.2091279"},{"key":"e_1_2_2_105_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_2_2_106_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISBI.2018.8363708"},{"key":"e_1_2_2_107_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBME.2017.2697916"},{"key":"e_1_2_2_108_1","unstructured":"Pauline Luc Camille Couprie Soumith Chintala and Jakob Verbeek. 2016. Semantic segmentation using adversarial networks. 
arXiv:1611.08408."},{"key":"e_1_2_2_109_1","doi-asserted-by":"publisher","DOI":"10.1109\/IGARSS.2017.8127684"},{"key":"e_1_2_2_110_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2016.07.009"},{"key":"e_1_2_2_111_1","volume-title":"Proceedings of the 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom\u201916)","author":"Mandal Rupesh","year":"2016","unstructured":"Rupesh Mandal and Nupur Choudhury. 2016. Automatic video surveillance for theft detection in ATM machines: An enhanced approach. In Proceedings of the 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom\u201916). IEEE, Los Alamitos, CA, 2821--2826."},{"key":"e_1_2_2_112_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00071"},{"key":"e_1_2_2_113_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_35"},{"key":"e_1_2_2_114_1","volume-title":"Proceedings of the 8th International Conference on Computer Vision","volume":"2","author":"Martin D.","unstructured":"D. Martin, C. Fowlkes, D. Tal, and J. Malik. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the 8th International Conference on Computer Vision, Vol. 2. 416--423."},{"key":"e_1_2_2_115_1","doi-asserted-by":"publisher","DOI":"10.5555\/2029556.2029563"},{"key":"e_1_2_2_116_1","doi-asserted-by":"publisher","unstructured":"L. R. Medsker and L. C. Jain. 2001. Recurrent Neural Networks: Design and Applications. 
CRC Press Boca Raton FL.","DOI":"10.5555\/553011"},{"key":"e_1_2_2_117_1","doi-asserted-by":"publisher","DOI":"10.1016\/0031-3203(87)90069-0"},{"key":"e_1_2_2_118_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2014.2377694"},{"key":"e_1_2_2_119_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.544"},{"key":"e_1_2_2_120_1","doi-asserted-by":"publisher","DOI":"10.1145\/266180.266390"},{"key":"e_1_2_2_121_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cmpb.2010.04.007"},{"key":"e_1_2_2_122_1","volume-title":"Medical Imaging 2018: Biomedical Applications in Molecular, Structural, and Functional Imaging","author":"Moriya Takayasu","unstructured":"Takayasu Moriya, Holger R. Roth, Shota Nakamura, Hirohisa Oda, Kai Nagara, Masahiro Oda, and Kensaku Mori. 2018. Unsupervised segmentation of 3D medical images based on clustering and deep representation learning. In Medical Imaging 2018: Biomedical Applications in Molecular, Structural, and Functional Imaging, Vol. 10578. International Society for Optics and Photonics, Bellingham, WA, 1057820."},{"key":"e_1_2_2_123_1","volume-title":"Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR\u201918)","author":"Mundhenk T. Nathan","unstructured":"T. Nathan Mundhenk, Daniel Ho, and Barry Y. Chen. 2018. Improvements to context based self-supervised learning. 
In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR\u201918)."},{"key":"e_1_2_2_124_1","doi-asserted-by":"publisher","DOI":"10.5555\/3104322.3104425"},{"key":"e_1_2_2_125_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2018.00201"},{"key":"e_1_2_2_126_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.534"},{"key":"e_1_2_2_127_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.178"},{"key":"e_1_2_2_128_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46466-4_5"},{"key":"e_1_2_2_129_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0262-8856(00)00097-4"},{"key":"e_1_2_2_130_1","volume-title":"Proceedings of the 11th Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP\u201918)","author":"Pal Anisha","unstructured":"Anisha Pal, Shourya Jaiswal, Swarnendu Ghosh, Nibaran Das, and Mita Nasipuri. {n.d.}. SegFast: A faster SqueezeNet based semantic image segmentation technique using depth-wise separable convolutions. In Proceedings of the 11th Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP\u201918). ACM, New York, NY, 7."},{"key":"e_1_2_2_131_1","doi-asserted-by":"publisher","DOI":"10.1016\/0031-3203(93)90135-J"},{"key":"e_1_2_2_132_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.203"},{"key":"e_1_2_2_133_1","unstructured":"Adam Paszke Abhishek Chaurasia Sangpil Kim and Eugenio Culurciello. 2016. ENet: A deep neural network architecture for real-time semantic segmentation. arXiv:1606.02147."},{"key":"e_1_2_2_134_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2536--2544","author":"Pathak Deepak","unstructured":"Deepak Pathak, Philipp Krahenbuhl, Jeff Donahue, Trevor Darrell, and Alexei A. Efros. 2016. Context encoders: Feature learning by inpainting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
2536--2544."},{"key":"e_1_2_2_135_1","doi-asserted-by":"crossref","unstructured":"Chao Peng Xiangyu Zhang Gang Yu Guiming Luo and Jian Sun. 2017. Large kernel matters\u2014Improve semantic segmentation by global convolutional network. arXiv:1703.02719.","DOI":"10.1109\/CVPR.2017.189"},{"key":"e_1_2_2_136_1","doi-asserted-by":"publisher","unstructured":"Pedro O. Pinheiro Ronan Collobert and Piotr Doll\u00e1r. 2015. Learning to segment object candidates. In Advances in Neural Information Processing Systems. 1990--1998.","DOI":"10.5555\/2969442.2969462"},{"key":"e_1_2_2_137_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_5"},{"key":"e_1_2_2_138_1","unstructured":"Jordi Pont-Tuset Federico Perazzi Sergi Caelles Pablo Arbel\u00e1ez Alex Sorkine-Hornung and Luc Van Gool. 2017. The 2017 Davis Challenge on video object segmentation. arXiv:1704.00675."},{"key":"e_1_2_2_139_1","volume-title":"Proceedings of the IEEE International Symposium on Biomedical Imaging (ISBI\u201918)","author":"Porwal Prasanna","year":"2018","unstructured":"Prasanna Porwal, Samiksha Pachade, Ravi Kamble, Manesh Kokare, Girish Deshmukh, Vivek Sahasrabuddhe, et al. 2018. Diabetic retinopathy: Segmentation and grading challenge workshop. In Proceedings of the IEEE International Symposium on Biomedical Imaging (ISBI\u201918). https:\/\/idrid.grand-challenge.org\/organizers\/"},{"key":"e_1_2_2_140_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2017.2689724"},{"key":"e_1_2_2_141_1","doi-asserted-by":"crossref","unstructured":"P. Radau Y. Lu K. Connelly G. Paul A. Dick and G. Wright. 2009. Evaluation framework for algorithms segmenting short axis cardiac MRI. MIDAS Journal 49 (2009).","DOI":"10.54294\/g80ruo"},{"key":"e_1_2_2_142_1","volume-title":"Black","author":"Ranjan Anurag","year":"2018","unstructured":"Anurag Ranjan, Varun Jampani, Kihwan Kim, Deqing Sun, Jonas Wulff, and Michael J. Black. 2018. 
Adversarial collaboration: Joint unsupervised learning of depth, camera motion, optical flow and motion segmentation. arXiv:1805.09806."},{"key":"e_1_2_2_143_1","unstructured":"Mahdyar Ravanbakhsh Moin Nabi Hossein Mousavi Enver Sangineto and Nicu Sebe. 2016. Plug-and-play CNN for crowd motion analysis: An application in abnormal event detection. arXiv:1610.00307."},{"key":"e_1_2_2_144_1","volume-title":"Zemel","author":"Ren Mengye","year":"2017","unstructured":"Mengye Ren and Richard S. Zemel. 2017. End-to-end instance segmentation with recurrent attention. arXiv:1605.09410."},{"key":"e_1_2_2_145_1","doi-asserted-by":"publisher","unstructured":"Shaoqing Ren Kaiming He Ross Girshick and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems. 91--99.","DOI":"10.5555\/2969239.2969250"},{"key":"e_1_2_2_146_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46466-4_19"},{"key":"e_1_2_2_147_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_2_2_148_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3234--3243","author":"Ros German","unstructured":"German Ros, Laura Sellart, Joanna Materzynska, David Vazquez, and Antonio M. Lopez. 2016. The Synthia dataset: A large collection of synthetic images for semantic segmentation of urban scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
3234--3243."},{"key":"e_1_2_2_149_1","doi-asserted-by":"publisher","DOI":"10.2514\/6.2016-5539"},{"key":"e_1_2_2_150_1","doi-asserted-by":"publisher","DOI":"10.1038\/323533a0"},{"key":"e_1_2_2_151_1","doi-asserted-by":"publisher","DOI":"10.5555\/3294996.3295142"},{"key":"e_1_2_2_152_1","first-page":"250","article-title":"Edge detection techniques for image segmentation\u2014A survey of soft computing approaches","volume":"1","author":"Senthilkumaran N.","year":"2009","unstructured":"N. Senthilkumaran and R. Rajesh. 2009. Edge detection techniques for image segmentation\u2014A survey of soft computing approaches. International Journal of Recent Trends in Engineering 1, 2 (2009), 250--254.","journal-title":"International Journal of Recent Trends in Engineering"},{"key":"e_1_2_2_153_1","first-page":"3","article-title":"Automated medical image segmentation techniques","volume":"35","author":"Sharma Neeraj","year":"2010","unstructured":"Neeraj Sharma and Lalit M. Aggarwal. 2010. Automated medical image segmentation techniques. Journal of Medical Physics\/Association of Medical Physicists of India 35, 1 (2010), 3.","journal-title":"Journal of Medical Physics\/Association of Medical Physicists of India"},{"key":"e_1_2_2_154_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.868688"},{"key":"e_1_2_2_155_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2465960"},{"key":"e_1_2_2_156_1","doi-asserted-by":"publisher","DOI":"10.1007\/11744023_1"},{"key":"e_1_2_2_157_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2008.2011119"},{"key":"e_1_2_2_158_1","volume-title":"Proceedings of the 2017 OCEANS--Anchorage Conference. IEEE","author":"Song Yan","year":"2017","unstructured":"Yan Song, Yuemei Zhu, Guangliang Li, Chen Feng, Bo He, and Tianhong Yan. 2017. Side scan sonar segmentation using deep convolutional neural network. In Proceedings of the 2017 OCEANS--Anchorage Conference. 
IEEE, Los Alamitos, CA, 1--4."},{"key":"e_1_2_2_159_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2004.825627"},{"key":"e_1_2_2_160_1","doi-asserted-by":"publisher","DOI":"10.5555\/338958.2813155"},{"key":"e_1_2_2_161_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2010.07.013"},{"key":"e_1_2_2_162_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2002.806231"},{"key":"e_1_2_2_163_1","volume-title":"Proceedings of the MLITS NIPS Workshop.","author":"Treml Michael","year":"2016","unstructured":"Michael Treml, Jos\u00e9 Arjona-Medina, Thomas Unterthiner, Rupesh Durgesh, Felix Friedmann, Peter Schuberth, et al. 2016. Speeding up semantic segmentation for autonomous driving. In Proceedings of the MLITS NIPS Workshop."},{"key":"e_1_2_2_164_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00066"},{"key":"e_1_2_2_165_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-013-0620-5"},{"key":"e_1_2_2_166_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126456"},{"key":"e_1_2_2_167_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2005.02.002"},{"key":"e_1_2_2_168_1","volume-title":"IDD: A dataset for exploring problems of autonomous navigation in unconstrained environments. arXiv:1811.10200.","author":"Varma G.","year":"2018","unstructured":"G. Varma, A. Subramanian, A. Namboodiri, M. Chandraker, and C. V. Jawahar. 2018. IDD: A dataset for exploring problems of autonomous navigation in unconstrained environments. arXiv:1811.10200."},{"key":"e_1_2_2_169_1","volume-title":"Coco-text: Dataset and benchmark for text detection and recognition in natural images. arXiv:1601.07140.","author":"Veit Andreas","year":"2016","unstructured":"Andreas Veit, Tomas Matera, Lukas Neumann, Jiri Matas, and Serge Belongie. 2016. Coco-text: Dataset and benchmark for text detection and recognition in natural images. 
arXiv:1601.07140."},{"key":"e_1_2_2_170_1","volume-title":"Proceedings of the 2002 7th IEEE International Workshop on Cellular Neural Networks and Their Applications (CNNA\u201902)","author":"Vilarino David L.","unstructured":"David L. Vilarino, Diego Cabello, and Victor M. Brea. 2002. An analogic CNN-algorithm of pixel level snakes for tracking and surveillance tasks. In Proceedings of the 2002 7th IEEE International Workshop on Cellular Neural Networks and Their Applications (CNNA\u201902). IEEE, Los Alamitos, CA, 84--91."},{"key":"e_1_2_2_171_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126402"},{"key":"e_1_2_2_172_1","doi-asserted-by":"publisher","DOI":"10.1109\/51.566151"},{"key":"e_1_2_2_173_1","unstructured":"Xide Xia and Brian Kulis. 2017. W-Net: A deep model for fully unsupervised image segmentation. arXiv:1711.08506."},{"key":"e_1_2_2_174_1","doi-asserted-by":"publisher","DOI":"10.1117\/12.2030637"},{"key":"e_1_2_2_175_1","unstructured":"Ning Xu Linjie Yang Yuchen Fan Dingcheng Yue Yuchen Liang Jianchao Yang and Thomas Huang. 2018. YouTube-VOS: A large-scale video object segmentation benchmark. arXiv:1809.03327."},{"key":"e_1_2_2_176_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.407"},{"key":"e_1_2_2_177_1","unstructured":"Fisher Yu and Vladlen Koltun. 2015. Multi-scale context aggregation by dilated convolutions. arXiv:1511.07122."},{"key":"e_1_2_2_178_1","unstructured":"Fisher Yu Wenqi Xian Yingying Chen Fangchen Liu Mike Liao Vashisht Madhavan and Trevor Darrell. 2018. BDD100K: A diverse driving video database with scalable annotation tooling. arXiv:1805.04687."},{"key":"e_1_2_2_179_1","doi-asserted-by":"publisher","DOI":"10.1109\/LGRS.2013.2261453"},{"key":"e_1_2_2_180_1","volume-title":"Proceedings of the European Conference on Computer Vision. 818--833","author":"Matthew","unstructured":"Matthew D. Zeiler and Rob Fergus. 2014. Visualizing and understanding convolutional networks. 
In Proceedings of the European Conference on Computer Vision. 818--833."},{"key":"e_1_2_2_181_1","volume-title":"Proceedings of the MICCAI Workshop on Multimodal Brain Tumor Segmentation Challenge (BRATS'14)","author":"Zikic Darko","year":"2014","unstructured":"Darko Zikic, Yani Ioannou, Matthew Brown, and Antonio Criminisi. 2014. Segmentation of brain tumor tissues with convolutional neural networks. In Proceedings of the MICCAI Workshop on Multimodal Brain Tumor Segmentation Challenge (BRATS'14). 36--39."},{"key":"e_1_2_2_182_1","doi-asserted-by":"crossref","unstructured":"Xiaohang Zhan, Xingang Pan, Ziwei Liu, Dahua Lin, and Chen Change Loy. 2019. Self-supervised learning via conditional motion propagation. arXiv:1903.11412.","DOI":"10.1109\/CVPR.2019.00198"},{"key":"e_1_2_2_183_1","doi-asserted-by":"publisher","DOI":"10.5555\/645531.656002"},{"key":"e_1_2_2_184_1","volume-title":"Proceedings of the European Conference on Computer Vision. 649--666","author":"Zhang Richard","unstructured":"Richard Zhang, Phillip Isola, and Alexei A. Efros. 2016. Colorful image colorization. In Proceedings of the European Conference on Computer Vision. 
649--666."},{"key":"e_1_2_2_185_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11633-017-1053-3"},{"key":"e_1_2_2_186_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.660"}],"container-title":["ACM Computing Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3329784","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3329784","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,25]],"date-time":"2025-06-25T13:25:45Z","timestamp":1750857945000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3329784"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,8,30]]},"references-count":186,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,7,31]]}},"alternative-id":["10.1145\/3329784"],"URL":"https:\/\/doi.org\/10.1145\/3329784","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"value":"0360-0300","type":"print"},{"value":"1557-7341","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,8,30]]},"assertion":[{"value":"2018-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-05-01","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-08-30","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}