{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:08:43Z","timestamp":1750219723397,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":32,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,11,1]],"date-time":"2023-11-01T00:00:00Z","timestamp":1698796800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,11,2]]},"DOI":"10.1145\/3606041.3618067","type":"proceedings-article","created":{"date-parts":[[2023,11,1]],"date-time":"2023-11-01T22:06:27Z","timestamp":1698876387000},"page":"33-39","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Exploiting Temporal Information in Real-time Portrait Video Segmentation"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0315-2056","authenticated-orcid":false,"given":"Weichen","family":"Xu","sequence":"first","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4797-4155","authenticated-orcid":false,"given":"Yezhi","family":"Shen","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7741-5277","authenticated-orcid":false,"given":"Qian","family":"Lin","sequence":"additional","affiliation":[{"name":"HP Inc., Palo Alto, CA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5608-8249","authenticated-orcid":false,"given":"Jan P.","family":"Allebach","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, 
USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3863-3220","authenticated-orcid":false,"given":"Fengqing","family":"Zhu","sequence":"additional","affiliation":[{"name":"Purdue University, West Lafayette, IN, USA"}]}],"member":"320","published-online":{"date-parts":[[2023,11]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"L. Chen, G. Papandreou, F. Schroff, and H. Adam. 2017. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017)."},{"key":"e_1_3_2_1_2_1","unstructured":"X. Chen, D. Qi, and J. Shen. 2019. Boundary-aware network for fast and high-accuracy portrait segmentation. arXiv preprint arXiv:1901.03814 (2019)."},{"key":"e_1_3_2_1_3_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (June","author":"Cheng B.","year":"2021","unstructured":"B. Cheng, R. Girshick, P. Doll\u00e1r, A. C. Berg, and A. Kirillov. 2021. Boundary IoU: Improving object-centric image segmentation evaluation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (June 2021), 15334--15342."},{"key":"e_1_3_2_1_4_1","volume-title":"Pp-humanseg: Connectivity-aware portrait segmentation with a large-scale teleconferencing video dataset.","author":"Chu Lutao","year":"2022","unstructured":"Lutao Chu, Yi Liu, Zewu Wu, Shiyu Tang, Guowei Chen, Yuying Hao, Juncai Peng, Zhiliang Yu, Zeyu Chen, Baohua Lai, et al. 2022. Pp-humanseg: Connectivity-aware portrait segmentation with a large-scale teleconferencing video dataset. (2022), 202--209."},
{"key":"e_1_3_2_1_5_1","volume-title":"2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (May","author":"Du X.","year":"2019","unstructured":"X. Du, X. Wang, D. Li, J. Zhu, S. Tasci, C. Upright, S. Walsh, and L. Davis. 2019. Boundary-sensitive network for portrait segmentation. 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (May 2019), 1--8. Lille, France."},{"key":"e_1_3_2_1_6_1","volume-title":"Attention mechanisms in computer vision: A survey. Computational visual media","author":"Guo Meng-Hao","year":"2022","unstructured":"Meng-Hao Guo, Tian-Xing Xu, Jiang-Jiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph R Martin, Ming-Ming Cheng, and Shi-Min Hu. 2022. Attention mechanisms in computer vision: A survey. Computational visual media, Vol. 8, 3 (2022), 331--368."},{"key":"e_1_3_2_1_7_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (June","author":"He K.","year":"2016","unstructured":"K. He, X. Zhang, S. Ren, and J. Sun. 2016. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (June 2016), 770--778. Las Vegas, NV."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00140"},{"key":"e_1_3_2_1_9_1","volume-title":"DDCNet: A Lightweight Network with Variable Receptive Field for Real-Time Portrait Segmentation in Complex Environment. In Computer Graphics International Conference. Springer, 465--476","author":"Huang Dongjin","year":"2022","unstructured":"Dongjin Huang, Di Wu, Jinhua Liu, and Yushan Lv. 2022. DDCNet: A Lightweight Network with Variable Receptive Field for Real-Time Portrait Segmentation in Complex Environment. In Computer Graphics International Conference. Springer, 465--476."},
{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.3390\/e23020197"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2104.09752"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP40778.2020.9190790"},{"key":"e_1_3_2_1_13_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (June","author":"Lin S.","year":"2021","unstructured":"S. Lin, A. Ryabtsev, S. Sengupta, B. L. Curless, S. M. Seitz, and I. Kemelmacher-Shlizerman. 2021. Real-time high-resolution background matting. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (June 2021), 8762--8771."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2108.11515"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00941"},{"volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (March 2020)","author":"Park H.","key":"e_1_3_2_1_16_1","unstructured":"H. Park, L. Sjosund, Y. Yoo, N. Monet, J. Bang, and N. Kwak. 2020. SINet: Extreme lightweight portrait segmentation networks with spatial squeeze module and information blocking decoder. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (March 2020), 2066--2074. Snowmass Village, CO."},
{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1612.02646"},{"key":"e_1_3_2_1_18_1","volume-title":"International Conference on Medical Image Computing and Computer-assisted Intervention (November","author":"Ronneberger O.","year":"2015","unstructured":"O. Ronneberger, P. Fischer, and T. Brox. 2015. U-Net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-assisted Intervention (November 2015), 234--241. Munich, Germany."},{"key":"e_1_3_2_1_19_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (June","author":"Sandler M.","year":"2018","unstructured":"M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. Chen. 2018. MobileNetV2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (June 2018), 4510--4520. Salt Lake City, UT."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12814"},{"key":"e_1_3_2_1_21_1","volume-title":"Efficient Models for Real-Time Person Segmentation on Mobile Phones. 2021 29th European Signal Processing Conference (August","author":"Strohmayer J.","year":"2021","unstructured":"J. Strohmayer, J. Knapp, and M. Kampel. 2021. Efficient Models for Real-Time Person Segmentation on Mobile Phones. 2021 29th European Signal Processing Conference (August 2021), 651--655."},
{"key":"e_1_3_2_1_22_1","unstructured":"Supervisely. [n. d.]. Supervisely Person Dataset. https:\/\/github.com\/supervisely-ecosystem\/persons."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"crossref","unstructured":"Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, Yang Zhao, Dong Liu, Yadong Mu, Mingkui Tan, Xinggang Wang, et al. 2020. Deep high-resolution representation learning for visual recognition. IEEE transactions on pattern analysis and machine intelligence, Vol. 43, 10 (2020), 3349--3364.","DOI":"10.1109\/TPAMI.2020.2983686"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2021.108143"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2014.273"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2019.8803063"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.2352\/EI.2022.34.8.IMAGE-263"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-021-01515-2"},{"key":"e_1_3_2_1_29_1","volume-title":"Proceedings of the European conference on computer vision (September","author":"Yu C.","year":"2018","unstructured":"C. Yu, J. Wang, C. Peng, C. Gao, G. Yu, and N. Sang. 2018. Bisenet: Bilateral segmentation network for real-time semantic segmentation. Proceedings of the European conference on computer vision (September 2018), 325--341. Munich, Germany."},
{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cag.2019.03.007"},{"key":"e_1_3_2_1_31_1","volume-title":"Lightweight Portrait Segmentation Via Edge-Optimized Attention. In ICASSP 2023--2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 1--5.","author":"Zhang Xinyue","year":"2023","unstructured":"Xinyue Zhang, Guodong Wang, Lijuan Yang, and Chenglizhao Chen. 2023. Lightweight Portrait Segmentation Via Edge-Optimized Attention. In ICASSP 2023--2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 1--5."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2019.00281"}],"event":{"name":"MM '23: The 31st ACM International Conference on Multimedia","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Ottawa ON Canada","acronym":"MM '23"},"container-title":["Proceedings of the 4th International Workshop on Human-centric Multimedia Analysis"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3606041.3618067","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3606041.3618067","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:36:20Z","timestamp":1750178180000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3606041.3618067"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11]]},"references-count":32,"alternative-id":["10.1145\/3606041.3618067","10.1145\/3606041"],"URL":"https:\/\/doi.org\/10.1145\/3606041.3618067","relation":{},"subject":[],"published":{"date-parts":[[2023,11]]},"assertion":[{"value":"2023-11-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}