{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T03:02:11Z","timestamp":1776740531113,"version":"3.51.2"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2019,5,31]],"date-time":"2019-05-31T00:00:00Z","timestamp":1559260800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61802197, 61702278, 61872082, 61472184, 61703212, and 61602252"],"award-info":[{"award-number":["61802197, 61702278, 61872082, 61472184, 61703212, and 61602252"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CNS-1837146"],"award-info":[{"award-number":["CNS-1837146"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Natural Science Foundation of Jiangsu Province of China","award":["BK20160964, BK20160902, BK20160971, and BK20160967"],"award-info":[{"award-number":["BK20160964, BK20160902, BK20160971, and BK20160967"]}]},{"name":"Jiangsu Innovation and Entrepreneurship (Shuangchuang) Program"},{"name":"Project through the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2019,5,31]]},"abstract":"<jats:p>In this study, we propose an effective and efficient algorithm for unconstrained video object segmentation, which is achieved in a Markov random field (MRF). 
In the MRF graph, each node is modeled as a superpixel and labeled as either foreground or background during the segmentation process. The unary potential is computed for each node by learning a transductive SVM classifier under supervision by a few labeled frames. The pairwise potential is used for the spatial-temporal smoothness. In addition, a high-order potential based on the multinomial event model is employed to enhance the appearance consistency throughout the frames. To minimize this intractable feature, we also introduce a more efficient technique that simply extends the original MRF structure. The proposed approach was evaluated in experiments with different measures and the results based on a benchmark demonstrated its effectiveness compared with other state-of-the-art algorithms.<\/jats:p>","DOI":"10.1145\/3321507","type":"journal-article","created":{"date-parts":[[2019,6,6]],"date-time":"2019-06-06T12:28:42Z","timestamp":1559824122000},"page":"1-15","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Appearance-consistent Video Object Segmentation Based on a Multinomial Event Model"],"prefix":"10.1145","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4448-2617","authenticated-orcid":false,"given":"Yadang","family":"Chen","sequence":"first","affiliation":[{"name":"Nanjing University of Information Science and Technology, Nanjing, China"}]},{"given":"Chuanyan","family":"Hao","sequence":"additional","affiliation":[{"name":"Nanjing University of Posts and Telecommunications, Nanjing, China"}]},{"given":"Alex X.","family":"Liu","sequence":"additional","affiliation":[{"name":"Michigan State University, USA and Nanjing University, Nanjing, China"}]},{"given":"Enhua","family":"Wu","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences, Beijing, 
China"}]}],"member":"320","published-online":{"date-parts":[[2019,6,5]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Federico Perazzi, and Jordi Pont-Tuset.","author":"Caelles Sergi","year":"2018"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2019.2952984"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-017-4520-5"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the British Machine Vision Conference.","author":"Faktor A."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299114"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.81"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2010.5539893"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916)","author":"He K."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3063532"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10593-2_43"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.","author":"Jampani V."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.","author":"Jang W. D."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.82"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.374"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3231598"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.784"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-008-0202-0"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the 25th International Conference on Neural Information Processing Systems. 
Curran Associates Inc., 1097--1105","author":"Krizhevsky Alex"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.273"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2005.16"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.87"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the AAAI Workshop on Learning for Text Categorization.","author":"McCallum Andrew","year":"1998"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.242"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3159170"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.223"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.","author":"Perazzi F."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.","author":"Perazzi F."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.369"},{"key":"e_1_2_1_29_1","unstructured":"Jordi Pont-Tuset Federico Perazzi Sergi Caelles Pablo Arbel\u00e1ez Alexander Sorkine-Hornung and Luc Van Gool. 2017. The 2017 DAVIS challenge on video object segmentation. Retrieved from arXiv:1704.00675."},{"key":"e_1_2_1_30_1","volume-title":"Very deep convolutional networks for large-scale image recognition. 
(Apr","author":"Simonyan Karen","year":"2015"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-011-0512-5"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.","author":"Tsai Yi-Hsuan"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3241053"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2016.2527378"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298961"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298835"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.107"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-016-0906-5"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 20th International Conference on Neural Information Processing Systems (NIPS\u201907)","author":"Xu Zenglin"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2015.2500820"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and 
Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3321507","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3321507","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3321507","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:54:38Z","timestamp":1750204478000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3321507"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,31]]},"references-count":40,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2019,5,31]]}},"alternative-id":["10.1145\/3321507"],"URL":"https:\/\/doi.org\/10.1145\/3321507","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,5,31]]},"assertion":[{"value":"2018-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-06-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}