{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T09:02:49Z","timestamp":1760346169481,"version":"3.41.0"},"reference-count":23,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2010,8,1]],"date-time":"2010-08-01T00:00:00Z","timestamp":1280620800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2010,8]]},"abstract":"<jats:p>Popular content in video sharing websites (e.g., YouTube) is usually replicated via identical copies or near-duplicates. These duplicates are usually studied because they pose a threat to site owners in terms of wasted disk space, or privacy infringements. Furthermore, this content might potentially hinder the users' experience in these websites. The research presented in this article focuses around the central argument that there is no agreement on the technical definition of what these near-duplicates are, and, more importantly, there is no strong evidence that users of video sharing websites would like this content to be removed. Most scholars define near-duplicate video clips (NDVC) by means of non-semantic features (e.g., different image\/audio quality), while a few also include semantic features (i.e., different videos of similar content). However, it is unclear what features contribute to the human perception of near-duplicate videos. The findings of four large scale online surveys that were carried out in the context of our research confirm the relevance of both types of features. 
Some of our findings confirm the adopted definitions of NDVC whereas other findings are surprising: Near-duplicate videos with different image quality, audio quality, or with\/without overlays were perceived as NDVC. However, the same could not be verified when videos differed by more than one of these features at the same time. With respect to semantics, it is yet unclear the exact role that it plays in relation to the features that make videos alike. From a user's perspective, participants preferred in most cases to see only one of the NDVC in the search results of a video search query and they were more tolerant to changes in the audio than in the video tracks. Based on all these findings, we propose a new user-centric NDVC definition and present implications for how duplicate content should be dealt with by video sharing Web sites.<\/jats:p>","DOI":"10.1145\/1823746.1823749","type":"journal-article","created":{"date-parts":[[2010,8,31]],"date-time":"2010-08-31T13:05:55Z","timestamp":1283259955000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["Looking at near-duplicate videos from a human-centric perspective"],"prefix":"10.1145","volume":"6","author":[{"given":"Rodrigo De","family":"Oliveira","sequence":"first","affiliation":[{"name":"Telefonica Research, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mauro","family":"Cherubini","sequence":"additional","affiliation":[{"name":"Telefonica Research, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nuria","family":"Oliver","sequence":"additional","affiliation":[{"name":"Telefonica Research, Barcelona, 
Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2010,8,27]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2007.09.016"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1459359.1459480"},{"key":"e_1_2_2_3_1","unstructured":"Bruce B. Green P. R. and Georgeson M. A. 1996. Visual Perception. 3rd Ed. Psychology Press."},{"volume-title":"Proceedings of the 18th International Florida Artificial Intelligence Research Society Conference. I. Russell and Z. Markov, Eds. 245--251","author":"Celebi M. E.","key":"e_1_2_2_4_1","unstructured":"Celebi, M. E. and Aslandogan, Y. A. 2005. Human perception-driven, similarity-based access to image databases. In Proceedings of the 18th International Florida Artificial Intelligence Research Society Conference. I. Russell and Z. Markov, Eds. 245--251."},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1298306.1298309"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1631272.1631489"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1743384.1743437"},{"volume-title":"Proceedings of the SPIE\/ACM Conference on Multimedia Computing and Networking (MMCN).","author":"Gill P.","key":"e_1_2_2_8_1","unstructured":"Gill, P., Li, Z., Arlitt, M., and Mahanti, A. 2008. Characterizing users sessions on youtube. 
In Proceedings of the SPIE\/ACM Conference on Multimedia Computing and Networking (MMCN)."},{"volume-title":"Proceedings of Neural Networks for Signal Processing. 385--394","author":"Guyader N.","key":"e_1_2_2_9_1","unstructured":"Guyader, N., Borgne, H. L., H\u00e9rault, J., and Gu\u00e9rin-Dugu\u00e9, A. 2002. Towards the introduction of human perception in a natural scene classification system. In Proceedings of Neural Networks for Signal Processing. 385--394."},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242804"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1180639.1180654"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1462027.1462029"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1435497.1435498"},{"key":"e_1_2_2_14_1","doi-asserted-by":"crossref","unstructured":"Payne J. S. and Stonham T. J. 2001. Can texture and image content retrieval methods match human perception? In Proceedings of Intelligent Multimedia Video and Speech Processing. 154--157.","DOI":"10.1109\/ISIMP.2001.925355"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1006\/jvci.1999.0413"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISM.2009.93"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/1454159.1454232"},
{"volume-title":"Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB'07)","author":"Shen H. T.","key":"e_1_2_2_18_1","unstructured":"Shen, H. T., Zhou, X., Huang, Z., Shao, J., and Zhou, X. 2007. Uqlips: a real-time near-duplicate video clip detection system. In Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB'07). VLDB Endowment, 1374--1377."},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-12275-0_22"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.84.4.327"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1291233.1291280"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1631058.1631073"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2009.2021794"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1823746.1823749","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1823746.1823749","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T14:47:16Z","timestamp":1750258036000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1823746.1823749"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,8]]},"references-count":23,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2010,8]]}},"alternative-id":["10.1145\/1823746.1823749"],"URL":"https:\/\/doi.org\/10.1145\/1823746.1823749","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"type":"print","value":"1551-6857"},{"type":"electronic","value":"1551-6865"}],"subject":[],"published":{"da
te-parts":[[2010,8]]},"assertion":[{"value":"2010-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2010-08-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}