{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T02:01:51Z","timestamp":1760061711357},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2010,8,19]],"date-time":"2010-08-19T00:00:00Z","timestamp":1282176000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/2.0"},{"start":{"date-parts":[[2010,8,19]],"date-time":"2010-08-19T00:00:00Z","timestamp":1282176000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/2.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Braz Comput Soc"],"published-print":{"date-parts":[[2010,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Videos have become a predominant part of users\u2019 daily lives on the Web, especially with the emergence of online video sharing systems such as YouTube. Since users can independently share videos in these systems, some videos can be duplicates (i.e., identical or very similar videos). Despite having the same content, there are some potential context differences in duplicates, for example, in their associated metadata (i.e., tags, title) and their popularity scores (i.e., number of views, comments). Quantifying these differences is important to understand how users associate metadata to videos and to understand possible reasons that influence the popularity of videos, which is crucial for video information retrieval mechanisms, association of advertisements to videos, and performance issues related to the use of caches and content distribution networks (CDNs). This work presents a wide quantitative characterization of the context differences among identical contents. Using a large video sample collected from YouTube, we construct a dataset of duplicates. Our measurement analysis provides several interesting findings that can have implications for how videos should be retrieved in video sharing websites as well as for advertising systems that need to understand the role that users play when they create content in services such as YouTube.<\/jats:p>","DOI":"10.1007\/s13173-010-0019-x","type":"journal-article","created":{"date-parts":[[2010,8,18]],"date-time":"2010-08-18T18:10:55Z","timestamp":1282155055000},"page":"201-214","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Equal but different: a contextual analysis of duplicated videos on\u00a0YouTube"],"prefix":"10.1007","volume":"16","author":[{"given":"Tiago","family":"Rodrigues","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fabr\u00edcio","family":"Benevenuto","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Virg\u00edlio","family":"Almeida","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jussara","family":"Almeida","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marcos","family":"Gon\u00e7alves","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2010,8,19]]},"reference":[{"key":"19_CR1","unstructured":"Adar E, Zhang L, Adamic L, Lukose R (2004) Implicit structure and the dynamics of blogspace. In: Workshop on the Weblogging Ecosystem"},{"key":"19_CR2","volume-title":"Modern information retrieval","author":"R Baeza-Yates","year":"1999","unstructured":"Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. ACM\/Addison-Wesley, New York\/Reading"},{"key":"19_CR3","doi-asserted-by":"crossref","unstructured":"Benevenuto F, Duarte F, Rodrigues T, Almeida V, Almeida J, Ross K (2008) Understanding video interactions in youtube. In: ACM int\u2019l conference on multimedia (MM)","DOI":"10.1145\/1459359.1459480"},{"key":"19_CR4","doi-asserted-by":"crossref","unstructured":"Benevenuto F, Rodrigues T, Almeida V, Almeida J, Zhang C, Ross K (2008) Identifying video spammers in online social networks. In: Workshop on adversarial information retrieval on the web (AIRWeb)","DOI":"10.1145\/1451983.1451996"},{"key":"19_CR5","doi-asserted-by":"crossref","unstructured":"Benevenuto F, Rodrigues T, Almeida V, Almeida J, Gon\u00e7alves M (2009) Detecting spammers and content promoters in online video social networks. In: Int\u2019l ACM SIGIR","DOI":"10.1109\/INFCOMW.2009.5072127"},{"key":"19_CR6","doi-asserted-by":"crossref","unstructured":"Benevenuto F, Rodrigues T, Almeida V, Almeida J, Ross K (2009) Video interactions in online video social networks. In: ACM trans on multimedia computing, communications and applications (TOMCCAP)","DOI":"10.1145\/1596990.1596994"},{"key":"19_CR7","doi-asserted-by":"crossref","unstructured":"Cha M, Kwak H, Rodriguez P, Ahn Y, Moon S (2007) I tube, you tube, everybody tubes: analyzing the world\u2019s largest user generated content video system. In: ACM SIGCOMM conference on Internet measurement (IMC)","DOI":"10.1145\/1298306.1298309"},{"key":"19_CR8","doi-asserted-by":"crossref","unstructured":"Cherubini M, Oliveira R, Oliver N (2009) Understanding near-duplicate videos: a\u00a0user-centric approach. In: ACM int\u2019l conference on multimedia (MM)","DOI":"10.1145\/1631272.1631280"},{"key":"19_CR9","unstructured":"Comscore (2010) http:\/\/www.comscore.com\/. June 2010"},{"key":"19_CR10","unstructured":"Comscore (2010) Youtube now 25 percent of all Google searches. http:\/\/tinyurl.com\/4t32l4. June 2010"},{"key":"19_CR11","unstructured":"del.icio.us web site (2010) http:\/\/www.delicious.com. June 2010"},{"key":"19_CR12","unstructured":"Flickr web site (2010) http:\/\/www.flickr.com. June 2010"},{"key":"19_CR13","doi-asserted-by":"crossref","unstructured":"Gill P, Arlitt M, Li Z, Mahanti A (2007) Youtube traffic characterization: a\u00a0view from the edge. In: ACM SIGCOMM conference on Internet measurement (IMC)","DOI":"10.1145\/1298306.1298310"},{"key":"19_CR14","doi-asserted-by":"crossref","unstructured":"Golbeck J (2008) Trust and nuanced profile similarity in online social networks. Technical report","DOI":"10.1145\/1594173.1594174"},{"issue":"2","key":"19_CR15","doi-asserted-by":"publisher","first-page":"196","DOI":"10.1109\/TMM.2008.2009673","volume":"11","author":"A Hauptmann","year":"2009","unstructured":"Hauptmann A, Wu X, Ngo C, Tan H (2009) Real-time near-duplicate elimination for web video search with content and context. IEEE Trans Multimedia 11(2):196\u2013207","journal-title":"IEEE Trans Multimedia"},{"key":"19_CR16","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1109\/MIC.2007.125","volume":"11","author":"P Heymann","year":"2007","unstructured":"Heymann P, Koutrika G, Garcia-Molina H (2007) Fighting spam on social web sites: a\u00a0survey of approaches and future challenges. IEEE Internet Comput 11:36\u201345","journal-title":"IEEE Internet Comput"},{"key":"19_CR17","doi-asserted-by":"crossref","unstructured":"Huang Z, Wang L, Shen H, Shao J, Zhou X (2009) Online near-duplicate video clip detection and retrieval: an accurate and fast system. In: IEEE int\u2019l conference on data engineering (ICDE)","DOI":"10.1109\/ICDE.2009.17"},{"key":"19_CR18","unstructured":"Ispell (2010) http:\/\/www.gnu.org\/software\/ispell\/ispell.html. June 2010"},{"key":"19_CR19","volume-title":"The art of computer systems performance analysis: techniques for experimental design, measurement, simulation, and modeling","author":"R Jain","year":"1991","unstructured":"Jain R (1991) The art of computer systems performance analysis: techniques for experimental design, measurement, simulation, and modeling. Wiley, New York"},{"key":"19_CR20","volume-title":"Readings in information retrieval","year":"1997","unstructured":"Jones KS, Willett P (eds) (1997) Readings in information retrieval. Morgan Kaufmann, San Mateo"},{"key":"19_CR21","doi-asserted-by":"crossref","unstructured":"Koutrika G, Effendi F, Gy\u00f6ngyi Z, Heymann P, Garcia-Molina H (2007) Combating spam in tagging systems. In: Workshop on adversarial information retrieval on the Web (AIRWeb)","DOI":"10.1145\/1244408.1244420"},{"key":"19_CR22","unstructured":"Lerman K, Jones L (2007) Social browsing on Flickr. In: Int\u2019l conference on weblogs and social media (ICWSM)"},{"key":"19_CR23","doi-asserted-by":"crossref","unstructured":"Li X, Guo L, Zhao Y (2008) Tag-based social interest discovery. In: Int\u2019l World Wide Web conference (WWW)","DOI":"10.1145\/1367497.1367589"},{"key":"19_CR24","doi-asserted-by":"crossref","unstructured":"Marshall CC (2009) No bull, no spin: a\u00a0comparison of tags with other forms of user metadata. In: ACM\/IEEE conference on digital libraries (JCDL)","DOI":"10.1145\/1555400.1555438"},{"key":"19_CR25","doi-asserted-by":"crossref","unstructured":"Oliveira R, Cherubini M, Oliver N (2009) Human perception of near-duplicate videos. In: Int\u2019l conference on human-computer interaction (INTERACT)","DOI":"10.1145\/1631272.1631280"},{"key":"19_CR26","volume-title":"Information retrieval","author":"C Rijsbergen","year":"1979","unstructured":"Rijsbergen C (1979) Information retrieval. Butterworth, Stoneham"},{"key":"19_CR27","unstructured":"Rodrigues T, Benevenuto F, Almeida V, Almeida J, Gon\u00e7alves M (2009) Uma an\u00e1lise contextual de conte\u00fado duplicado no youtube. In: Simp\u00f3sio Brasileiro de sistemas multim\u00eddia e Web (WebMedia)"},{"key":"19_CR28","doi-asserted-by":"crossref","unstructured":"Suchanek F, Vojnovic M, Gunawardena D (2008) Social tags: meaning and suggestions. In: ACM conference on information and knowledge management (CIKM)","DOI":"10.1145\/1458082.1458114"},{"key":"19_CR29","doi-asserted-by":"crossref","unstructured":"Tan H-K, Ngo C-W, Hong R, Chua T-S (2009) Scalable detection of partial near-duplicate videos by visual-temporal consistency. In: ACM international conference on multimedia (MM)","DOI":"10.1145\/1631272.1631295"},{"key":"19_CR30","doi-asserted-by":"crossref","unstructured":"Wu X, Hauptmann A, Ngo C (2007) Practical elimination of near-duplicates from web video search. In: Int\u2019l conference on multimedia","DOI":"10.1145\/1291233.1291280"},{"key":"19_CR31","doi-asserted-by":"crossref","unstructured":"Zhu J, Hoi S, Lyu M, Yan S (2008) Near-duplicate keyframe retrieval by nonrigid image matching. In: ACM int\u2019l conference on multimedia (MM)","DOI":"10.1145\/1459359.1459366"},{"key":"19_CR32","doi-asserted-by":"crossref","unstructured":"Zink M, Suh K, Gu Y, Kurose J (2008) Watch global, cache local: Youtube network traces at a campus network\u2014measurements and implications. In: IEEE multimedia computing and networking (MMCN)","DOI":"10.1117\/12.774903"}],"container-title":["Journal of the Brazilian Computer Society"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13173-010-0019-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s13173-010-0019-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s13173-010-0019-x","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13173-010-0019-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,11,6]],"date-time":"2021-11-06T13:03:27Z","timestamp":1636203807000},"score":1,"resource":{"primary":{"URL":"https:\/\/journal-bcs.springeropen.com\/articles\/10.1007\/s13173-010-0019-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,8,19]]},"references-count":32,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2010,9]]}},"alternative-id":["19"],"URL":"https:\/\/doi.org\/10.1007\/s13173-010-0019-x","relation":{},"ISSN":["0104-6500","1678-4804"],"issn-type":[{"value":"0104-6500","type":"print"},{"value":"1678-4804","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,8,19]]},"assertion":[{"value":"8 March 2010","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 July 2010","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 August 2010","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}