{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T21:11:19Z","timestamp":1768597879929,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":35,"publisher":"ACM","license":[{"start":{"date-parts":[[2011,10,28]],"date-time":"2011-10-28T00:00:00Z","timestamp":1319760000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2011,10,28]]},"DOI":"10.1145\/2065023.2065031","type":"proceedings-article","created":{"date-parts":[[2011,11,16]],"date-time":"2011-11-16T10:40:21Z","timestamp":1321440021000},"page":"27-34","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Detection of near-duplicate user generated contents"],"prefix":"10.1145","author":[{"given":"Enrique","family":"Vall\u00e9s","sequence":"first","affiliation":[{"name":"Universidad Polit\u00e9cnica de Valencia, Valencia, Spain"}]},{"given":"Paolo","family":"Rosso","sequence":"additional","affiliation":[{"name":"Universidad Polit\u00e9cnica de Valencia, Valencia, Spain"}]}],"member":"320","published-online":{"date-parts":[[2011,10,28]]},"reference":[{"key":"e_1_3_2_1_1_1","first-page":"5","volume-title":"The 6th SEAAIR Annual Conference","author":"Ahmad H.","year":"2006","unstructured":"H. Ahmad . Plagiarism detection systems : An evaluation of several systems . In The 6th SEAAIR Annual Conference , pages 5 -- 7 , Langkawi , 2006 . H. Ahmad. Plagiarism detection systems : An evaluation of several systems. In The 6th SEAAIR Annual Conference, pages 5--7, Langkawi, 2006."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2034691.2034742"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICMLA.2009.22"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1774088.1774481"},{"key":"e_1_3_2_1_5_1","volume-title":"Detecting student copying in a corpus of science laboratory reports: simple and smart approaches","author":"Atwell E.","year":"2002","unstructured":"E. Atwell , P. Gent , J. Medori , and C. Souter . Detecting student copying in a corpus of science laboratory reports: simple and smart approaches , 2002 . E. Atwell, P. Gent, J. Medori, and C. Souter. Detecting student copying in a corpus of science laboratory reports: simple and smart approaches, 2002."},{"key":"e_1_3_2_1_6_1","first-page":"37","volume-title":"Proc. of the 23rd International Conference on Computational Linguistics, COLING-2010","author":"Barr\u00f3n-Cede\u00f1o A.","year":"2010","unstructured":"A. Barr\u00f3n-Cede\u00f1o , P. Rosso , E. Agirre , and G. Labaka . Plagiarism detection across distant language pairs . In Proc. of the 23rd International Conference on Computational Linguistics, COLING-2010 , pages 37 -- 45 , Beijing, China , 2010 . A. Barr\u00f3n-Cede\u00f1o, P. Rosso, E. Agirre, and G. Labaka. Plagiarism detection across distant language pairs. In Proc. of the 23rd International Conference on Computational Linguistics, COLING-2010, pages 37--45, Beijing, China, 2010."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-00382-0_42"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/223784.223855"},{"key":"e_1_3_2_1_10_1","first-page":"21","volume-title":"Compression and Complexity of Sequences","author":"Broder A. Z.","year":"1997","unstructured":"A. Z. Broder . On the resemblance and containment of documents . Compression and Complexity of Sequences , pages 21 -- 29 , 1997 . A. Z. Broder. On the resemblance and containment of documents. Compression and Complexity of Sequences, pages 21--29, 1997."},{"key":"e_1_3_2_1_11_1","first-page":"1157","volume-title":"Selected papers from the sixth international conference on World Wide Web","author":"Broder A. Z.","year":"1997","unstructured":"A. Z. Broder , S. C. Glassman , M. S. Manasse , and G. Zweig . Syntactic clustering of the web . In Selected papers from the sixth international conference on World Wide Web , pages 1157 -- 1166 , Essex, UK , 1997 . Elsevier Science Publishers Ltd . A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig. Syntactic clustering of the web. In Selected papers from the sixth international conference on World Wide Web, pages 1157--1166, Essex, UK, 1997. Elsevier Science Publishers Ltd."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1008992.1009131"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/775152.775226"},{"key":"e_1_3_2_1_14_1","first-page":"147","volume-title":"Proceedings of the 7th International Conference on Information Technology Based Higher Education and Training (ITHET '06)","author":"Dierderich J.","year":"2006","unstructured":"J. Dierderich . Computational methods to detect plagiarism in assessment . Proceedings of the 7th International Conference on Information Technology Based Higher Education and Training (ITHET '06) , pages 147 -- 154 , 2006 . J. Dierderich. Computational methods to detect plagiarism in assessment. Proceedings of the 7th International Conference on Information Technology Based Higher Education and Training (ITHET '06), pages 147--154, 2006."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/951953.952397"},{"key":"e_1_3_2_1_16_1","first-page":"10","volume-title":"SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 09)","author":"Grozea C.","year":"2009","unstructured":"C. Grozea , C. Gehl , and M. Popescu . ENCOPLOT: Pairwise sequences matching in linear applied to plagiarism detection. In B. Stein, P. Rosso, E. Stamatatos, M. Koppel, and E. Agirre, editors , SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 09) , pages 10 -- 18 , San Sebastian, Spain , 2009 . CEURWS.org. http:\/\/ceur-ws.org\/Vol-502. C. Grozea, C. Gehl, and M. Popescu. ENCOPLOT: Pairwise sequences matching in linear applied to plagiarism detection. In B. Stein, P. Rosso, E. Stamatatos, M. Koppel, and E. Agirre, editors, SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 09), pages 10--18, San Sebastian, Spain, 2009. CEURWS.org. http:\/\/ceur-ws.org\/Vol-502."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1835449.1835520"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1341531.1341560"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/11846406_83"},{"key":"e_1_3_2_1_20_1","volume-title":"Notebook Papers of CLEF 2010 LABs and Workshops","author":"Kasprzak J.","year":"2010","unstructured":"J. Kasprzak and M. Brandejs . Improving the reliability of the plagiarism detection system . In Notebook Papers of CLEF 2010 LABs and Workshops , University of Padova, Padova, Italy , 2010 . J. Kasprzak and M. Brandejs. Improving the reliability of the plagiarism detection system. In Notebook Papers of CLEF 2010 LABs and Workshops, University of Padova, Padova, Italy, 2010."},{"key":"e_1_3_2_1_21_1","volume-title":"CEAS","author":"Kolcz A.","year":"2007","unstructured":"A. Kolcz and A. Chowdhury . Hardening fingerprinting by context . In CEAS , 2007 . A. Kolcz and A. Chowdhury. Hardening fingerprinting by context. In CEAS, 2007."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-007-0171-z"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.v60:1"},{"key":"e_1_3_2_1_24_1","first-page":"514","article-title":"Duplicate and near duplicate documents detection: A review","volume":"32","author":"Kumar J. P.","year":"2009","unstructured":"J. P. Kumar and G. P . Duplicate and near duplicate documents detection: A review . European Journal of Scientific Research , 32 : 514 -- 527 , 2009 . J. P. Kumar and G. P. Duplicate and near duplicate documents detection: A review. European Journal of Scientific Research, 32:514--527, 2009.","journal-title":"European Journal of Scientific Research"},{"key":"e_1_3_2_1_25_1","volume-title":"Practice and Policies Conference","author":"Lyon C.","year":"2004","unstructured":"C. Lyon , R. Barrett , and J. Malcolm . A theoretical basis to the automated detection of copying between texts, and its practical implementation in the Ferret plagiarism and collusion detector. In Plagiarism: Prevention , Practice and Policies Conference , Newcastle, UK , 2004 . C. Lyon, R. Barrett, and J. Malcolm. A theoretical basis to the automated detection of copying between texts, and its practical implementation in the Ferret plagiarism and collusion detector. In Plagiarism: Prevention, Practice and Policies Conference, Newcastle, UK, 2004."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242592"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5120\/2374-3128"},{"key":"e_1_3_2_1_28_1","volume-title":"Notebook Papers of CLEF 2010 LABs and Workshops","author":"Muhr M.","year":"2010","unstructured":"M. Muhr , R. Kern , Z. M., and M. Granitzer . External and intrinsic plagiarism detection using a cross-lingual retrieval and segmentation system . In Notebook Papers of CLEF 2010 LABs and Workshops , University of Padova, Italy , 2010 . M. Muhr, R. Kern, Z. M., and M. Granitzer. External and intrinsic plagiarism detection using a cross-lingual retrieval and segmentation system. In Notebook Papers of CLEF 2010 LABs and Workshops, University of Padova, Italy, 2010."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/1947599.1947616"},{"key":"e_1_3_2_1_30_1","first-page":"309","volume-title":"ACL'11","author":"Ott M.","year":"2011","unstructured":"M. Ott , Y. Choi , C. Cardie , and J. T. Hancock . Finding deceptive opinion spam by any stretch of the imagination . ACL'11 , pages 309 -- 319 , 2011 . M. Ott, Y. Choi, C. Cardie, and J. T. Hancock. Finding deceptive opinion spam by any stretch of the imagination. ACL'11, pages 309--319, 2011."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-009-9114-z"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/872757.872770"},{"key":"e_1_3_2_1_33_1","volume-title":"Proceedings of the 2nd International Conference on Theory and Practice of Digital Libraries","author":"Shivakumar N.","year":"1995","unstructured":"N. Shivakumar and H. Garcia-Molina . SCAM: A copy detection mechanism for digital documents . Proceedings of the 2nd International Conference on Theory and Practice of Digital Libraries , 1995 . N. Shivakumar and H. Garcia-Molina. SCAM: A copy detection mechanism for digital documents. Proceedings of the 2nd International Conference on Theory and Practice of Digital Libraries, 1995."},{"key":"e_1_3_2_1_34_1","first-page":"38","volume-title":"Proceedings of the SEPLN'09 Workshop on Uncovering Plagiarism, Authorship and Social Software","author":"Stamatatos E.","year":"2009","unstructured":"E. Stamatatos . Intrinsic plagiarism detection using character n-gram profiles . Proceedings of the SEPLN'09 Workshop on Uncovering Plagiarism, Authorship and Social Software , pages 38 -- 46 , 2009 . E. Stamatatos. Intrinsic plagiarism detection using character n-gram profiles. Proceedings of the SEPLN'09 Workshop on Uncovering Plagiarism, Authorship and Social Software, pages 38--46, 2009."},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-010-9115-y"},{"key":"e_1_3_2_1_37_1","first-page":"34","volume-title":"SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 09)","author":"Balaguer E. Vall\u00e9s","year":"2009","unstructured":"E. Vall\u00e9s Balaguer . Putting ourselves in SME's shoes: Automatic detection of plagiarism by the WCopyFind tool. In B. Stein, P. Rosso, E. Stamatatos, M. Koppel, and E. Agirre, editors , SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 09) , pages 34 -- 35 , San Sebastian, Spain , 2009 . E. Vall\u00e9s Balaguer. Putting ourselves in SME's shoes: Automatic detection of plagiarism by the WCopyFind tool. In B. Stein, P. Rosso, E. Stamatatos, M. Koppel, and E. Agirre, editors, SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 09), pages 34--35, San Sebastian, Spain, 2009."}],"event":{"name":"CIKM '11: International Conference on Information and Knowledge Management","location":"Glasgow Scotland, UK","acronym":"CIKM '11","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 3rd international workshop on Search and mining user-generated contents"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2065023.2065031","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2065023.2065031","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T09:48:37Z","timestamp":1750240117000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2065023.2065031"}},"subtitle":["the SMS spam collection"],"short-title":[],"issued":{"date-parts":[[2011,10,28]]},"references-count":35,"alternative-id":["10.1145\/2065023.2065031","10.1145\/2065023"],"URL":"https:\/\/doi.org\/10.1145\/2065023.2065031","relation":{},"subject":[],"published":{"date-parts":[[2011,10,28]]},"assertion":[{"value":"2011-10-28","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}