{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T01:43:37Z","timestamp":1773107017257,"version":"3.50.1"},"reference-count":47,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2013,6,1]],"date-time":"2013-06-01T00:00:00Z","timestamp":1370044800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2013,6]]},"abstract":"<jats:p>To paraphrase means to rewrite content while preserving the original meaning. Paraphrasing is important in fields such as text reuse in journalism, anonymizing work, and improving the quality of customer-written reviews. This article contributes to paraphrase acquisition and focuses on two aspects that are not addressed by current research: (1) acquisition via crowdsourcing, and (2) acquisition of passage-level samples. The challenge of the first aspect is automatic quality assurance; without such a means the crowdsourcing paradigm is not effective, and without crowdsourcing the creation of test corpora is unacceptably expensive for realistic order of magnitudes. The second aspect addresses the deficit that most of the previous work in generating and evaluating paraphrases has been conducted using sentence-level paraphrases or shorter; these short-sample analyses are limited in terms of application to plagiarism detection, for example. We present the Webis Crowd Paraphrase Corpus 2011 (Webis-CPC-11), which recently formed part of the PAN 2010 international plagiarism detection competition. This corpus comprises passage-level paraphrases with 4067 positive samples and 3792 negative samples that failed our criteria, using Amazon's Mechanical Turk for crowdsourcing. 
In this article, we review the lessons learned at PAN 2010, and explain in detail the method used to construct the corpus. The empirical contributions include machine learning experiments to explore whether passage-level paraphrases can be identified in a two-class classification problem using paraphrase similarity features, and we find that a k-nearest-neighbor classifier can correctly distinguish between paraphrased and nonparaphrased samples with 0.980 precision at 0.523 recall. This result implies that just under half of our samples must be discarded (the remaining 0.477 fraction), but our cost analysis shows that the automation we introduce results in an 18% financial saving and over 100 hours of time returned to the researchers when repeating a similar corpus design. On the other hand, when building an unrelated corpus requiring, say, 25% training data for the automated component, we show that the financial outcome is cost neutral, while still returning over 70 hours of time to the researchers. 
The work presented here is the first to join the paraphrasing and plagiarism communities.<\/jats:p>","DOI":"10.1145\/2483669.2483676","type":"journal-article","created":{"date-parts":[[2013,7,1]],"date-time":"2013-07-01T12:27:28Z","timestamp":1372681648000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":40,"title":["Paraphrase acquisition via crowdsourcing and machine learning"],"prefix":"10.1145","volume":"4","author":[{"given":"Steven","family":"Burrows","sequence":"first","affiliation":[{"name":"Bauhaus-Universit\u00e4t Weimar, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Martin","family":"Potthast","sequence":"additional","affiliation":[{"name":"Bauhaus-Universit\u00e4t Weimar, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Benno","family":"Stein","sequence":"additional","affiliation":[{"name":"Bauhaus-Universit\u00e4t Weimar, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2013,7]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 1st SIGIR Workshop on the Future of IR Evaluation, S. Geva, J. Kamps, C. Peters, T. Sakai, A. Trotman, and E. Voorhees, Eds., IR Publications","author":"Alonso O.","unstructured":"Alonso, O. and Mizzaro, S. 2009. Can we get rid of trec assessors? Using mechanical turk for relevance assessment. In Proceedings of the 1st SIGIR Workshop on the Future of IR Evaluation, S. Geva, J. Kamps, C. Peters, T. Sakai, A. Trotman, and E. 
Voorhees, Eds., IR Publications, Boston, MA, 15--16."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 17th Conference on International Language Resources and Evaluation (LREC'10)","author":"Ambati V.","unstructured":"Ambati, V., Vogel, S., and Carbonell, J. 2010. Active learning and crowd-sourcing for machine translation. In Proceedings of the 17th Conference on International Language Resources and Evaluation (LREC'10), N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, M. Rosner, and D. Tapias, Eds., European Language Resources Association, 2169--2174."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/1892211.1892215"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1142055.1142067"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073445.1073448"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1866029.1866078"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.3115\/1219079.1219089"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1177\/1354856507084420"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5210\/fm.v13i6.2159"},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the 3rd International Workshop on Paraphrasing, M. Dras and K. Yamamoto, Eds., 1--8.","author":"Brockett C.","unstructured":"Brockett, C. and Dolan, W. B. 2005. 
Support vector machines for paraphrase identification and corpus construction. In Proceedings of the 3rd International Workshop on Paraphrasing, M. Dras and K. Yamamoto, Eds., 1--8."},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 1st NAACL HLT Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, C. Callison-Burch and M. Dredze, Eds., Association for Computational Linguistics, 217--221","author":"Buzek O.","unstructured":"Buzek, O., Resnik, P., and Bederson, B. 2010. Error driven paraphrase annotation using mechanical turk. In Proceedings of the 1st NAACL HLT Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, C. Callison-Burch and M. Dredze, Eds., Association for Computational Linguistics, 217--221."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/1699510.1699548"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 1st NAACL HLT Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, C. Callison-Burch and M. Dredze, Eds., Association for Computational Linguistics, 1--12","author":"Callison-Burch C.","unstructured":"Callison-Burch, C. and Dredze, M. 2010. Creating speech and language data with amazon's mechanical turk. In Proceedings of the 1st NAACL HLT Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, C. 
Callison-Burch and M. Dredze, Eds., Association for Computational Linguistics, 1--12."},{"key":"e_1_2_1_14_1","unstructured":"Chen D. and Dolan W. 2011. Collecting highly parallel data for paraphrase evaluation. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies D. Lin Y. Matsumoto and R. Mihalcea Eds. Association for Computational Linguistics 190--200."},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Chevelu J. Lavergne T. Lepage Y. and Moudenc T. 2009. Introduction of a new paraphrase generation tool based on monte-carlo sampling. In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the AFNLP K.-Y. Su J. Su and J. Wiebe Eds. 
The Association for Computer Linguistics 249--252.","DOI":"10.3115\/1667583.1667660"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073110"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1162\/coli.08-003-R1-07-044"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCGI.2007.4"},{"key":"e_1_2_1_19_1","first-page":"12","article-title":"New functions for unsupervised asymmetrical paraphrase detection","volume":"2","author":"Cordeiro J.","year":"2007","unstructured":"Cordeiro, J., Dias, G., and Brazdil, P. 2007b. New functions for unsupervised asymmetrical paraphrase detection. J. Softw. 2, 4, 12--23.","journal-title":"J. Softw."},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 1st ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment, B. Dolan and I. Dagan, Eds., Association for Computational Linguistics, 13--18","author":"Corley C.","unstructured":"Corley, C. and Mihalcea, R. 2005. Measuring the semantic similarity of texts. In Proceedings of the 1st ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment, B. Dolan and I. Dagan, Eds., Association for Computational Linguistics, 13--18."},{"key":"e_1_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Das D. and Smith N. A. 2009. Paraphrase identification as probabilistic quasi-synchronous recognition. In Proceedings of the Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP K.-Y. Su J. Su and J. Wiebe Eds. 
Association for Computational Linguistics 468--476.","DOI":"10.3115\/1687878.1687944"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 1st NAACL HLT Workshop on Creating Speech and Language Data With Amazon's Mechanical Turk, C. Callison-Burch and M. Dredze, Eds., Association for Computational Linguistics, 66--70","author":"Denkowski M.","unstructured":"Denkowski, M., Al-Haj, H., and Lavie, A. 2010. Turker-assisted paraphrasing for english-arabic machine translation. In Proceedings of the 1st NAACL HLT Workshop on Creating Speech and Language Data With Amazon's Mechanical Turk, C. Callison-Burch and M. Dredze, Eds., Association for Computational Linguistics, 66--70."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 3rd International Workshop on Paraphrasing, M. Dras and K. Yamamoto, Eds., 1--8.","author":"Dolan W. B.","unstructured":"Dolan, W. B. and Brockett, C. 2005. Automatically constructing a corpus of sentential paraphrases. In Proceedings of the 3rd International Workshop on Paraphrasing, M. Dras and K. 
Yamamoto, Eds., 1--8."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the Northumbria Conference\u2014Educating for the Future. 1--6.","author":"Dordoy A.","year":"2002","unstructured":"Dordoy, A. 2002. Cheating and plagiarism: Student and staff perceptions at Northumbria. In Proceedings of the Northumbria Conference\u2014Educating for the Future. 1--6."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 11th Annual Research Colloquium of the UK Special-Interest Group for Computational Linguistics. 1--7.","author":"Fernando S.","unstructured":"Fernando, S. and Stevenson, M. 2008. A semantic approach to paraphrase identification. In Proceedings of the 11th Annual Research Colloquium of the UK Special-Interest Group for Computational Linguistics. 1--7."},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","unstructured":"Franklyn-Stokes A. and Newstead S. E. 1995. Undergraduate cheating: Who does what and why? Studies Higher Educ. 20 2 159--172.","DOI":"10.1080\/03075079512331381673"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 12th International Conference on Computers in Education, E. McKay, Ed., 1275--1284","author":"Hamilton M.","unstructured":"
Hamilton, M., Tahaghoghi, S. M. M., and Walker, C. 2004. Educating students about plagiarism avoidance\u2014A computer science perspective. In Proceedings of the 12th International Conference on Computers in Education, E. McKay, Ed., 1275--1284."},{"key":"e_1_2_1_29_1","volume-title":"Weka: A machine learning workbench. In Proceedings of the 2nd Australia and New Zealand Conference on Intelligent Information Systems","author":"Holmes G.","year":"1994","unstructured":"Holmes, G., Donkin, A., and Witten, I. 1994. Weka: A machine learning workbench. In Proceedings of the 2nd Australia and New Zealand Conference on Intelligent Information Systems, J. Sitte, Ed., IEEE Computer Society Press, 357--361."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1357054.1357127"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(02)00222-9"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/11816508_52"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.2753\/MIS0742-1222260108"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1162\/coli_a_00002"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.5555\/1667884.1667889"},{"key":"e_1_2_1_36_1","first-page":"1050","article-title":"Plagiarism\u2014A survey","volume":"12","author":"Maurer H.","year":"2006","unstructured":"Maurer, H., Kappe, F., and Zaka, B. 2006. Plagiarism\u2014A survey. J. Univer. Comput. Sci. 12, 8, 1050--1084.","journal-title":"J. Univer. Comput. 
Sci."},{"key":"e_1_2_1_37_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.21913\/IJEI.v1i1.14","article-title":"Cheating among college and university students: A North American perspective","volume":"1","author":"Mccabe D. L.","year":"2005","unstructured":"Mccabe, D. L. 2005. Cheating among college and university students: A North American perspective. Int. J. Educ. Integr. 1, 1, 1--11.","journal-title":"Int. J. Educ. Integr."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the Demonstration Papers at HLT-NAACL, D. Palmer, J. Polifroni, and D. Roy, Eds., Association for Computational Linguistics, 38--41","author":"Pedersen T.","unstructured":"Pedersen, T., Patwardhan, S., and Michelizzi, J. 2004. WordNet: Similarity: Measuring the relatedness of concepts. In Proceedings of the Demonstration Papers at HLT-NAACL, D. Palmer, J. Polifroni, and D. Roy, Eds., Association for Computational Linguistics, 38--41."},{"key":"e_1_2_1_40_1","volume-title":"Notebook Papers of CLEF 10 Labs and Workshops, M. Braschler and D. Harman, Eds.","author":"Potthast M.","unstructured":"Potthast, M., Barr\u00f3n-Cede\u00f1o, A., Eiselt, A., Stein, B., and Rosso, P. 2010a. 
Overview of the 2nd international competition on plagiarism detection. In Notebook Papers of CLEF 10 Labs and Workshops, M. Braschler and D. Harman, Eds."},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 23rd International Conference on Computational Linguistics (COLING'10)","author":"Potthast M.","unstructured":"Potthast, M., Stein, B., Barr\u00f3n-Cede\u00f1o, A., and Rosso, P. 2010b. An evaluation framework for plagiarism detection. In Proceedings of the 23rd International Conference on Computational Linguistics (COLING'10). C.-R. Huang and D. Jurafsky, Eds., Association for Computational Linguistics, 997--1005."},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the 11th Conference on Empirical Methods in Natural Language Processing, D. Jurafsky and E. Gaussier, Eds., Association for Computational Linguistics, 18--26","author":"Qiu L.","unstructured":"Qiu, L., Kan, M.-Y., and Chua, T.-S. 2006. Paraphrase recognition via dissimilarity significance classification. In Proceedings of the 11th Conference on Empirical Methods in Natural Language Processing, D. Jurafsky and E. 
Gaussier, Eds., Association for Computational Linguistics, 18--26."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/1978942.1979148"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/544414.544468"},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the 13th Conference on Empirical Methods in Natural Language Processing, M. Lapata and H. T. Ng, Eds., Association for Computational Linguistics, 254--263","author":"Snow R.","unstructured":"Snow, R., O'Connor, B., Jurafsky, D., and Ng, A. Y. 2008. Cheap and fast\u2014But is it good?: Evaluating non-expert annotations for natural language tasks. In Proceedings of the 13th Conference on Empirical Methods in Natural Language Processing, M. Lapata and H. T. Ng, Eds., Association for Computational Linguistics, 254--263."},{"key":"e_1_2_1_46_1","volume-title":"Probability Estimation Trees in Weka?","author":"Su J.","year":"2007","unstructured":"Su, J. 2007. Probability Estimation Trees in Weka? Wekalist mailing list. https:\/\/list.scms.waikato.ac.nz\/pipermail\/wekalist\/2007-September\/011343.html."},{"key":"e_1_2_1_48_1","volume-title":"Proceedings of the 4th Australasian Language Technology Workshop, L. Cavedon and I. Zukerman, Eds., Australasian Language Technology Association, 131--138","author":"Wan S.","unstructured":"Wan, S., Dras, M., Dale, R., and Paris, C. 2006. 
Using dependency-based features to take the \u201cpara-farce\u201d out of paraphrase. In Proceedings of the 4th Australasian Language Technology Workshop, L. Cavedon and I. Zukerman, Eds., Australasian Language Technology Association, 131--138."},{"key":"e_1_2_1_49_1","doi-asserted-by":"crossref","unstructured":"Zhao S. Lan X. Liu T. and Li S. 2009. Application-driven statistical paraphrase generation. In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the AFNLP K.-Y. Su J. Su and J. Wiebe Eds. 
Association for Computer Linguistics 834--842.","DOI":"10.3115\/1690219.1690263"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2483669.2483676","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2483669.2483676","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T20:14:36Z","timestamp":1750277676000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2483669.2483676"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,6]]},"references-count":47,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2013,6]]}},"alternative-id":["10.1145\/2483669.2483676"],"URL":"https:\/\/doi.org\/10.1145\/2483669.2483676","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,6]]},"assertion":[{"value":"2011-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-07-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}