{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,13]],"date-time":"2025-12-13T07:04:53Z","timestamp":1765609493672},"reference-count":49,"publisher":"Cambridge University Press (CUP)","issue":"1","license":[{"start":{"date-parts":[[2011,11,21]],"date-time":"2011-11-21T00:00:00Z","timestamp":1321833600000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2013,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Authorship Analysis aims to extract information about the authorship of documents from features within those documents. Typically, this is performed as a classification task with the aim of identifying the author of a document, given a set of documents of known authorship. Alternatively, unsupervised methods have been developed primarily as visualisation tools to assist the manual discovery of clusters of authorship within a corpus by analysts. However, there is a need in many fields for more sophisticated unsupervised methods to automate the discovery, profiling and organisation of related information through clustering of documents by authorship. An automated and unsupervised methodology for clustering documents by authorship is proposed in this paper. The methodology is named NUANCE, for <jats:italic>n<\/jats:italic>-gram Unsupervised Automated Natural Cluster Ensemble. 
Testing indicates that the derived clusters have a strong correlation to the true authorship of unseen documents.<\/jats:p>","DOI":"10.1017\/s1351324911000313","type":"journal-article","created":{"date-parts":[[2011,11,21]],"date-time":"2011-11-21T10:01:50Z","timestamp":1321869710000},"page":"95-120","source":"Crossref","is-referenced-by-count":26,"title":["Automated unsupervised authorship analysis using evidence accumulation clustering"],"prefix":"10.1017","volume":"19","author":[{"given":"ROBERT","family":"LAYTON","sequence":"first","affiliation":[]},{"given":"PAUL","family":"WATTERS","sequence":"additional","affiliation":[]},{"given":"RICHARD","family":"DAZELEY","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2011,11,21]]},"reference":[{"key":"S1351324911000313_ref44","doi-asserted-by":"publisher","DOI":"10.1145\/1326561.1326564"},{"key":"S1351324911000313_ref47","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2005.845141"},{"key":"S1351324911000313_ref40","doi-asserted-by":"publisher","DOI":"10.2307\/1217208"},{"key":"S1351324911000313_ref39","doi-asserted-by":"publisher","DOI":"10.1016\/0377-0427(87)90125-7"},{"key":"S1351324911000313_ref35","doi-asserted-by":"publisher","DOI":"10.1080\/15567280601142178"},{"key":"S1351324911000313_ref33","first-page":"2409","volume-title":"Proceedings of the International Conference on Image Processing (ICIP)","author":"Parag","year":"2009"},{"key":"S1351324911000313_ref32","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1145\/988672.988678","volume-title":"Proceedings of the 13th International Conference on World Wide Web","author":"Novak","year":"2004"},{"key":"S1351324911000313_ref49","first-page":"59","volume-title":"Lecture Notes in Computer Science","author":"Zheng","year":"2003"},{"key":"S1351324911000313_ref28","first-page":"149","article-title":"Forensic characteristics of phishing \u2013 petty theft or organized 
crime?","volume":"1","author":"McCombie","year":"2008","journal-title":"WEBIST"},{"key":"S1351324911000313_ref43","first-page":"60","volume-title":"Proceedings of the Cybercrime and Trustworthy Computing Workshop","author":"Turville","year":"2010"},{"key":"S1351324911000313_ref14","doi-asserted-by":"publisher","DOI":"10.2307\/2982671"},{"key":"S1351324911000313_ref41","doi-asserted-by":"publisher","DOI":"10.1002\/asi.21001"},{"key":"S1351324911000313_ref26","doi-asserted-by":"publisher","DOI":"10.1145\/1121949.1121951"},{"key":"S1351324911000313_ref25","article-title":"Recentred local profiles for authorship attribution","author":"Layton","year":"2011","journal-title":"Journal of Natural Language Engineering"},{"key":"S1351324911000313_ref38","first-page":"410","volume-title":"Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)","author":"Rosenberg","year":"2007"},{"key":"S1351324911000313_ref21","unstructured":"Ke\u0161elj V. , Peng F. , Cercone N. , and Thomas C. 2003. N-gram-based author profiles for authorship attribution. In Proceedings of the Pacific Association for Computational Linguistics, pp. 
255\u2013264."},{"key":"S1351324911000313_ref19","volume-title":"Authorship Attribution","author":"Juola","year":"2008"},{"key":"S1351324911000313_ref18","first-page":"175","volume-title":"Proceedings of 2004 Joint International Conference of the Association for Literary and Linguistic Computing and the Association for Computers and the Humanities (ALLC\/ACH 2004)","author":"Juola","year":"2004"},{"key":"S1351324911000313_ref15","doi-asserted-by":"publisher","DOI":"10.1007\/BF01830689"},{"key":"S1351324911000313_ref46","doi-asserted-by":"publisher","DOI":"10.1108\/13685201111098860"},{"key":"S1351324911000313_ref12","first-page":"450","volume-title":"Proceedings of International Symposium on Distributed Computing and Applications to Business, Engineering and Science","author":"Gao","year":"2010"},{"key":"S1351324911000313_ref20","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1145\/1455770.1455774","volume-title":"Proceedings of the 15th ACM Conference on Computer and Communications Security","author":"Kanich","year":"2008"},{"key":"S1351324911000313_ref7","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4419-1325-8_1"},{"key":"S1351324911000313_ref36","unstructured":"Raghavan S. , Kovashka A. and Mooney R. 2010. Authorship attribution using probabilistic context-free grammars. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL-2010), Association for Computational Linguistics, pp. 
38\u201342."},{"key":"S1351324911000313_ref23","first-page":"1","volume-title":"2010 Second Cybercrime and Trustworthy Computing Workshop","author":"Layton","year":"2010"},{"key":"S1351324911000313_ref4","doi-asserted-by":"publisher","DOI":"10.1145\/1461928.1461959"},{"key":"S1351324911000313_ref37","volume-title":"Information Retrieval","author":"Rijsbergen","year":"1979"},{"key":"S1351324911000313_ref30","first-page":"1","volume-title":"Proceedings of the IEEE 2nd Annual eCrime Researchers Summit (eCrime '07)","author":"Moore","year":"2007"},{"key":"S1351324911000313_ref27","doi-asserted-by":"publisher","DOI":"10.1093\/llc\/fqq013"},{"key":"S1351324911000313_ref16","doi-asserted-by":"publisher","DOI":"10.1002\/0471725250"},{"key":"S1351324911000313_ref11","first-page":"303","volume-title":"Structural, Syntactic, and Statistical Pattern Recognition","author":"Fred","year":"2002"},{"key":"S1351324911000313_ref31","first-page":"275","article-title":"Inference in an authorship problem","volume":"58","author":"Mosteller","year":"1963","journal-title":"Journal of the American Statistical Association"},{"key":"S1351324911000313_ref5","doi-asserted-by":"publisher","DOI":"10.1109\/UIC-ATC.2009.63"},{"key":"S1351324911000313_ref1","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2005.81"},{"key":"S1351324911000313_ref3","first-page":"52","volume-title":"Proceedings of the Cybercrime and Trustworthy Computing Workshop","author":"Alazab","year":"2010"},{"key":"S1351324911000313_ref6","volume-title":"Proceedings of the Text REtrieval Conference (TREC-3)","author":"Cavnar","year":"1994"},{"key":"S1351324911000313_ref29","first-page":"295","article-title":"Mining online diaries for blogger identification","volume":"1","author":"Mohtasseb","year":"2009","journal-title":"Proceedings of the World Congress on Engineering"},{"key":"S1351324911000313_ref8","unstructured":"Cohen D. and Narayanaswamy K. 2004. Survey\/analysis of Levels I, II, and III attack attribution techniques. 
Technical Report, Cs3 Inc, Memphis, TN, USA."},{"key":"S1351324911000313_ref45","unstructured":"Vlachos A. , Korhonen A. and Ghahramani Z. 2009. Unsupervised and constrained Dirichlet process mixture models for verb clustering. In Proceedings of the Workshop on Geometrical Models of Natural Language Semantics, Association for Computational Linguistics, pp. 74\u201382."},{"key":"S1351324911000313_ref10","article-title":"Identifying authorship by byte-level n-grams: The source code author profile (SCAP) method","volume":"6","author":"Frantzeskou","year":"2007","journal-title":"International Journal of Digital Evidence"},{"key":"S1351324911000313_ref2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1361684.1361685","article-title":"Writeprints: a stylometric approach to identity-level identification and similarity detection in cyberspace","volume":"26","author":"Abbasi","year":"2008","journal-title":"ACM Transactions on Information Systems"},{"key":"S1351324911000313_ref9","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-14980-1_37"},{"key":"S1351324911000313_ref13","first-page":"2070","article-title":"A survey: clustering ensembles techniques","volume":"38","author":"Ghaemi","year":"2009","journal-title":"Proceedings of World Academy of Science, Engineering and Technology"},{"key":"S1351324911000313_ref48","doi-asserted-by":"publisher","DOI":"10.1002\/asi.20316"},{"key":"S1351324911000313_ref17","doi-asserted-by":"publisher","DOI":"10.1016\/j.diin.2010.03.003"},{"key":"S1351324911000313_ref34","unstructured":"Project Gutenberg Organisation. 2011. Project Gutenberg. http:\/\/www.gutenberg.org\/"},{"key":"S1351324911000313_ref22","unstructured":"Koppel M. and Schler J. 2004. Authorship verification as a one-class classification problem. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML '04), pp. 62\u201368. 
ISBN 1-58113-838-5."},{"key":"S1351324911000313_ref42","first-page":"525","volume-title":"Proceedings of KDD Workshop on Text Mining","author":"Steinbach","year":"2000"},{"key":"S1351324911000313_ref24","first-page":"1","volume-title":"eCrime Researchers Summit (eCrime), 2010","author":"Layton","year":"2011"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324911000313","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,12,17]],"date-time":"2021-12-17T11:20:56Z","timestamp":1639740056000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324911000313\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,11,21]]},"references-count":49,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2013,1]]}},"alternative-id":["S1351324911000313"],"URL":"https:\/\/doi.org\/10.1017\/s1351324911000313","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,11,21]]}}}