{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,28]],"date-time":"2025-04-28T16:33:29Z","timestamp":1745858009760},"reference-count":22,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2021,12,6]],"date-time":"2021-12-06T00:00:00Z","timestamp":1638748800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,10,22]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>It has been claimed that traditional interpretations of Zeta based on graphs with bisector lines (Craig and Kinney, 2009, Shakespeare, Computers, and the Mystery of Authorship. Cambridge: Cambridge University Press.) are unsound, that validation with held-out segments is inadequate, that counting types rather than tokens is suspect, and that the relationship between stronger Zeta scores and shorter word n-grams is an artifact of the method rather than a research finding (Rizvi, 2019a, The interpretation of zeta test results. Digital Scholarship in the Humanities, 34(2): 401\u201318). All of these claims are unsound. The separation of base and counter segments in a Zeta analysis is in fact an important research finding, the traditional interpretation of Zeta results based on a bisector line remains sound, and validation with held-out segments is an appropriate method. Zeta\u2019s reliance on consistency rather than frequency is not a bug but a valuable feature that provides an important complementary method to those based on frequency. The results of extensive testing on corpora of prose, poetry, and drama show that increasing word n-gram length (but not character n-gram length) is often negatively correlated with classification accuracy. 
Higher Zeta scores are often correlated with better results, though the results vary a great deal depending on the corpus, the classification method, and the type and length of n-gram. Most importantly, these results show that Zeta gives strong results that are competitive with methods like Cosine Delta and Support Vector Machine on these classification tasks.<\/jats:p>","DOI":"10.1093\/llc\/fqab095","type":"journal-article","created":{"date-parts":[[2021,10,19]],"date-time":"2021-10-19T13:29:19Z","timestamp":1634650159000},"page":"1002-1021","source":"Crossref","is-referenced-by-count":2,"title":["Zeta revisited"],"prefix":"10.1093","volume":"37","author":[{"given":"David L","family":"Hoover","sequence":"first","affiliation":[{"name":"Department of English, New York University , New York, NY, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2021,12,6]]},"reference":[{"issue":"2","key":"2022102215083806200_fqab095-B1","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1093\/llc\/fqt028","article-title":"Language chunking, data sparseness, and the value of a long marker list: explorations with word n-grams and authorial attribution","volume":"29","author":"Antonia","year":"2014","journal-title":"Literary and Linguistic Computing"},{"issue":"1","key":"2022102215083806200_fqab095-B2","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/llc\/fqi067","article-title":"All the way through: testing for authorship in different frequency strata","volume":"22","author":"Burrows","year":"2007","journal-title":"Literary and Linguistic Computing"},{"key":"2022102215083806200_fqab095-B3","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1186\/s12864-019-6413-7","article-title":"The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation","volume":"21","author":"Chicco","year":"2020","journal-title":"BMC 
Genomics"},{"key":"2022102215083806200_fqab095-B4","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511605437","volume-title":"Shakespeare, Computers, and the Mystery of Authorship","author":"Craig","year":"2009"},{"issue":"1","key":"2022102215083806200_fqab095-B5","doi-asserted-by":"crossref","first-page":"107","DOI":"10.32614\/RJ-2016-007","article-title":"Stylometry with R: a package for computational text analysis","volume":"8","author":"Eder","year":"2016","journal-title":"R Journal"},{"key":"2022102215083806200_fqab095-B6","first-page":"139","volume-title":"The New Oxford Shakespeare Authorship Companion","author":"Elliott","year":"2017"},{"issue":"2","key":"2022102215083806200_fqab095-B7","article-title":"The end of the irrelevant text: electronic texts, linguistics, and literary theory","volume":"1","author":"Hoover","year":"2007","journal-title":"Digital Humanities Quarterly"},{"key":"2022102215083806200_fqab095-B8","doi-asserted-by":"crossref","first-page":"250","DOI":"10.1007\/978-1-137-06574-2_15","volume-title":"Language and Style: Essays in Honour of Mick Short","author":"Hoover","year":"2010"},{"key":"2022102215083806200_fqab095-B9","volume-title":"The Craig Zeta Spreadsheet, DH2010","author":"Hoover","year":"2010"},{"issue":"3","key":"2022102215083806200_fqab095-B10","doi-asserted-by":"crossref","first-page":"324","DOI":"10.1080\/0013838X.2012.668791","article-title":"The Tutor\u2019s Story: A Case Study of Mixed Authorship","volume":"93","author":"Hoover","year":"2012","journal-title":"English Studies"},{"key":"2022102215083806200_fqab095-B11","volume-title":"Literary Studies in the Digital Age: An Evolving Anthology","author":"Hoover","year":"2013"},{"key":"2022102215083806200_fqab095-B12","doi-asserted-by":"crossref","first-page":"230","DOI":"10.5749\/j.ctt1cn6thb.23","volume-title":"Debates in the Digital Humanities: 2016","author":"Hoover","year":"2016"},{"key":"2022102215083806200_fqab095-B13","volume-title":"Modes of Composition and the 
Durability of Style in Literature","author":"Hoover","year":"2021"},{"key":"2022102215083806200_fqab095-B14","author":"Hoover","year":"2021"},{"key":"2022102215083806200_fqab095-B15","doi-asserted-by":"crossref","DOI":"10.4324\/9780203698914","volume-title":"Digital Literary Studies: Corpus Approaches to Poetry, Prose, and Drama","author":"Hoover","year":"2014"},{"key":"2022102215083806200_fqab095-B16","first-page":"123","volume-title":"Doing Digital Humanities","author":"Hoover","year":"2016"},{"key":"2022102215083806200_fqab095-B17","first-page":"182","volume-title":"The New Oxford Shakespeare Authorship Companion","author":"Jackson","year":"2017"},{"key":"2022102215083806200_fqab095-B22"},{"key":"2022102215083806200_fqab095-B18","author":"Moore","year":"2004"},{"issue":"2","key":"2022102215083806200_fqab095-B19","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1093\/llc\/fqy038","article-title":"The interpretation of zeta test results","volume":"34","author":"Rizvi","year":"2019","journal-title":"Digital Scholarship in the Humanities"},{"issue":"2","key":"2022102215083806200_fqab095-B20","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1093\/llc\/fqy039","article-title":"An improvement to zeta","volume":"34","author":"Rizvi","year":"2019","journal-title":"Digital Scholarship in the Humanities"},{"key":"2022102215083806200_fqab095-B21","author":"Taylor, G. 
and Egan, G."}],"container-title":["Digital Scholarship in the Humanities"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/dsh\/article-pdf\/37\/4\/1002\/46608083\/fqab095.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/dsh\/article-pdf\/37\/4\/1002\/46608083\/fqab095.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,22]],"date-time":"2022-10-22T15:13:55Z","timestamp":1666451635000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/dsh\/article\/37\/4\/1002\/6454202"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,6]]},"references-count":22,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2021,12,6]]},"published-print":{"date-parts":[[2022,10,22]]}},"URL":"https:\/\/doi.org\/10.1093\/llc\/fqab095","relation":{},"ISSN":["2055-7671","2055-768X"],"issn-type":[{"value":"2055-7671","type":"print"},{"value":"2055-768X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,12,1]]},"published":{"date-parts":[[2021,12,6]]}}}