{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T02:35:45Z","timestamp":1777430145564,"version":"3.51.4"},"reference-count":49,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2024,11,26]],"date-time":"2024-11-26T00:00:00Z","timestamp":1732579200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Computation of Literary Style"},{"DOI":"10.13039\/501100004835","name":"Zhejiang University","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004835","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>This study utilized machine learning algorithms and entropy-based features to identify translators of two English translations of Hongloumeng, a great classical Chinese novel written in the mid-18th century. The translations under examination were completed, respectively, by David Hawkes and the Yangs (Yang Hsien-yi and Gladys Yang). Two feature sets were extracted as input for the identification of translator styles: wordform features (wordform unigrams, bigrams, and trigrams) and part-of-speech (POS) features (POS unigrams, bigrams, and trigrams). Additionally, four machine learning classifiers were tested: linear support vector machines (SVMs), linear discriminant analysis (LDA), random forest (RF), and multilayer perceptron (MLP). Analysis of feature importance and SHAP value identified the most influential features within each classifier. Results showed that LDA achieved the best performance, with 81 per cent accuracy in distinguishing between translations, showing promise for translator identification. In contrast, MLP struggled to reliably differentiate between translations, achieving only 50 per cent accuracy. Furthermore, POS features had the greatest influence in SVM and LDA, while wordform features dominated in RF. SHAP analysis revealed that Hawkes\u2019 translation tended to exhibit higher POS unigram and lower POS trigram entropy compared to the Yangs\u2019. This increased contribution of POS unigrams and trigrams suggests a link to explicitation differences in translation. In summary, the combination of machine learning and entropy-based stylometric features shows potential for automatic translator identification and analysis.<\/jats:p>","DOI":"10.1093\/llc\/fqae074","type":"journal-article","created":{"date-parts":[[2024,11,26]],"date-time":"2024-11-26T16:07:56Z","timestamp":1732637276000},"page":"138-150","source":"Crossref","is-referenced-by-count":3,"title":["Translator attribution of <i>Hongloumeng<\/i>: using entropy-based features and machining learning algorithm"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-7843-825X","authenticated-orcid":false,"given":"Ruitao","family":"Hu","sequence":"first","affiliation":[{"name":"School of International Studies, Zhejiang University , Hangzhou, Zhejiang, 310058,","place":["P.R. China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gui","family":"Wang","sequence":"additional","affiliation":[{"name":"School of International Studies, Zhejiang University , Hangzhou, Zhejiang, 310058,","place":["P.R. China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-2270-3258","authenticated-orcid":false,"given":"Bin","family":"Shao","sequence":"additional","affiliation":[{"name":"School of International Studies, Zhejiang University , Hangzhou, Zhejiang, 310058,","place":["P.R. China"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2024,11,26]]},"reference":[{"key":"2025040220160838600_fqae074-B1","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1080\/09296174.2011.533592","article-title":"Entropy-Based Assessment of Written Albanian Language","volume":"18","author":"alZahir","year":"2011","journal-title":"Journal of Quantitative Linguistics"},{"key":"2025040220160838600_fqae074-B2","author":"Anthony","year":"2022"},{"key":"2025040220160838600_fqae074-B3","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1075\/target.7.2.03bak","article-title":"Corpora in Translation Studies: An Overview and Some Suggestions for Future Research","volume":"7","author":"Baker","year":"1995","journal-title":"Target. International Journal of Translation Studies"},{"key":"2025040220160838600_fqae074-B4","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1075\/target.12.2.04bak","article-title":"Towards a Methodology for Investigating the Style of a Literary Translator","volume":"12","author":"Baker","year":"2000","journal-title":"Target. International Journal of Translation Studies"},{"key":"2025040220160838600_fqae074-B5","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1093\/llc\/fqi039","article-title":"A New Approach to the Study of Translationese: Machine-Learning the Difference between Original and Translated Text","volume":"21","author":"Baroni","year":"2006","journal-title":"Literary and Linguistic Computing"},{"key":"2025040220160838600_fqae074-B6","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1016\/j.isprsjprs.2016.01.011","article-title":"Random Forest in Remote Sensing: A Review of Applications and Future Directions","volume":"114","author":"Belgiu","year":"2016","journal-title":"ISPRS Journal of Photogrammetry and Remote Sensing"},{"key":"2025040220160838600_fqae074-B7","doi-asserted-by":"crossref","first-page":"275","DOI":"10.3390\/e19060275","article-title":"The Entropy of Words\u2014Learnability and Expressivity across More than 1000 Languages\u2019,","volume":"19","author":"Bentz","year":"2017","journal-title":"Entropy"},{"key":"2025040220160838600_fqae074-B8","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1007\/s11749-016-0481-7","article-title":"A Random Forest Guided Tour","volume":"25","author":"Biau","year":"2016","journal-title":"TEST"},{"key":"2025040220160838600_fqae074-B9","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1007\/978-3-319-02300-7_4","volume-title":"Support Vector Machines Applications","author":"Biggio","year":"2014"},{"key":"2025040220160838600_fqae074-B10","volume-title":"Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit","author":"Bird","year":"2009"},{"key":"2025040220160838600_fqae074-B11","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random Forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Machine Learning"},{"key":"2025040220160838600_fqae074-B12","doi-asserted-by":"crossref","first-page":"1418","DOI":"10.1093\/llc\/fqad061","article-title":"Who Could Be behind QAnon? Authorship Attribution with Supervised Machine-Learning","volume":"38","author":"Cafiero","year":"2023","journal-title":"Digital Scholarship in the Humanities"},{"key":"2025040220160838600_fqae074-B13","volume-title":"Hong Lou Meng","author":"Cao","year":"2008"},{"key":"2025040220160838600_fqae074-B14","first-page":"170","article-title":"Fast and Accurate Text Classification via Multiple Linear Discriminant Projections","volume":"12","author":"Chakrabarti","year":"2003","journal-title":"The VLDB Journal The International Journal on Very Large Data Bases"},{"key":"2025040220160838600_fqae074-B15","first-page":"528","article-title":"Entropy in Different Text Types","volume":"32","author":"Chen","year":"2016","journal-title":"Digital Scholarship in the Humanities"},{"key":"2025040220160838600_fqae074-B16","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-Vector Networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Machine Learning"},{"key":"2025040220160838600_fqae074-B17","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1093\/llc\/fqac024","article-title":"Seeing Various Adventures through a Mirror: Detecting Translator\u2019s Stylistic Visibility in Chinese Translations of Alice\u2019s Adventure in Wonderland","volume":"38","author":"Fang","year":"2023","journal-title":"Digital Scholarship in the Humanities"},{"key":"2025040220160838600_fqae074-B18","first-page":"83","article-title":"Research on Logical Explicitation in the English Translation of Chinese Novels: Exemplified by Adverbial Clauses Introduced by \u201cbecause\u201d in D. Hawkes\u2019 Translation of Hong Lou Meng","volume":"37","author":"Feng","year":"2016","journal-title":"Shandong Foreign Language Teaching"},{"key":"2025040220160838600_fqae074-B19","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1080\/09296174.2013.830551","article-title":"Authorship Attribution Using Entropy","volume":"20","author":"Grabchak","year":"2013","journal-title":"Journal of Quantitative Linguistics"},{"key":"2025040220160838600_fqae074-B20","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-84858-7","volume-title":"The Elements of Statistical Learning","author":"Hastie","year":"2009"},{"key":"2025040220160838600_fqae074-B151","volume-title":"The Story of the Stone. The Golden Days","author":"Hawkes","year":"1979"},{"key":"2025040220160838600_fqae074-B152","volume-title":"The Story of the Stone, The Crab-Flower Club","author":"Hawkes","year":"1979"},{"key":"2025040220160838600_fqae074-B153","volume-title":"The Story of the Stone. The Warning Voice","author":"Hawkes","year":"1981"},{"key":"2025040220160838600_fqae074-B7304139","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1017\/S1351324920000182","article-title":"Investigating Translated Chinese and its Variants Using Machine Learning\u2019","volume":"27","author":"Hu","year":"2021","journal-title":"Natural Language Engineering"},{"key":"2025040220160838600_fqae074-B22","first-page":"111","article-title":"Performance Analysis of Various Activation Functions in Generalized MLP Architectures of Neural Networks","volume":"1","author":"Karlik","year":"2011","journal-title":"International Journal of Artificial Intelligence and Expert Systems"},{"key":"2025040220160838600_fqae074-B23","author":"Kyle","year":"2016"},{"key":"2025040220160838600_fqae074-B24","doi-asserted-by":"crossref","first-page":"557","DOI":"10.7202\/003425ar","article-title":"Core Patterns of Lexical Use in a Comparable Corpus of English Narrative Prose","volume":"43","author":"Laviosa","year":"2002","journal-title":"Meta"},{"key":"2025040220160838600_fqae074-B25","doi-asserted-by":"crossref","first-page":"813","DOI":"10.1093\/llc\/fqab091","article-title":"How Do Machine Translators Measure up to Human Literary Translators in Stylometric Tests?\u2019,","volume":"37","author":"Lee","year":"2022","journal-title":"Digital Scholarship in the Humanities"},{"key":"2025040220160838600_fqae074-B26","doi-asserted-by":"crossref","first-page":"e0265633","DOI":"10.1371\/journal.pone.0265633","article-title":"Entropy-Based Discrimination between Translated Chinese and Original Chinese Using Data Mining Techniques","volume":"17","author":"Liu","year":"2022","journal-title":"PLOS ONE"},{"key":"2025040220160838600_fqae074-B27","doi-asserted-by":"crossref","first-page":"e0253454","DOI":"10.1371\/journal.pone.0253454","article-title":"Syntactic Complexity in Translated and Non-Translated Texts: A Corpus-Based Study of Simplification\u2019","volume":"16","author":"Liu","year":"2021","journal-title":"PLOS ONE"},{"key":"2025040220160838600_fqae074-B28","doi-asserted-by":"crossref","first-page":"103364","DOI":"10.1016\/j.lingua.2022.103364","article-title":"Simplification in Translated Chinese: An Entropy-Based Approach","volume":"275","author":"Liu","year":"2022","journal-title":"Lingua"},{"key":"2025040220160838600_fqae074-B29","author":"Lundberg","year":"2016"},{"key":"2025040220160838600_fqae074-B30","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1016\/j.csl.2018.05.002","article-title":"The Translator\u2019s Visibility: Detecting Translatorial Fingerprints in Contemporaneous Parallel Translations","volume":"52","author":"Lynch","year":"2018","journal-title":"Computer Speech & Language"},{"key":"2025040220160838600_fqae074-B31","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1093\/llc\/fqx046","article-title":"Distributed Language Representation for Authorship Attribution","volume":"33","author":"Kocher","year":"2018","journal-title":"Digital Scholarship in the Humanities"},{"key":"2025040220160838600_fqae074-B32","author":"Kurokawa","year":"2009"},{"key":"2025040220160838600_fqae074-B33","doi-asserted-by":"publisher","first-page":"1370","DOI":"10.2991\/iemss-17.2017.252","author":"Ma","year":"2017"},{"key":"2025040220160838600_fqae074-B34","doi-asserted-by":"crossref","first-page":"658","DOI":"10.1093\/llc\/fqac054","article-title":"Translator Attribution for Arabic Using Machine Learning","volume":"38","author":"Mohamed","year":"2023","journal-title":"Digital Scholarship in the Humanities"},{"key":"2025040220160838600_fqae074-B35","doi-asserted-by":"crossref","first-page":"830","DOI":"10.1093\/llc\/fqab092","article-title":"Linguistic Features Evaluation for Hadith Authenticity through Automatic Machine Learning","volume":"37","author":"Mohamed","year":"2022","journal-title":"Digital Scholarship in the Humanities"},{"key":"2025040220160838600_fqae074-B36","doi-asserted-by":"crossref","first-page":"1083","DOI":"10.1016\/j.patcog.2007.07.022","article-title":"A Comparison of Generalized Linear Discriminant Analysis Algorithms","volume":"41","author":"Park","year":"2008","journal-title":"Pattern Recognition"},{"key":"2025040220160838600_fqae074-B37","first-page":"2825","article-title":"Scikit-Learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"Journal of Machine Learning Research"},{"key":"2025040220160838600_fqae074-B38","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1017\/S0962492900002919","article-title":"Approximation Theory of the MLP Model in Neural Networks","volume":"8","author":"Pinkus","year":"1999","journal-title":"Acta Numerica"},{"key":"2025040220160838600_fqae074-B39","first-page":"1","article-title":"Functional Random Forest with Applications in Dose-Response Predictions\u2019,","volume":"9","author":"Rahman","year":"2019","journal-title":"Scientific Reports"},{"key":"2025040220160838600_fqae074-B40","author":"Scott","year":"2022"},{"key":"2025040220160838600_fqae074-B41","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1002\/j.1538-7305.1951.tb01366.x","article-title":"Prediction and Entropy of Printed English","volume":"30","author":"Shannon","year":"1951","journal-title":"Bell System Technical Journal"},{"key":"2025040220160838600_fqae074-B42","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1093\/llc\/fqr039","article-title":"Looking for Translator\u2019s Fingerprints: A Corpus-Based Study on Chinese Translations of Ulysses","volume":"27","author":"Wang","year":"2012","journal-title":"Literary and Linguistic Computing"},{"key":"2025040220160838600_fqae074-B43","volume-title":"A Dream of Red Mansions","author":"Yang","year":"2010","edition":"6th printing"},{"key":"2025040220160838600_fqae074-B44","first-page":"85","article-title":"Is This English Translation of Hong Lou Meng by Joly Himself? \u2014 A Corpus-based Investigation of Translator Style\u2019,","volume":"11","author":"Zhang","year":"2014","journal-title":"Foreign Languages in China"},{"key":"2025040220160838600_fqae074-B45","first-page":"92","volume-title":"Information Retrieval Technology. Lecture Notes in Computer Science","author":"Zhao","year":"2006"},{"key":"2025040220160838600_fqae074-B46","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1080\/09296174.2017.1348014","article-title":"British Cultural Complexity: An Entropy-Based Approach","volume":"25","author":"Zhu","year":"2018","journal-title":"Journal of Quantitative Linguistics"}],"container-title":["Digital Scholarship in the Humanities"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/dsh\/article-pdf\/40\/1\/138\/60819433\/fqae074.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/dsh\/article-pdf\/40\/1\/138\/60819433\/fqae074.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,2]],"date-time":"2025-04-02T20:48:12Z","timestamp":1743626892000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/dsh\/article\/40\/1\/138\/7908958"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,26]]},"references-count":49,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,11,26]]},"published-print":{"date-parts":[[2025,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/llc\/fqae074","relation":{},"ISSN":["2055-7671","2055-768X"],"issn-type":[{"value":"2055-7671","type":"print"},{"value":"2055-768X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,4]]},"published":{"date-parts":[[2024,11,26]]}}}