{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T04:40:20Z","timestamp":1760071220101,"version":"build-2065373602"},"reference-count":22,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2025,10,4]],"date-time":"2025-10-04T00:00:00Z","timestamp":1759536000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["www.mdpi.com"],"crossmark-restriction":true},"short-container-title":["Entropy"],"abstract":"<jats:p>In this paper, we apply an information-theoretic method proposed by Ryabko and Savina (hence called the RS-method), based on data compression, to recognize a writer\u2019s individual style across four languages from different language groups and families. The method was used to study fiction texts in Russian (East Slavic group of the Indo-European language family), Amharic (South Ethiosemitic group of the Semitic language family), Chinese (Sinitic group of the Sino-Tibetan language family) and English (West Germanic group of the Indo-European language family). The amount of data necessary for recognizing an author\u2019s style was found to be almost the same for all four languages, i.e., it is invariant across different language groups. 
The results obtained are of interest to computer science, literary studies, linguistics and, in particular, computational linguistics.<\/jats:p>","DOI":"10.3390\/e27101039","type":"journal-article","created":{"date-parts":[[2025,10,6]],"date-time":"2025-10-06T08:10:51Z","timestamp":1759738251000},"page":"1039","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["The Amount of Data Required to Recognize a Writer\u2019s Style Is Consistent Across Different Languages of the World"],"prefix":"10.3390","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7232-9644","authenticated-orcid":false,"given":"Boris","family":"Ryabko","sequence":"first","affiliation":[{"name":"Federal Research Center for Information and Computational Technologies, 6300090 Novosibirsk, Russia"},{"name":"Department of Information Technologies, Novosibirsk State University, 6300090 Novosibirsk, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9765-9714","authenticated-orcid":false,"given":"Nadezhda","family":"Savina","sequence":"additional","affiliation":[{"name":"Department of Information Technologies, Novosibirsk State University, 6300090 Novosibirsk, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-8054-9846","authenticated-orcid":false,"given":"Yeshewas Getachew","family":"Lulu","sequence":"additional","affiliation":[{"name":"Department of Information Technologies, Novosibirsk State University, 6300090 Novosibirsk, Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yunfei","family":"Han","sequence":"additional","affiliation":[{"name":"Department of Information Technologies, Novosibirsk State University, 6300090 Novosibirsk, 
Russia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"328","DOI":"10.2307\/2981893","article-title":"The Analysis of Literary Style\u2014A Review","volume":"148","author":"Holmes","year":"1985","journal-title":"J. R. Stat. Soc. Ser. A (Gen.)"},{"key":"ref_2","unstructured":"Ray, B. (2015). Style: An Introduction to History, Theory, Research, and Pedagogy, WAC Clearinghouse."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Aquilina, M. (2014). The Event of Style in Literature, Palgrave Macmillan.","DOI":"10.1057\/9781137426925"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1023\/B:CHUM.0000009225.28847.77","article-title":"Change of writing style with time","volume":"38","author":"Can","year":"2004","journal-title":"Comput. Humanit."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"378","DOI":"10.1002\/asi.20316","article-title":"A framework for authorship identification of online messages: Writing-style features and classification techniques","volume":"57","author":"Zheng","year":"2006","journal-title":"J. Am. Soc. Inf. Sci. Technol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"967","DOI":"10.4236\/ojml.2021.116075","article-title":"An Empirical Investigation of Authorial Writing Styles Based on a Vietnamese Corpus","volume":"11","author":"Nguyen","year":"2021","journal-title":"Open J. Mod. Linguist."},{"key":"ref_7","first-page":"3","article-title":"Three approaches to quantitative definition of information","volume":"1","author":"Kolmogorov","year":"1965","journal-title":"Probl. Inf. Transm."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Ryabko, B., Astola, J., and Malyutov, M. (2016). 
Compression-Based Methods of Statistical Analysis and Prediction of Time Series, Springer.","DOI":"10.1007\/978-3-319-32253-7"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Ryabko, B. (2017, June 25\u201330). Using data-compressors for statistical analysis of problems on homogeneity testing and classification. Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany.","DOI":"10.1109\/ISIT.2017.8006502"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Ryabko, B., and Savina, N. (2021). Using Data Compression to Build a Method for Statistically Verified Attribution of Literary Texts. Entropy, 23.","DOI":"10.3390\/e23101302"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Ryabko, B., and Savina, N. (2022). Information-Theoretical Method for Assessing the Quality of Translations. Entropy, 24.","DOI":"10.3390\/e24121739"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1523","DOI":"10.1109\/TIT.2005.844059","article-title":"Clustering by compression","volume":"51","author":"Cilibrasi","year":"2005","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1162\/0148926042728449","article-title":"Algorithmic clustering of music based on string compression","volume":"28","author":"Cilibrasi","year":"2004","journal-title":"Comput. Music"},{"key":"ref_14","first-page":"83","article-title":"Using compression-based language models for text categorization","volume":"Volume 13","author":"Teahan","year":"2003","journal-title":"Language Modeling for Information Retrieval"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"375","DOI":"10.1162\/089120100561746","article-title":"Using compression models to segment Chinese text","volume":"26","author":"Teahan","year":"2000","journal-title":"Comput. Linguist."},{"key":"ref_16","unstructured":"Kendall, M., and Stuart, A. (1961). Inference and relationship. 
The Advanced Theory of Statistics, Hafner Publisher."},{"key":"ref_17","unstructured":"Pavlov, I. (2025, August 24). 7-Zip Compression Utility. Available online: https:\/\/www.7-zip.org\/."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Yong, H., and Peng, J. (2008). Chinese Lexicography: A History from 1046 BC to AD 1911, OUP Oxford.","DOI":"10.1093\/oso\/9780199539826.001.0001"},{"key":"ref_19","unstructured":"Norman, J. (1988). Chinese, Cambridge University Press."},{"key":"ref_20","unstructured":"Hilary, M. (2015). Diversity in Sinitic Languages, Oxford University Press."},{"key":"ref_21","unstructured":"Hartmann, J. (1980). Amharische Grammatik, Steiner. \u00c4thiopische Forschungen."},{"key":"ref_22","first-page":"117","article-title":"Amharic as lingua franca in Ethiopia","volume":"20","author":"Meyer","year":"2006","journal-title":"Lissan J. Afr. Lang. Linguist."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/27\/10\/1039\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T04:17:16Z","timestamp":1760069836000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/27\/10\/1039"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,4]]},"references-count":22,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2025,10]]}},"alternative-id":["e27101039"],"URL":"https:\/\/doi.org\/10.3390\/e27101039","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2025,10,4]]}}}