{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,19]],"date-time":"2026-02-19T14:12:18Z","timestamp":1771510338912,"version":"3.50.1"},"reference-count":39,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2024,6,26]],"date-time":"2024-06-26T00:00:00Z","timestamp":1719360000000},"content-version":"vor","delay-in-days":177,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100007251","name":"National Research University Higher School of Economics","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100007251","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Complexity"],"published-print":{"date-parts":[[2024,1]]},"abstract":"<jats:p>The present paper introduces a novel object of study, a language fractal structure; we hypothesize that a set of embeddings of all <jats:italic>n<\/jats:italic>\u2010grams of a natural language constitutes a representative sample of this fractal set. (We use the term <jats:italic>Hailonakea<\/jats:italic> to refer to the sum total of all language fractal structures, over all <jats:italic>n<\/jats:italic>). The paper estimates intrinsic (genuine) dimensions of language fractal structures for the Russian and English languages. To this end, we employ methods based on (1) topological data analysis and (2) a minimum spanning tree of a data graph for a cloud of points considered (Steele theorem). 
For both languages, for all <jats:italic>n<\/jats:italic>, the intrinsic dimensions appear to be noninteger values (typical for fractal sets), close to 9 for both the Russian and English languages.<\/jats:p>","DOI":"10.1155\/2024\/8863360","type":"journal-article","created":{"date-parts":[[2024,6,26]],"date-time":"2024-06-26T13:00:45Z","timestamp":1719406845000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["A Language and Its Dimensions: Intrinsic Dimensions of Language Fractal Structures"],"prefix":"10.1155","volume":"2024","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5891-6597","authenticated-orcid":false,"given":"Vasilii A.","family":"Gromov","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7102-4443","authenticated-orcid":false,"given":"Nikita S.","family":"Borodin","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0009-0007-7119-4665","authenticated-orcid":false,"given":"Asel S.","family":"Yerbolova","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2024,6,26]]},"reference":[{"key":"e_1_2_14_1_2","doi-asserted-by":"publisher","DOI":"10.1155\/2017\/9212538"},{"key":"e_1_2_14_2_2","doi-asserted-by":"crossref","DOI":"10.1201\/9781003272649","volume-title":"Graph Learning and Network Science for Natural Language Processing","author":"Garg M.","year":"2022"},{"key":"e_1_2_14_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.physa.2018.08.002"},{"key":"e_1_2_14_4_2","doi-asserted-by":"publisher","DOI":"10.1093\/comnet\/cny018"},{"key":"e_1_2_14_5_2","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511755071","volume-title":"The Germanic Languages","author":"Harbert W.","year":"2006"},{"key":"e_1_2_14_6_2","volume-title":"Language Typology and Syntactic Description","author":"Shopen 
T.","year":"2007"},{"key":"e_1_2_14_7_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/9781119598732","volume-title":"A Companion to Chomsky","author":"Allott N.","year":"2021"},{"key":"e_1_2_14_8_2","unstructured":"D\u0119bowski \u0141. A simplistic model of neural scaling laws: multiperiodic Santa Fe processes https:\/\/arxiv.org\/pdf\/2302.09049 https:\/\/doi.org\/10.48550\/arXiv.2302.09049."},{"key":"e_1_2_14_9_2","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-030-59377-3","volume-title":"Statistical Universals of Language","author":"Tanaka-Ishii K.","year":"2021"},{"key":"e_1_2_14_10_2","doi-asserted-by":"publisher","DOI":"10.2139\/ssrn.4457882"},{"key":"e_1_2_14_11_2","doi-asserted-by":"crossref","unstructured":"Pestov V. Intrinsic dimension of a dataset: what properties does one expect? Proceedings of the International Joint Conference on Neural Networks August 2007 Orlando FL USA IEEE 2959\u20132964 https:\/\/doi.org\/10.1109\/IJCNN.2007.4371431 2-s2.0-51749113343.","DOI":"10.1109\/IJCNN.2007.4371431"},{"key":"e_1_2_14_12_2","volume-title":"Metric Structures for Riemannian and Non-Riemannian Spaces","author":"Gromov M.","year":"2007"},{"key":"e_1_2_14_13_2","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511755798","volume-title":"Nonlinear Time Series Analysis","author":"Kantz H.","year":"2003"},{"key":"e_1_2_14_14_2","volume-title":"[Current Problems in Nonlinear Dynamics] Sovremennye Problemy Nelineinoi Dinamiki","author":"Malinetsky G. G.","year":"2000"},{"key":"e_1_2_14_15_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jmva.2012.12.007"},{"key":"e_1_2_14_16_2","first-page":"1","volume-title":"Topological Data Analysis. Abel Symposia","author":"Adams H.","year":"2020"},{"key":"e_1_2_14_17_2","unstructured":"Schweinhart B. 
Fractal dimension and the persistent homology of random geometric complexes https:\/\/arxiv.org\/abs\/1808.02196 https:\/\/doi.org\/10.48550\/arXiv.1808.02196."},{"key":"e_1_2_14_18_2","doi-asserted-by":"publisher","DOI":"10.1088\/0004-637X\/694\/1\/151"},{"key":"e_1_2_14_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/0960-0779(95)00091-7"},{"key":"e_1_2_14_20_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2012.01.171"},{"key":"e_1_2_14_21_2","first-page":"417","volume-title":"13th Workshop on Statistical Signal Processing","author":"Costa J. A.","year":"2005"},{"key":"e_1_2_14_22_2","first-page":"265","volume-title":"Proceedings of the 24th International Conference on Machine Learning, Corvallis","author":"Farahmand M.","year":"2007"},{"key":"e_1_2_14_23_2","unstructured":"Tulchinskii E., Kuznetsov K., Kushnareva L., Cherniavskii D., Barannikov S., Piontkovskaya I., Sergey N., and Evgeny B. Intrinsic dimension estimation for robust detection of AI-generated texts https:\/\/arxiv.org\/abs\/2306.04723 https:\/\/doi.org\/10.48550\/arXiv.2306.04723."},{"key":"e_1_2_14_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/tpami.2023.3245886"},{"key":"e_1_2_14_25_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0285630"},{"key":"e_1_2_14_26_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF01457179"},{"key":"e_1_2_14_27_2","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-031-02556-3","volume-title":"Latent Semantic Mapping: Principles and Applications","author":"Bellegarda J. R.","year":"2007"},{"key":"e_1_2_14_28_2","doi-asserted-by":"publisher","DOI":"10.1080\/07468342.1996.11973744"},{"key":"e_1_2_14_29_2","doi-asserted-by":"publisher","DOI":"10.1137\/0702016"},{"key":"e_1_2_14_30_2","unstructured":"Mikolov T., Chen K., Corrado G., and Dean J. 
Efficient estimation of word representations in vector space https:\/\/arxiv.org\/abs\/1301.3781 https:\/\/doi.org\/10.48550\/arXiv.1301.3781."},{"key":"e_1_2_14_31_2","doi-asserted-by":"publisher","DOI":"10.1214\/aop\/1176991596"},{"key":"e_1_2_14_32_2","doi-asserted-by":"publisher","DOI":"10.1063\/1.532489"},{"key":"e_1_2_14_33_2","volume-title":"10th International Conference on Pattern Recognition and Machine Intelligence","author":"Kuznetsov S. O.","year":"2023"},{"key":"e_1_2_14_34_2","first-page":"33","volume-title":"Lecture Notes in Computer Science","author":"Kuznetsov S. O.","year":"2009"},{"key":"e_1_2_14_35_2","volume-title":"Topological Spaces","author":"\u010cech E.","year":"1966"},{"key":"e_1_2_14_36_2","unstructured":"Gromov V. A., Borodin N. S., and Yerbolova A. S. A language and its dimensions: intrinsic dimensions of language fractal structures https:\/\/arxiv.org\/pdf\/2311.10217 https:\/\/doi.org\/10.48550\/arXiv.2311.10217."},{"key":"e_1_2_14_37_2","unstructured":"Devlin J., Chang M.-V., Lee K., and Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding https:\/\/arxiv.org\/abs\/1810.04805 https:\/\/doi.org\/10.48550\/arXiv.1810.04805."},{"key":"e_1_2_14_38_2","unstructured":"Brown T. B., Mann B., Ryder N., Subbiah M., Kaplan J., and Dhariwal P. 
Language models are few-shot learners https:\/\/arxiv.org\/abs\/2005.14165 https:\/\/doi.org\/10.48550\/arXiv.2005.14165."},{"key":"e_1_2_14_39_2","doi-asserted-by":"publisher","DOI":"10.2307\/3214207"}],"container-title":["Complexity"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2024\/8863360","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,26]],"date-time":"2024-06-26T13:00:59Z","timestamp":1719406859000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2024\/8863360"}},"subtitle":[],"editor":[{"given":"Hiroki","family":"Sayama","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,1]]},"references-count":39,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,1]]}},"alternative-id":["10.1155\/2024\/8863360"],"URL":"https:\/\/doi.org\/10.1155\/2024\/8863360","archive":["Portico"],"relation":{},"ISSN":["1076-2787","1099-0526"],"issn-type":[{"value":"1076-2787","type":"print"},{"value":"1099-0526","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,1]]},"assertion":[{"value":"2023-11-17","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-04-25","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-06-26","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"8863360"}}