{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,21]],"date-time":"2025-12-21T06:26:00Z","timestamp":1766298360403,"version":"build-2065373602"},"reference-count":44,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2024,8,29]],"date-time":"2024-08-29T00:00:00Z","timestamp":1724889600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100005739","name":"UNAM-PAPIIT","doi-asserted-by":"publisher","award":["IN107919","IG101421","IG101324","IV100120","285754"],"award-info":[{"award-number":["IN107919","IG101421","IG101324","IV100120","285754"]}],"id":[{"id":"10.13039\/501100005739","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003141","name":"CONACyT","doi-asserted-by":"publisher","award":["IN107919","IG101421","IG101324","IV100120","285754"],"award-info":[{"award-number":["IN107919","IG101421","IG101324","IV100120","285754"]}],"id":[{"id":"10.13039\/501100003141","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>In recent decades, the field of statistical linguistics has made significant strides, which have been fueled by the availability of data. Leveraging Twitter data, this paper explores the English and Spanish languages, investigating their rank diversity across different scales: temporal intervals (ranging from 3 to 96 h), spatial radii (spanning 3 km to over 3000 km), and grammatical word ngrams (ranging from 1-grams to 5-grams). The analysis focuses on word ngrams, examining a time period of 1 year (2014) and eight different countries. Our findings highlight the relevance of all three scales with the most substantial changes observed at the grammatical level. Specifically, at the monogram level, rank diversity curves exhibit remarkable similarity across languages, countries, and temporal or spatial scales. However, as the grammatical scale expands, variations in rank diversity become more pronounced and influenced by temporal, spatial, linguistic, and national factors. Additionally, we investigate the statistical characteristics of Twitter-specific tokens, including emojis, hashtags, and user mentions, revealing a sigmoid pattern in their rank diversity function. These insights contribute to quantifying universal language statistics while also identifying potential sources of variation.<\/jats:p>","DOI":"10.3390\/e26090734","type":"journal-article","created":{"date-parts":[[2024,8,29]],"date-time":"2024-08-29T03:40:39Z","timestamp":1724902839000},"page":"734","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Language Statistics at Different Spatial, Temporal, and Grammatical Scales"],"prefix":"10.3390","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8410-1113","authenticated-orcid":false,"given":"Fernanda","family":"S\u00e1nchez-Puig","sequence":"first","affiliation":[{"name":"Facultad de Ciencias, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"},{"name":"Centro de Ciencias de la Complejidad, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"},{"name":"Instituto de Fisica Interdisciplinar y Sistemas Complejos, Universidad de las Islas Baleares, 07122 Palma de Mallorca, Spain"}]},{"given":"Rogelio","family":"Lozano-Aranda","sequence":"additional","affiliation":[{"name":"Facultad de Ciencias, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"},{"name":"Centro de Ciencias de la Complejidad, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"}]},{"given":"Dante","family":"P\u00e9rez-M\u00e9ndez","sequence":"additional","affiliation":[{"name":"Centro de Ciencias de la Complejidad, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"},{"name":"Posgrado en Ciencias de la Computaci\u00f3n, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2551-8589","authenticated-orcid":false,"given":"Ewan","family":"Colman","sequence":"additional","affiliation":[{"name":"Centro de Ciencias de la Complejidad, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"},{"name":"Roslin Institute, University of Edinburgh, Midlothian EH8 9YL, UK"}]},{"given":"Alfredo J.","family":"Morales-Guzm\u00e1n","sequence":"additional","affiliation":[{"name":"MIT Media Lab, Cambridge, MA 02139, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3507-1821","authenticated-orcid":false,"given":"Pedro Juan","family":"Rivera Torres","sequence":"additional","affiliation":[{"name":"Centro de Ciencias de la Complejidad, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7306-0894","authenticated-orcid":false,"given":"Carlos","family":"Pineda","sequence":"additional","affiliation":[{"name":"Instituto de F\u00edsica, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0193-3067","authenticated-orcid":false,"given":"Carlos","family":"Gershenson","sequence":"additional","affiliation":[{"name":"Centro de Ciencias de la Complejidad, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"},{"name":"School of Systems Science and Industrial Engineering, Binghamton University, Binghamton, NY 13902, USA"},{"name":"Instituto de Investigaciones en Matem\u00e1ticas Aplicadas y Sistemas, Universidad Nacional Aut\u00f3noma de M\u00e9xico, Mexico City 04510, Mexico"},{"name":"Santa Fe Institute, Santa Fe, NM 87501, USA"}]}],"member":"1968","published-online":{"date-parts":[[2024,8,29]]},"reference":[{"key":"ref_1","unstructured":"Zipf, G.K. (1932). Selective Studies and the Principle of Relative Frequency in Language, Harvard University Press."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1016\/S0019-9958(67)90201-X","article-title":"A \u201cLaw\u201d of occurrences for words of low frequency","volume":"10","author":"Booth","year":"1967","journal-title":"Inf. Control."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"567","DOI":"10.1016\/S0378-4371(01)00355-7","article-title":"Beyond the Zipf\u2013Mandelbrot law in quantitative linguistics","volume":"300","author":"Montemurro","year":"2001","journal-title":"Phys. A Stat. Mech. Its Appl."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1080\/00107510500052444","article-title":"Power laws, Pareto distributions and Zipf\u2019s law","volume":"46","author":"Newman","year":"2005","journal-title":"Contemp. Phys."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"043004","DOI":"10.1088\/1367-2630\/13\/4\/043004","article-title":"Zipf\u2019s law unzipped","volume":"13","author":"Baek","year":"2011","journal-title":"New J. Phys."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"036115","DOI":"10.1103\/PhysRevE.83.036115","article-title":"Emergence of Zipf\u2019s law in the evolution of communication","volume":"83","author":"Fortuny","year":"2011","journal-title":"Phys. Rev. E"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1142\/S0219525902000468","article-title":"Zipf\u2019s Law and Random Texts","volume":"5","year":"2002","journal-title":"Adv. Complex Syst."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"17290","DOI":"10.1073\/pnas.1113716108","article-title":"The origin and evolution of word order","volume":"108","author":"Ruhlen","year":"2011","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"5241","DOI":"10.1073\/pnas.0608222104","article-title":"Innateness and culture in the evolution of language","volume":"104","author":"Kirby","year":"2007","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Steels, L. (2012). Experiments in Cultural Language Evolution, John Benjamins Publishing Company. Advances in Interaction Studies.","DOI":"10.1075\/ais.3"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1203002","DOI":"10.1142\/S0219525912030026","article-title":"Language dynamics","volume":"15","author":"Baronchelli","year":"2012","journal-title":"Adv. Complex Syst."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"3323","DOI":"10.1098\/rsif.2012.0491","article-title":"Evolution of the most common English words and phrases over the centuries","volume":"9","author":"Perc","year":"2012","journal-title":"J. R. Soc. Interface"},{"key":"ref_13","first-page":"021006","article-title":"Stochastic Model for the Vocabulary Growth in Natural Languages","volume":"3","author":"Gerlach","year":"2013","journal-title":"Phys. Rev. X"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"eabe6534","DOI":"10.1126\/sciadv.abe6534","article-title":"Storywrangler: A massive exploratorium for sociolinguistic, cultural, socioeconomic, and political timelines using Twitter","volume":"7","author":"Alshaabi","year":"2021","journal-title":"Sci. Adv."},{"key":"ref_15","unstructured":"Almodaresi, F., Ungar, L., Kulkarni, V., Zakeri, M., Giorgi, S., and Schwartz, H.A. On the Distribution of Lexical Features at Multiple Levels of Analysis. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1126\/science.1199644","article-title":"Quantitative Analysis of Culture Using Millions of Digitized Books","volume":"331","author":"Michel","year":"2011","journal-title":"Science"},{"key":"ref_17","unstructured":"Rau, M.D. (2024, August 18). Language Identification by Statistical Analysis. Available online: https:\/\/apps.dtic.mil\/sti\/tr\/pdf\/ADA003518.pdf."},{"key":"ref_18","unstructured":"Bollen, J., Pepe, A., and Mao, H. (2011, January 17\u201321). Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. Proceedings of the ICWSM11, Barcelona, Spain."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Dodds, P.S., Harris, K.D., Kloumann, I.M., Bliss, C.A., and Danforth, C.M. (2011). Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter. PLoS ONE, 6.","DOI":"10.1371\/journal.pone.0026752"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.socnet.2014.03.007","article-title":"Efficiency of human activity on information spreading on Twitter","volume":"39","author":"Morales","year":"2014","journal-title":"Soc. Netw."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"20161048","DOI":"10.1098\/rsif.2016.1048","article-title":"Global patterns of synchronization in human communications","volume":"14","author":"Morales","year":"2017","journal-title":"J. R. Soc. Interface"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"590","DOI":"10.1038\/s41586-021-03344-2","article-title":"Shifting attention to accuracy can reduce misinformation online","volume":"592","author":"Pennycook","year":"2021","journal-title":"Nature"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"033114","DOI":"10.1063\/1.4913758","article-title":"Measuring political polarization: Twitter shows the two sides of Venezuela","volume":"25","author":"Morales","year":"2015","journal-title":"Chaos Interdiscip. J. Nonlinear Sci."},{"key":"ref_24","unstructured":"Hong, L., Convertino, G., and Chi, E.H. (2011, January 17\u201321). Language Matters in Twitter: A Large Scale Study. Proceedings of the fifth International AAAI Conference on Weblogs and Social Media, Barcelona, Spain."},{"key":"ref_25","unstructured":"Weerkamp, W., Carter, S., and Tsagkias, M. (2011). How People Use Twitter in Different Languages, ACM."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"190573","DOI":"10.1098\/rsos.190573","article-title":"Segregation and polarization in urban areas","volume":"6","author":"Morales","year":"2019","journal-title":"R. Soc. Open Sci."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Cui, H., and Kert\u00e9sz, J. (2023). Competition for popularity and interventions on a Chinese microblogging site. PLoS ONE, 18.","DOI":"10.1371\/journal.pone.0286093"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Cocho, G., Flores, J., Gershenson, C., Pineda, C., and S\u00e1nchez, S. (2015). Rank Diversity of Languages: Generic Behavior in Computational Linguistics. PLoS ONE, 10.","DOI":"10.1371\/journal.pone.0121898"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Morales, J.A., Colman, E., S\u00e1nchez, S., S\u00e1nchez-Puig, F., Pineda, C., I\u00f1iguez, G., Cocho, G., Flores, J., and Gershenson, C. (2018). Rank Dynamics of Word Usage at Multiple Scales. Front. Phys., 6.","DOI":"10.3389\/fphy.2018.00045"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"121795","DOI":"10.1016\/j.physa.2019.121795","article-title":"Rank-frequency distribution of natural languages: A difference of probabilities approach","volume":"532","author":"Cocho","year":"2019","journal-title":"Phys. A Stat. Mech. Its Appl."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.physrep.2023.12.002","article-title":"Complex systems approach to natural language","volume":"1053","author":"Stanisz","year":"2024","journal-title":"Phys. Rep."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Song, F., and Croft, W.B. (1999). A general language model for information retrieval. Proceedings of the Eighth International Conference on Information and Knowledge Management, Association for Computing Machinery. CIKM\u201999.","DOI":"10.1145\/319950.320022"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1140\/epjds\/s13688-016-0096-y","article-title":"Generic temporal features of performance rankings in sports and games","volume":"5","author":"Morales","year":"2016","journal-title":"EPJ Data Sci."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1646","DOI":"10.1038\/s41467-022-29256-x","article-title":"Dynamics of ranking","volume":"13","author":"Pineda","year":"2022","journal-title":"Nat. Commun."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1140\/epjb\/e2005-00121-8","article-title":"The variation of Zipf\u2019s law in human language","volume":"44","year":"2005","journal-title":"Eur. Phys. J. B"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Evans, D.R., and Larsen-Freeman, D. (2020). Bifurcations and the Emergence of L2 Syntactic Structures in a Complex Dynamic System. Front. Psychol., 11.","DOI":"10.3389\/fpsyg.2020.574603"},{"key":"ref_37","unstructured":"Rubin, E.J., and Gess, R. (2005). Theoretical and Experimental Approaches to Romance Linguistics, John Benjamins Publishing Company. Current Issues in Linguistic Theory."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Ljube\u0161i\u0107, N., and Fi\u0161er, D. (2016, January 7\u201312). A global analysis of emoji usage. Proceedings of the 10th Web as Corpus Workshop, Berlin, Germany.","DOI":"10.18653\/v1\/W16-2610"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Seargeant, P. (2019). The Emoji Revolution: How Technology Is Shaping the Future of Communication, Cambridge University Press.","DOI":"10.1017\/9781108677387"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"274","DOI":"10.1080\/10350330.2014.996948","article-title":"Searchable talk: The linguistic functions of hashtags","volume":"25","author":"Zappavigna","year":"2015","journal-title":"Soc. Semiot."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Shuai, X., Pepe, A., and Bollen, J. (2012). How the Scientific Community Reacts to Newly Submitted Preprints: Article Downloads, Twitter Mentions, and Citations. PLoS ONE, 7.","DOI":"10.1371\/journal.pone.0047523"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1080\/19331681.2017.1338634","article-title":"Tweeting to the target: Candidates\u2019 use of strategic messages and @mentions on Twitter","volume":"15","author":"Hemsley","year":"2018","journal-title":"J. Inf. Technol. Politics"},{"key":"ref_43","unstructured":"Auxier, B., and Anderson, M. (2024, August 22). Social Media Use in 2021. Pew Res. Center, Available online: https:\/\/pewrsr.ch\/3cYWjHA."},{"key":"ref_44","unstructured":"Wei, J., Tay, Y., Bommasani, R., Raffel, C., Zoph, B., Borgeaud, S., Yogatama, D., Bosma, M., Zhou, D., and Metzler, D. (2022). Emergent abilities of large language models. arXiv."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/26\/9\/734\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:44:42Z","timestamp":1760111082000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/26\/9\/734"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,29]]},"references-count":44,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2024,9]]}},"alternative-id":["e26090734"],"URL":"https:\/\/doi.org\/10.3390\/e26090734","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2024,8,29]]}}}