{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T17:37:11Z","timestamp":1780594631600,"version":"3.54.1"},"reference-count":189,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T00:00:00Z","timestamp":1736726400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000855","name":"University of Birmingham","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000855","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Artif. Intell."],"abstract":"<jats:p>In this article, we introduce a sociolinguistic perspective on language modeling. We claim that language models in general are inherently modeling <jats:italic>varieties of language<\/jats:italic>, and we consider how this insight can inform the development and deployment of language models. We begin by presenting a technical definition of the concept of a variety of language as developed in sociolinguistics. We then discuss how this perspective could help us better understand five basic challenges in language modeling: <jats:italic>social bias, domain adaptation, alignment, language change<\/jats:italic>, and <jats:italic>scale<\/jats:italic>. 
We argue that to maximize the performance and societal value of language models it is important to carefully compile training corpora that accurately represent the specific varieties of language being modeled, drawing on theories, methods, and descriptions from the field of sociolinguistics.<\/jats:p>","DOI":"10.3389\/frai.2024.1472411","type":"journal-article","created":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T06:13:24Z","timestamp":1736748804000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":24,"title":["The sociolinguistic foundations of language modeling"],"prefix":"10.3389","volume":"7","author":[{"given":"Jack","family":"Grieve","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sara","family":"Bartl","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Matteo","family":"Fuoli","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jason","family":"Grafmiller","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Weihang","family":"Huang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Alejandro","family":"Jawerbaum","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Akira","family":"Murakami","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Marcus","family":"Perlman","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Dana","family":"Roemling","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bodo","family":"Winter","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1965","published-on
line":{"date-parts":[[2025,1,13]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2303.08774","article-title":"Gpt-4 technical report","author":"Achiam","year":"2023","journal-title":"arXiv"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2410.16168","article-title":"Exploring pretraining via active forgetting for improving cross lingual transfer for decoder language models","author":"Aggarwal","year":"2024","journal-title":"arXiv"},{"key":"B3","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1017\/S0266078400001292","article-title":"Is scots a language?","volume":"1","author":"Aitken","year":"1985","journal-title":"English Today"},{"key":"B4","doi-asserted-by":"crossref","first-page":"2199","DOI":"10.1145\/3630106.3659033","article-title":"\u201cA critical analysis of the largest source for generative ai training data: Common crawl,\u201d","volume-title":"The 2024 ACM Conference on Fairness, Accountability, and Transparency","author":"Baack","year":"2024"},{"key":"B5","doi-asserted-by":"publisher","first-page":"e2311878121","DOI":"10.1073\/pnas.2311878121","article-title":"Explaining neural scaling laws","volume":"121","author":"Bahri","year":"2024","journal-title":"Proc. Nat. Acad. Sci"},{"key":"B6","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2402.03927","article-title":"Leak, cheat, repeat: Data contamination and evaluation malpractices in closed-source LLMs","author":"Balloccu","year":"2024","journal-title":"arXiv"},{"key":"B7","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1111\/josl.12080","article-title":"Gender identity and lexical variation in social media","volume":"18","author":"Bamman","year":"2014","journal-title":"J. Sociolinguist"},{"key":"B8","doi-asserted-by":"publisher","first-page":"399","DOI":"10.1111\/josl.12199","article-title":"Labov in sociolinguistics: an introduction","volume":"20","author":"Bell","year":"2016","journal-title":"J. 
Sociolinguist"},{"key":"B9","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1145\/3442188.3445922","article-title":"\u201cOn the dangers of stochastic parrots: can language models be too big?,\u201d","volume-title":"Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency","author":"Bender","year":"2021"},{"key":"B10","doi-asserted-by":"publisher","first-page":"1137","DOI":"10.1162\/153244303322533223","article-title":"A neural probabilistic language model","volume":"3","author":"Bengio","year":"2003","journal-title":"J. Mach. Learn. Res"},{"key":"B11","doi-asserted-by":"publisher","first-page":"125","DOI":"10.1075\/ijcl.15026.ber","article-title":"Dimensions of variation across internet registers","volume":"23","author":"Berber Sardinha","year":"2018","journal-title":"Int. J. Corpus Linguist"},{"key":"B12","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2402.01376","article-title":"LoTR: low tensor rank weight adaptation","author":"Bershatsky","year":"2024","journal-title":"arXiv"},{"key":"B13","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/978-3-031-56072-9_1","article-title":"\u201cOverview of pan 2024: Multi-author writing style analysis, multilingual text detoxification, oppositional thinking analysis, and generative ai authorship verification,\u201d","author":"Bevendorff","year":"2024","journal-title":"Advances in Information Retrieval"},{"key":"B14","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2305.04812","article-title":"A drop of ink makes a million think: the spread of false information in large language models","author":"Bian","year":"2023","journal-title":"arXiv"},{"key":"B15","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1515\/ling.1989.27.1.3","article-title":"A typology of english texts","volume":"27","author":"Biber","year":"1989","journal-title":"Linguistics"},{"key":"B16","volume-title":"Variation Across Speech and 
Writing","author":"Biber","year":"1991"},{"key":"B17","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1093\/llc\/8.4.243","article-title":"Representativeness in corpus design","volume":"8","author":"Biber","year":"1993","journal-title":"Literary Linguist. Comp"},{"key":"B18","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511519871","author":"Biber","year":"1995","journal-title":"Dimensions of Register Variation: A Cross-Linguistic Comparison"},{"key":"B19","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1002\/9780470753460.ch10","article-title":"\u201cRegister variation: a corpus approach,\u201d","author":"Biber","year":"2005","journal-title":"The Handbook of Discourse Analysis"},{"key":"B20","doi-asserted-by":"crossref","DOI":"10.1017\/9781108686136","volume-title":"Register, Genre, and Style","author":"Biber","year":"2019"},{"key":"B21","doi-asserted-by":"crossref","DOI":"10.1017\/9781316388228","volume-title":"Register Variation Online","author":"Biber","year":"2018"},{"key":"B22","doi-asserted-by":"crossref","DOI":"10.4324\/9781003087991","volume-title":"The Register-Functional Approach to Grammatical Complexity: Theoretical Foundation, Descriptive Research Findings, Application","author":"Biber","year":"2021"},{"key":"B23","doi-asserted-by":"publisher","first-page":"277","DOI":"10.1038\/s42254-023-00581-4","article-title":"Science in the age of large language models","volume":"5","author":"Birhane","year":"2023","journal-title":"Nat. Rev. 
Phys"},{"key":"B24","volume-title":"Sociolinguistically Driven Approaches for Just natural language Processing","author":"Blodgett","year":"2021"},{"key":"B25","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.acl-main.485","article-title":"\u201cLanguage (Technology) is Power: A Critical Survey of \u201cBias\u201d","author":"Blodgett","year":"2020"},{"key":"B26","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1120","article-title":"Demographic dialectal variation in social media: a case study of african-american english","author":"Blodgett","year":"2016","journal-title":"arXiv"},{"key":"B27","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1707.00061","article-title":"Racial disparity in natural language processing: a case study of social media african-american english","author":"Blodgett","year":"2017","journal-title":"arXiv"},{"key":"B28","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2108.07258","article-title":"On the opportunities and risks of foundation models","author":"Bommasani","year":"2021","journal-title":"arXiv"},{"key":"B29","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-3002","article-title":"Identifying and reducing gender bias in word-level language models","author":"Bordia","year":"2019","journal-title":"arXiv"},{"key":"B30","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2005.14165","article-title":"Language models are few-shot learners","author":"Brown","year":"2020","journal-title":"arXiv"},{"key":"B31","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781139096768","volume-title":"Language Change","author":"Bybee","year":"2015"},{"key":"B32","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-34960-7_22","article-title":"\u201cEthical dilemmas, mental health, artificial intelligence, and llm-based chatbots,\u201d","author":"Cabrera","year":"2023","journal-title":"Bioinformatics and Biomedical Engineering, IWBBIO 2023. 
Lecture Notes in Computer Science, vol 13920"},{"key":"B33","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1126\/science.aal4230","article-title":"Semantics derived automatically from language corpora contain human-like biases","volume":"356","author":"Caliskan","year":"2017","journal-title":"Science"},{"key":"B34","volume-title":"Historical Linguistics","author":"Campbell","year":"2013"},{"key":"B35","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511805103","volume-title":"Dialectology","author":"Chambers","year":"1998"},{"key":"B36","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.findings-acl.927","article-title":"From representational harms to quality-of-service harms: a case study on llama 2 safety safeguards","author":"Chehbouni","year":"2024","journal-title":"arXiv"},{"key":"B37","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2310.14735","article-title":"Unleashing the potential of prompt engineering in large language models: a comprehensive review","author":"Chen","year":"2023","journal-title":"arXiv"},{"key":"B38","first-page":"1","article-title":"\u201cPreparedLLM: effective pre-pretraining framework for domain-specific large language models,\u201d","volume-title":"Big Earth Data","author":"Chen","year":"2024"},{"key":"B39","author":"Christian","year":"2021","journal-title":"The Alignment Problem: How Can Machines Learn Human Values"},{"key":"B40","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1177\/09639470221090369","article-title":"A multi-dimensional analysis of english tweets","volume":"31","author":"Clarke","year":"2022","journal-title":"Lang. 
Literat"},{"key":"B41","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18653\/v1\/W17-3001","article-title":"\u201cDimensions of abusive language on Twitter,\u201d","volume-title":"Proceedings of the First Workshop on Abusive Language Online","author":"Clarke","year":"2017"},{"key":"B42","article-title":"\u201cThe trouble with bias,\u201d","volume-title":"Keynote at Neurips","author":"Crawford","year":"2017"},{"key":"B43","volume-title":"Explaining Language Change: An Evolutionary Approach","author":"Croft","year":"2000"},{"key":"B44","first-page":"130","article-title":"\u201cLLM-based system for technical writing real-time review in urban construction and technology,\u201d","volume-title":"Proceedings of 60th Annual Associated Schools of Construction International Conference","author":"Cruz-Castro","year":"2024"},{"key":"B45","volume-title":"A Dictionary of Linguistics and Phonetics","author":"Crystal","year":"2011"},{"key":"B46","volume-title":"Investigating English Style","author":"Crystal","year":"1969"},{"key":"B47","doi-asserted-by":"publisher","first-page":"818","DOI":"10.1038\/s41586-024-08025-4","article-title":"Scalable watermarking for identifying large language model outputs","volume":"634","author":"Dathathri","year":"2024","journal-title":"Nature"},{"key":"B48","first-page":"256","article-title":"\u201cFrustratingly easy domain adaptation,\u201d","volume-title":"Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics","author":"Daum\u00e9 III","year":"2007"},{"key":"B49","first-page":"22","article-title":"\u201cUsing relative entropy for detection and analysis of periods of diachronic linguistic change,\u201d","author":"Degaetano-Ortlieb","year":"2018","journal-title":"Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and 
Literature"},{"key":"B50","doi-asserted-by":"publisher","first-page":"688","DOI":"10.1038\/s44159-023-00241-5","article-title":"Using large language models in psychology","volume":"2","author":"Demszky","year":"2023","journal-title":"Nat. Rev. Psychol"},{"key":"B51","doi-asserted-by":"publisher","first-page":"e2309583120","DOI":"10.1073\/pnas.2309583120","article-title":"Systematic testing of three language models reveals low language accuracy, absence of response stability, and a yes-response bias","volume":"120","author":"Dentella","year":"2023","journal-title":"Proc. Nat. Acad. Sci"},{"key":"B52","first-page":"246","article-title":"\u201cOn measures of biases and harms in NLP,\u201d","author":"Dev","year":"2022","journal-title":"Findings of the Association for Computational Linguistics: AACL-IJCNLP"},{"key":"B53","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1810.04805","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","author":"Devlin","year":"2018","journal-title":"arXiv"},{"key":"B54","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-1202","article-title":"Dialectometric analysis of language variation in twitter","author":"Donoso","year":"2017","journal-title":"arXiv"},{"key":"B55","doi-asserted-by":"publisher","first-page":"138","DOI":"10.1007\/s11229-023-04367-0","article-title":"Current cases of ai misalignment and their implications for future risks","volume":"202","author":"Dung","year":"2023","journal-title":"Synthese"},{"key":"B56","doi-asserted-by":"publisher","first-page":"87","DOI":"10.1146\/annurev-anthro-092611-145828","article-title":"Three waves of variation study: the emergence of meaning in the study of sociolinguistic variation","volume":"41","author":"Eckert","year":"2012","journal-title":"Annu. Rev. 
Anthropol"},{"key":"B57","doi-asserted-by":"crossref","DOI":"10.1017\/9781316403242","volume-title":"Meaning and Linguistic Variation: The Third Wave in Sociolinguistics","author":"Eckert","year":"2018"},{"key":"B58","doi-asserted-by":"publisher","first-page":"383","DOI":"10.1515\/cllt-2018-0033","article-title":"An information-theoretic view on language complexity and register variation: Compressing naturalistic corpus data","volume":"17","author":"Ehret","year":"2021","journal-title":"Corpus Linguist. Linguist. Theory"},{"key":"B59","doi-asserted-by":"publisher","first-page":"368","DOI":"10.1002\/9781118827628.ch21","article-title":"\u201cIdentifying Regional Dialects in On-Line Social Media,\u201d","author":"Eisenstein","year":"2017","journal-title":"The Handbook of Dialectology"},{"key":"B60","doi-asserted-by":"publisher","first-page":"e113114","DOI":"10.1371\/journal.pone.0113114","article-title":"Diffusion of lexical change in social media","volume":"9","author":"Eisenstein","year":"2014","journal-title":"PLoS ONE"},{"key":"B61","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2110.06674","article-title":"Truthful AI: developing and governing AI that does not lie","author":"Evans","year":"2021","journal-title":"arXiv"},{"key":"B62","doi-asserted-by":"publisher","first-page":"13346","DOI":"10.5210\/fm.v28i11.13346","article-title":"Should ChatGPT be biased? 
challenges and risks of bias in large language models","volume":"28","author":"Ferrara","year":"2023","journal-title":"First Monday"},{"key":"B63","doi-asserted-by":"publisher","first-page":"411","DOI":"10.1007\/s11023-020-09539-2","article-title":"Artificial intelligence, values, and alignment","volume":"30","author":"Gabriel","year":"2020","journal-title":"Minds Mach"},{"key":"B64","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2101.00027","article-title":"The pile: an 800gb dataset of diverse text for language modeling","author":"Gao","year":"2020","journal-title":"arXiv"},{"key":"B65","doi-asserted-by":"crossref","DOI":"10.1093\/acrefore\/9780199384655.013.364","article-title":"\u201cWilliam labov,\u201d","author":"Gordon","year":"2017","journal-title":"Oxford Research Encyclopedia of Linguistics"},{"key":"B66","doi-asserted-by":"publisher","first-page":"59","DOI":"10.3366\/E1749503208000075","article-title":"The identification of stages in diachronic data: variability-based neighbour clustering","volume":"3","author":"Gries","year":"2008","journal-title":"Corpora"},{"key":"B67","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781139506137","volume-title":"Regional Variation in Written American English","author":"Grieve","year":"2016"},{"key":"B68","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1515\/lingvan-2021-0070","article-title":"Situational diversity and linguistic complexity","volume":"9","author":"Grieve","year":"2023","journal-title":"Linguist. Vanguard"},{"key":"B69","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/978-90-481-9178-9_14","article-title":"\u201cVariation among blogs: a multi-dimensional analysis,\u201d","author":"Grieve","year":"2010","journal-title":"Genres on the Web"},{"key":"B70","doi-asserted-by":"publisher","DOI":"10.3389\/978-2-8325-1760-4","article-title":"Computational sociolinguistics","author":"Grieve","year":"2023","journal-title":"Front. AI Res. 
Topic"},{"key":"B71","doi-asserted-by":"publisher","first-page":"11","DOI":"10.3389\/frai.2019.00011","article-title":"Mapping lexical dialect variation in british english using twitter","volume":"2","author":"Grieve","year":"2019","journal-title":"Front. Artif. Intellig"},{"key":"B72","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1017\/S1360674316000113","article-title":"Analyzing lexical emergence in modern American english online","volume":"21","author":"Grieve","year":"2017","journal-title":"English Lang. Linguist"},{"key":"B73","doi-asserted-by":"publisher","first-page":"293","DOI":"10.1177\/0075424218793191","article-title":"Mapping lexical innovation on american social media","volume":"46","author":"Grieve","year":"2018","journal-title":"J. Engl. Linguist"},{"key":"B74","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.findings-acl.58","article-title":"Econnli: evaluating large language models on economics reasoning","author":"Guo","year":"2024","journal-title":"arXiv"},{"key":"B75","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2004.10964","article-title":"Don't stop pretraining: Adapt language models to domains and tasks","author":"Gururangan","year":"2020","journal-title":"arXiv"},{"key":"B76","volume-title":"Language, Context, and Text: Aspects of Language in a Social-Semiotic Perspective","author":"Halliday","year":"1989"},{"key":"B77","volume-title":"Cohesion in English","author":"Halliday","year":"1976"},{"key":"B78","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s43681-024-00435-4","article-title":"Exploring ChatGPT and its impact on society","volume":"2024","author":"Haque","year":"2024","journal-title":"AI Ethics"},{"key":"B79","first-page":"14","article-title":"\u201cLarge language models meet cognitive science: LLMs as tools, models, and participants,\u201d","author":"Hardy","year":"2023","journal-title":"Proceedings of the 45th Annual Conference of the Cognitive Science 
Society"},{"key":"B80","volume-title":"Dictionary of Language and Linguistics","author":"Hartmann","year":"1972"},{"key":"B81","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1002\/ev.20556","article-title":"Large language model applications for evaluation: opportunities and ethical implications","volume":"2023","author":"Head","year":"2023","journal-title":"New Direct. Evaluat"},{"key":"B82","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2008.02275","article-title":"Aligning AI with shared human values","author":"Hendrycks","year":"2020","journal-title":"arXiv"},{"key":"B83","author":"Hoffmann","year":"2022"},{"key":"B84","doi-asserted-by":"publisher","first-page":"147","DOI":"10.1038\/s41586-024-07856-5","article-title":"AI generates covertly racist decisions about people based on their dialect","volume":"633","author":"Hofmann","year":"2024","journal-title":"Nature"},{"key":"B85","doi-asserted-by":"publisher","DOI":"10.3386\/w31122","author":"Horton","year":"2023","journal-title":"Large Language Models as Simulated Economic Agents: What Can we Learn from Homo silicus"},{"key":"B86","doi-asserted-by":"publisher","first-page":"1249","DOI":"10.1162\/tacl_a_00517","article-title":"Meta-learning the difference: preparing large language models for efficient adaptation","volume":"10","author":"Hou","year":"2022","journal-title":"Trans. Assoc. Comput. 
Linguist"},{"key":"B87","doi-asserted-by":"crossref","first-page":"42","DOI":"10.18653\/v1\/W18-1106","article-title":"\u201cThe social and the neural network: How to make natural language processing about people again,\u201d","volume-title":"Proceedings of the Second Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media","author":"Hovy","year":"2018"},{"key":"B88","doi-asserted-by":"publisher","first-page":"e12432","DOI":"10.1111\/lnc3.12432","article-title":"Five sources of bias in natural language processing","volume":"15","author":"Hovy","year":"2021","journal-title":"Lang. Linguist. Compass"},{"key":"B89","first-page":"483","article-title":"\u201cTagging performance correlates with author age,\u201d","volume-title":"Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (volume 2: Short papers)","author":"Hovy","year":"2015"},{"key":"B90","first-page":"588","article-title":"\u201cThe importance of modeling social factors of language: Theory and practice,\u201d","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Hovy","year":"2021"},{"key":"B91","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2106.09685","article-title":"LoRA: low-rank adaptation of large language models","author":"Hu","year":"2021","journal-title":"arXiv"},{"key":"B92","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2406.18118","article-title":"Safealigner: Safety alignment against jailbreak attacks via response disparity guidance","author":"Huang","year":"2024","journal-title":"arXiv"},{"key":"B93","doi-asserted-by":"publisher","first-page":"937","DOI":"10.1515\/ling-2021-0138","article-title":"Geographic structure of Chinese dialects: a computational dialectometric 
approach","volume":"62","author":"Huang","year":"2024","journal-title":"Linguistics"},{"key":"B94","article-title":"\u201cAuthorial language models for AI authorship verification,\u201d","volume-title":"Working Notes of CLEF","author":"Huang","year":"2024"},{"key":"B95","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2401.12005","article-title":"ALMs: Authorial language models for authorship attribution","author":"Huang","year":"2024","journal-title":"arXiv"},{"key":"B96","doi-asserted-by":"publisher","first-page":"244","DOI":"10.1016\/j.compenvurbsys.2015.12.003","article-title":"Understanding us regional linguistic variation with twitter data analysis","volume":"59","author":"Huang","year":"2016","journal-title":"Comput. Environ. Urban Syst"},{"key":"B97","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2404.15777","article-title":"A comprehensive survey on evaluating large language model applications in the medical industry","author":"Huang","year":"2024","journal-title":"arXiv"},{"key":"B98","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1111\/josl.12366","article-title":"\u201csassy queens:\u201d Stylistic orthographic variation in twitter and the enregisterment of aave","volume":"24","author":"Ilbury","year":"2020","journal-title":"J. 
sociolinguist"},{"key":"B99","volume-title":"Key Terms in Linguistics","author":"Jackson","year":"2007"},{"key":"B100","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2310.06825","article-title":"Mistral 7B","author":"Jiang","year":"2023","journal-title":"arXiv"},{"key":"B101","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2406.18841","article-title":"Navigating llm ethics: advancements, challenges, and future directions","author":"Jiao","year":"2024","journal-title":"arXiv"},{"key":"B102","first-page":"147","article-title":"\u201cPolylanguaging in superdiversity,\u201d","volume-title":"Language and Superdiversity","author":"J\u00f3rgensen","year":"2015"},{"key":"B103","doi-asserted-by":"crossref","DOI":"10.1002\/9780470756393","volume-title":"The Handbook of Historical Linguistics","author":"Joseph","year":"2003"},{"key":"B104","unstructured":"Jurafsky\n              D.\n            \n            \n              Martin\n              J. H.\n            \n          \n          Speech and Language Processing, 3rd Edition\n          \n          2023"},{"key":"B105","doi-asserted-by":"crossref","first-page":"51","DOI":"10.18653\/v1\/P17-2009","article-title":"\u201cIncorporating dialectal variability for socially equitable language identification,\u201d","author":"Jurgens","year":"2017","journal-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)"},{"key":"B106","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2001.08361","article-title":"Scaling laws for neural language models","author":"Kaplan","year":"2020","journal-title":"arXiv"},{"key":"B107","doi-asserted-by":"publisher","first-page":"102274","DOI":"10.1016\/j.lindif.2023.102274","article-title":"Chatgpt for good? on opportunities and challenges of large language models for education","volume":"103","author":"Kasneci","year":"2023","journal-title":"Learn. Individ. 
Differ"},{"key":"B108","doi-asserted-by":"crossref","first-page":"553","DOI":"10.1145\/2835776.2835784","article-title":"\u201cTowards modelling language innovation acceptance in online social networks,\u201d","author":"Kershaw","year":"2016","journal-title":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining"},{"key":"B109","first-page":"17061","article-title":"\u201cA watermark for large language models,\u201d","volume-title":"International Conference on Machine Learning","author":"Kirchenbauer","year":"2023"},{"key":"B110","volume-title":"Impact of Pre-Training on Background Knowledge and Societal Bias","author":"Kocijan","year":"2021"},{"key":"B111","volume-title":"Sociolinguistic Patterns","author":"Labov","year":"1973"},{"key":"B112","doi-asserted-by":"crossref","first-page":"304","DOI":"10.1016\/B978-0-12-051130-3.50029-X","article-title":"\u201cThe social stratification of (r) in new york city department stores,\u201d","volume-title":"Dialect and Language Variation","author":"Labov","year":"1986"},{"key":"B113","doi-asserted-by":"crossref","DOI":"10.1515\/9783110167467","volume-title":"The Atlas of North American English: Phonetics, Phonology and Sound Change","author":"Labov","year":"2006"},{"key":"B114","doi-asserted-by":"crossref","first-page":"10383","DOI":"10.18653\/v1\/2023.emnlp-main.643","article-title":"\u201cImproving diversity of demographic representation in large language models via collective-critiques and self-voting,\u201d","volume-title":"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing","author":"Lahoti","year":"2023"},{"key":"B115","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511620928","volume-title":"Historical Linguistics and Language Change, Volume 81","author":"Lass","year":"1997"},{"key":"B116","first-page":"578","article-title":"\u201cDo we still need clinical language models?\u201d","volume-title":"Conference on Health, Inference, and 
Learning","author":"Lehman","year":"2023"},{"key":"B117","doi-asserted-by":"crossref","DOI":"10.4324\/9780203416433","volume-title":"Historical Linguistics: An Introduction","author":"Lehmann","year":"2013"},{"key":"B118","doi-asserted-by":"publisher","first-page":"839","DOI":"10.1609\/aies.v7i1.31684","article-title":"How are LLMs mitigating stereotyping harms? Learning from search engine studies","volume":"7","author":"Leidinger","year":"2024","journal-title":"Proc. AAAI\/ACM Conf. AI, Ethics, and Soc"},{"key":"B119","doi-asserted-by":"publisher","first-page":"18471","DOI":"10.1609\/aaai.v38i16.29808","article-title":"Task contamination: Language models may not be few-shot anymore","volume":"38","author":"Li","year":"2024","journal-title":"Proc. AAAI Conf. AI"},{"key":"B120","doi-asserted-by":"publisher","first-page":"e333","DOI":"10.1016\/S2589-7500(23)00083-3","article-title":"Ethics of large language models in medicine and medical research","volume":"5","author":"Li","year":"2023","journal-title":"Lancet Digital Health"},{"key":"B121","doi-asserted-by":"crossref","first-page":"9993","DOI":"10.18653\/v1\/2024.acl-long.538","article-title":"\u201cNewsBench: a systematic evaluation framework for assessing editorial capabilities of large language models in chinese journalism,\u201d","author":"Li","year":"2024","journal-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)"},{"key":"B122","doi-asserted-by":"publisher","first-page":"1092","DOI":"10.1126\/science.abq1158","article-title":"Competition-level code generation with alphacode","volume":"378","author":"Li","year":"2022","journal-title":"Science"},{"key":"B123","doi-asserted-by":"publisher","first-page":"269","DOI":"10.1075\/rs.18005.lii","article-title":"Exploring register variation on reddit: a multi-dimensional study of language use on a social media website","volume":"1","author":"Liimatta","year":"2019","journal-title":"Register 
Stud"},{"key":"B124","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2305.16960","article-title":"Training socially aligned language models in simulated human society","author":"Liu","year":"2023","journal-title":"arXiv"},{"key":"B125","doi-asserted-by":"publisher","first-page":"241","DOI":"10.18653\/v1\/2022.findings-naacl.18","article-title":"Aligning generative language models with human values","volume":"2022","author":"Liu","year":"2022","journal-title":"Find. Assoc. Comp. Linguist.: NAACL"},{"key":"B126","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2406.10130","article-title":"The devil is in the neurons: Interpreting and mitigating social biases in pre-trained language models","author":"Liu","year":"2024","journal-title":"arXiv"},{"key":"B127","doi-asserted-by":"publisher","first-page":"570","DOI":"10.1002\/asi.24750","article-title":"ChatGPT and a new academic reality: artificial intelligence-written research papers and the ethics of the large language models in scholarly publishing","volume":"74","author":"Lund","year":"2023","journal-title":"J. Assoc. Inform. Sci. 
Technol"},{"key":"B128","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2407.15240","article-title":"Bigbench: a unified benchmark for social bias in text-to-image generative models based on multi-modal LLM","author":"Luo","year":"2024","journal-title":"arXiv"},{"key":"B129","doi-asserted-by":"crossref","first-page":"942","DOI":"10.18653\/v1\/2022.emnlp-main.61","article-title":"\u201cEntity extraction in low resource domains with selective pre-training of large language models,\u201d","author":"Mahapatra","year":"2022","journal-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing"},{"key":"B130","doi-asserted-by":"publisher","author":"Marcus","year":"2023","DOI":"10.48550\/arXiv.2308.00109"},{"key":"B131","first-page":"149","article-title":"\u201cLanguage, register and genre,\u201d","volume-title":"Analysing English in a Global Context: A Reader","author":"Martin","year":"2001"},{"key":"B132","volume-title":"Oxford Concise Dictionary of Linguistics","author":"Matthews","year":"1997"},{"key":"B133","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40554-015-0015-8","article-title":"Register in the round: Registerial cartography","volume":"2","author":"Matthiessen","year":"2015","journal-title":"Funct. 
Linguist"},{"key":"B134","doi-asserted-by":"publisher","first-page":"22","DOI":"10.55206\/CIKP7841","article-title":"Linguistic and rhetorical features of dialogue on rhetorical topics between a human and chatbot gpt","volume":"56","author":"Mavrodieva","year":"2023","journal-title":"Rhetoric Commun"},{"key":"B135","volume-title":"Corpus Linguistics","author":"McEnery","year":"2001"},{"key":"B136","volume-title":"Corpus-based Language Studies: An Advanced Resource Book","author":"McEnery","year":"2006"},{"key":"B137","doi-asserted-by":"crossref","DOI":"10.4324\/9780429507922","volume-title":"Introducing Sociolinguistics","author":"Meyerhoff","year":"2018"},{"key":"B138","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1162\/nol_a_00105","article-title":"Strong prediction: language model surprisal explains multiple n400 effects","volume":"2024","author":"Michaelov","year":"2024","journal-title":"Neurobiol. Lang"},{"key":"B139","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3597307","article-title":"Biases in large language models: origins, inventory, and discussion","volume":"15","author":"Navigli","year":"2023","journal-title":"J. Data Inform. 
Quality"},{"key":"B140","doi-asserted-by":"publisher","first-page":"57","DOI":"10.3390\/informatics11030057","article-title":"Large language models in healthcare and medical domain: A review","volume":"11","author":"Nazi","year":"2024","journal-title":"Informatics"},{"key":"B141","doi-asserted-by":"crossref","DOI":"10.4324\/9781315475172","volume-title":"Historical Sociolinguistics: Language Change in Tudor and Stuart England","author":"Nevalainen","year":"2016"},{"key":"B142","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2209.00626","article-title":"The alignment problem from a deep learning perspective","author":"Ngo","year":"2022","journal-title":"arXiv"},{"key":"B143","doi-asserted-by":"publisher","first-page":"537","DOI":"10.1162\/COLI_a_00258","article-title":"Computational sociolinguistics: a survey","volume":"42","author":"Nguyen","year":"2016","journal-title":"Comput. Linguist"},{"key":"B144","first-page":"603","article-title":"\u201cOn learning and representing social meaning in nlp: a sociolinguistic perspective,\u201d","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Nguyen","year":"2021"},{"key":"B145","volume-title":"Interpretable Hate Speech Detection via Large Language Model-Extracted Rationales","author":"Nirmal","year":"2024"},{"key":"B146","year":"2022","journal-title":"Chatgpt"},{"key":"B147","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2203.02155","article-title":"Training language models to follow instructions with human feedback","author":"Ouyang","year":"2022","journal-title":"arXiv"},{"key":"B148","doi-asserted-by":"publisher","first-page":"187","DOI":"10.1215\/00031283-3130324","article-title":"Audience-modulated variation in online social media","volume":"90","author":"Pavalanathan","year":"2015","journal-title":"Am. 
Speech"},{"key":"B149","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1177\/10776958221149577","article-title":"Collaborating with chatgpt: Considering the implications of generative artificial intelligence for journalism and media education","volume":"78","author":"Pavlik","year":"2023","journal-title":"Journalism & mass communication educator"},{"key":"B150","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2410.12174","article-title":"Exploring large language models for hate speech detection in rioplatense Spanish","author":"P\u00e9rez","year":"2024","journal-title":"arXiv"},{"key":"B151","article-title":"\u201cModern language models refute chomsky's approach to language,\u201d","volume-title":"Technical Report, Lingbuzz Preprint","author":"Piantadosi","year":"2023"},{"key":"B152","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI Blog"},{"key":"B153","first-page":"1","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"21","author":"Raffel","year":"2020","journal-title":"J. Mach. Learn. Res"},{"key":"B154","doi-asserted-by":"publisher","first-page":"2106","DOI":"10.18653\/v1\/2023.findings-eacl.157","article-title":"Fairness in language models beyond english: Gaps and challenges","volume":"2023","author":"Ramesh","year":"2023","journal-title":"Find. Assoc. Comp. Linguist.: EACL"},{"key":"B155","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1016\/j.iotcps.2023.04.003","article-title":"Chatgpt: a comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope","volume":"3","author":"Ray","year":"2023","journal-title":"Intern. 
Things Cyber-Phys Syst"},{"key":"B156","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1007\/978-3-030-02438-3_133","article-title":"\u201cDialect typology: recent advances,\u201d","volume-title":"Handbook of the Changing World Language Map","author":"R\u00f6thlisberger","year":"2020"},{"key":"B157","first-page":"66","article-title":"\u201cLanguage Modeling with Limited Domain Data,\u201d","volume-title":"Proc. ARPA Spoken Language Systems Technology Workshop","author":"Rudnicky","year":"1995"},{"key":"B158","volume-title":"Artificial Intelligence: A Modern Approach","author":"Russell","year":"2016"},{"key":"B159","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.240","article-title":"Masked language model scoring","author":"Salazar","year":"2019","journal-title":"arXiv"},{"key":"B160","volume-title":"Empirical Linguistics","author":"Sampson","year":"2002"},{"key":"B161","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2401.00448","article-title":"Beyond chinchilla-optimal: accounting for inference in language model scaling laws","author":"Sardana","year":"2023","journal-title":"arXiv"},{"key":"B162","doi-asserted-by":"crossref","DOI":"10.1075\/scl.60","volume-title":"Multi-Dimensional Analysis, 25 Years On: A Tribute to Douglas Biber, volume 60","author":"Sardinha","year":"2014"},{"key":"B163","article-title":"\u201cPhilosophy of linguistics,\u201d","author":"Scholz","year":"2024","journal-title":"The Stanford Encyclopedia of Philosophy (Spring Edition)"},{"key":"B164","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-022-00458-8","article-title":"Large pre-trained language models contain human-like biases of what is right and wrong to do","author":"Schramowski","year":"2022","journal-title":"Nat. Mach. 
Intellig"},{"key":"B165","doi-asserted-by":"crossref","first-page":"5248","DOI":"10.18653\/v1\/2020.acl-main.468","article-title":"\u201cPredictive biases in natural language processing models,\u201d","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Shah","year":"2020"},{"key":"B166","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2309.15025","article-title":"Large language model alignment: A survey","author":"Shen","year":"2023","journal-title":"arXiv"},{"key":"B167","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2305.17493","article-title":"The curse of recursion: Training on generated data makes models forget","author":"Shumailov","year":"2023","journal-title":"arXiv"},{"key":"B168","doi-asserted-by":"publisher","first-page":"5861","DOI":"10.48550\/arXiv.2106.10328","article-title":"Process for adapting language models to society (palms) with values-targeted datasets","volume":"34","author":"Solaiman","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst"},{"key":"B169","first-page":"131","article-title":"\u201cEthical considerations in the implementation and usage of large language models,\u201d","volume-title":"The 17th International Conference Interdisciplinarity in Engineering","author":"Stefan","year":"2023"},{"key":"B170","doi-asserted-by":"crossref","first-page":"4360","DOI":"10.18653\/v1\/D18-1467","article-title":"\u201cMaking fetch? 
happen: the influence of social and linguistic context on the success of lexical innovations,\u201d","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Stewart","year":"2018"},{"key":"B171","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511801624","volume-title":"Analysing Sociolinguistic Variation","author":"Tagliamonte","year":"2006"},{"key":"B172","volume-title":"Variationist Sociolinguistics: Change, Observation, Interpretation","author":"Tagliamonte","year":"2011"},{"key":"B173","doi-asserted-by":"publisher","first-page":"1930","DOI":"10.1038\/s41591-023-02448-8","article-title":"Large language models in medicine","volume":"29","author":"Thirunavukarasu","year":"2023","journal-title":"Nat. Med"},{"key":"B174","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2401.01313","article-title":"A comprehensive survey of hallucination mitigation techniques in large language models","author":"Tonmoy","year":"2024","journal-title":"arXiv"},{"key":"B175","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2307.09288","article-title":"Llama 2: Open foundation and fine-tuned chat models","author":"Touvron","year":"2023","journal-title":"arXiv"},{"key":"B176","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2401.09334","article-title":"Towards neuro-symbolic models of language cognition: Llms as proposers and evaluators","author":"Tsvilodub","year":"2024","journal-title":"arXiv"},{"key":"B177","first-page":"30","article-title":"Attention is all you need","volume":"2017","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Process. 
Syst"},{"key":"B178","unstructured":"Wang B., Komatsuzaki A. GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model. 2021"},{"key":"B179","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2407.05437","article-title":"Enhancing computer programming education with LLMs: a study on effective prompt engineering for Python code generation","author":"Wang","year":"2024","journal-title":"arXiv"},{"key":"B180","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2307.12966","article-title":"Aligning large language models with human: A survey","author":"Wang","year":"2023","journal-title":"arXiv"},{"key":"B181","volume-title":"An Introduction to Sociolinguistics","author":"Wardhaugh","year":"2021"},{"key":"B182","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2105.05595","article-title":"An introduction to algorithmic fairness","author":"Weerts","year":"2021","journal-title":"arXiv"},{"key":"B183","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1146\/annurev-linguist-030514-124930","article-title":"Advances in dialectometry","volume":"1","author":"Wieling","year":"2015","journal-title":"Annual Rev. 
Linguist"},{"key":"B184","doi-asserted-by":"publisher","first-page":"1355","DOI":"10.1126\/science.131.3410.1355","article-title":"Some moral and technical consequences of automation: as machines learn they may develop unforeseen strategies at rates that baffle their programmers","volume":"131","author":"Wiener","year":"1960","journal-title":"Science"},{"key":"B185","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2304.11082","article-title":"Fundamental limitations of alignment in large language models","author":"Wolf","year":"2023","journal-title":"arXiv"},{"key":"B186","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2104.06390","article-title":"Detoxifying language models risks marginalizing minority voices","author":"Xu","year":"2021","journal-title":"arXiv"},{"key":"B187","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2405.02411","article-title":"The call for socially aware language technologies","author":"Yang","year":"2024","journal-title":"arXiv"},{"key":"B188","doi-asserted-by":"publisher","first-page":"2400429","DOI":"10.1002\/aisy.202400429","article-title":"Large language model-based chatbots in higher education","volume":"2024","author":"Yigci","year":"2024","journal-title":"Adv. Intellig. 
Syst"},{"key":"B189","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2312.01509","article-title":"Tackling bias in pre-trained language models: Current trends and under-represented societies","author":"Yogarajan","year":"2023","journal-title":"arXiv"}],"container-title":["Frontiers in Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2024.1472411\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T06:14:09Z","timestamp":1736748849000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2024.1472411\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,13]]},"references-count":189,"alternative-id":["10.3389\/frai.2024.1472411"],"URL":"https:\/\/doi.org\/10.3389\/frai.2024.1472411","relation":{},"ISSN":["2624-8212"],"issn-type":[{"value":"2624-8212","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,13]]},"article-number":"1472411"}}