{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,17]],"date-time":"2026-06-17T20:08:17Z","timestamp":1781726897827,"version":"3.54.5"},"reference-count":53,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2023,2,27]],"date-time":"2023-02-27T00:00:00Z","timestamp":1677456000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100005246","name":"Institute of Education Sciences","doi-asserted-by":"publisher","award":["R305C160004"],"award-info":[{"award-number":["R305C160004"]}],"id":[{"id":"10.13039\/100005246","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>Academic discourse communities and learning circles are characterized by collaboration, sharing commonalities in terms of social interactions and language. The discourse of these communities is composed of jargon, common terminologies, and similarities in how they construe and communicate meaning. This study examines the extent to which discourse reveals \u201cshared language\u201d among its participants that can promote inclusion or affinity. Shared language is characterized in terms of linguistic features and lexical, syntactical, and semantic similarities. We leverage a multi-method approach, including (1) feature engineering using state-of-the-art natural language processing techniques to select the most appropriate features, (2) the bag-of-words classification model to predict linguistic similarity, (3) explainable AI using the local interpretable model-agnostic explanations to explain the model, and (4) a two-step cluster analysis to extract innate groupings between linguistic similarity and emotion. We found that linguistic similarity within and between the threaded discussions was significantly varied, revealing the dynamic and unconstrained nature of the discourse. Further, word choice moderately predicted linguistic similarity between posts within threaded discussions (accuracy = 0.73; F1-score = 0.67), revealing that discourse participants\u2019 lexical choices effectively discriminate between posts in terms of similarity. Lastly, cluster analysis reveals profiles that are distinctly characterized in terms of linguistic similarity, trust, and affect. Our findings demonstrate the potential role of linguistic similarity in supporting social cohesion and affinity within online discourse communities.<\/jats:p>","DOI":"10.3390\/computers12030053","type":"journal-article","created":{"date-parts":[[2023,2,28]],"date-time":"2023-02-28T03:00:38Z","timestamp":1677553238000},"page":"53","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Shared Language: Linguistic Similarity in an Algebra Discussion Forum"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4647-4362","authenticated-orcid":false,"given":"Michelle P.","family":"Banawan","sequence":"first","affiliation":[{"name":"Asian Institute of Management, Makati City 1229, Metro Manila, Philippines"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jinnie","family":"Shin","sequence":"additional","affiliation":[{"name":"College of Education, University of Florida, Gainesville, FL 32611, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5072-8636","authenticated-orcid":false,"given":"Tracy","family":"Arner","sequence":"additional","affiliation":[{"name":"Department of Psychology, Arizona State University, Tempe, AZ 85281, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Renu","family":"Balyan","sequence":"additional","affiliation":[{"name":"SUNY, Old Westbury, NY 11568, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Walter L.","family":"Leite","sequence":"additional","affiliation":[{"name":"College of Education, University of Florida, Gainesville, FL 32611, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Danielle S.","family":"McNamara","sequence":"additional","affiliation":[{"name":"Department of Psychology, Arizona State University, Tempe, AZ 85281, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1080\/00461520.2014.965823","article-title":"The ICAP framework: Linking cognitive engagement to active learning outcomes","volume":"49","author":"Chi","year":"2014","journal-title":"Educ. Psychol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"702","DOI":"10.1080\/03043797.2018.1538324","article-title":"The role of collaborative interactions versus individual construction on students\u2019 learning of engineering concepts","volume":"44","author":"Menekse","year":"2018","journal-title":"Eur. J. Eng. Educ."},{"key":"ref_3","unstructured":"Roscoe, R.D., Gutierrez, P.J., Wylie, R., and Chi, M.T. (2014). Evaluating Lesson Design and Implementation within the ICAP Framework, International Society of the Learning Sciences."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"D\u2019Angelo, S., and Gergle, D. (2016, January 7\u201312). Gazed and confused: Understanding and designing shared gaze for remote collaboration. Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, San Jose, CA, USA.","DOI":"10.1145\/2858036.2858499"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Bizzell, P. (1992). Academic Discourse and Critical Consciousness, University of Pittsburgh.","DOI":"10.2307\/j.ctt7zwb7k"},{"key":"ref_6","unstructured":"Hyland, K. (2011). Continuum Companion to Discourse Analysis, Bloomsbury Publishing."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"145","DOI":"10.35360\/njes.15","article-title":"A rich domain of ELF-the ELFA corpus of academic discourse","volume":"5","author":"Mauranen","year":"2006","journal-title":"Nord. J. Engl. Stud."},{"key":"ref_8","unstructured":"Liebman, N., and Gergle, D. (March, January 27). Capturing turn-by-turn lexical similarity in text-based communication. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, San Francisco, CA, USA."},{"key":"ref_9","unstructured":"Palloff, R.M., and Pratt, K. (2007). Building Online Learning Communities: Effective Strategies for the Virtual Classroom, John Wiley & Sons."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1222","DOI":"10.1016\/j.compedu.2009.11.008","article-title":"Engaging online learners: The impact of Web-based learning technology on college student engagement","volume":"54","author":"Lambert","year":"2010","journal-title":"Comput. Educ."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1080\/0144929X.2018.1441326","article-title":"Student interactions in online discussion forums: Their perception on learning with business simulation games","volume":"37","year":"2018","journal-title":"Behav. Inf. Technol."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"6347186","DOI":"10.1155\/2018\/6347186","article-title":"Student engagement predictions in an e-learning system and their impact on student course assessment scores","volume":"2018","author":"Hussain","year":"2018","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1016\/j.compedu.2013.06.009","article-title":"Predicting students\u2019 final performance from participation in on-line discussion forums","volume":"68","author":"Romero","year":"2013","journal-title":"Comput. Educ."},{"key":"ref_14","first-page":"24","article-title":"An investigation of factors affecting student participation level in an online discussion forum","volume":"9","author":"Yukselturk","year":"2010","journal-title":"Turk. Online J. Educ. Technol.-TOJET"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"304","DOI":"10.1109\/TLT.2012.10","article-title":"Language and discourse are powerful signals of student emotions during tutoring","volume":"5","author":"Graesser","year":"2012","journal-title":"IEEE Trans. Learn. Technol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.tics.2003.10.016","article-title":"Why is conversation so easy?","volume":"8","author":"Garrod","year":"2004","journal-title":"Trends Cogn. Sci."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1177\/0093650209351468","article-title":"Language style matching as a predictor of social dynamics in small groups","volume":"37","author":"Gonzales","year":"2010","journal-title":"Commun. Res."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1016\/0010-0277(87)90018-7","article-title":"Saying what you mean in dialogue: A study in conceptual and semantic co-ordination","volume":"27","author":"Garrod","year":"1987","journal-title":"Cognition"},{"key":"ref_19","first-page":"41","article-title":"Lexical entrainment in spontaneous dialog","volume":"96","author":"Brennan","year":"1996","journal-title":"Proc. ISSD"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Scissors, L.E., Gill, A.J., Geraghty, K., and Gergle, D. (2009, January 4\u20139). In CMC we trust: The role of similarity. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Boston, MA, USA.","DOI":"10.1145\/1518701.1518783"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Scissors, L.E., Gill, A.J., and Gergle, D. (2008, January 8\u201312). Linguistic mimicry and trust in text-based CMC. Proceedings of the 2008 ACM Conference on Computer Supported Cooperative Work, San Diego, CA, USA.","DOI":"10.1145\/1460563.1460608"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Friedberg, H., Litman, D., and Paletz, S.B. (2012, January 2\u20135). Lexical entrainment and success in student engineering groups. Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), Miami, FL, USA.","DOI":"10.1109\/SLT.2012.6424258"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Liu, Y., Li, A., Dang, J., and Zhou, D. (2021, January 18\u201322). Semantic and Acoustic-Prosodic Entrainment of Dialogues in Service Scenarios. Proceedings of the Companion Publication of the 2021 International Conference on Multimodal Interaction, Montreal, QC, Canada.","DOI":"10.1145\/3461615.3491105"},{"key":"ref_24","unstructured":"Lin, D. (1998, January 24\u201327). An information-theoretic definition of similarity. Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), Madison, WI, USA."},{"key":"ref_25","unstructured":"Princeton University (2010). \u201cAbout WordNet.\u201d WordNet, Princeton University."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"16291","DOI":"10.1109\/ACCESS.2019.2891692","article-title":"Challenging the boundaries of unsupervised learning for semantic similarity","volume":"7","author":"Pawar","year":"2019","journal-title":"IEEE Access"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1111\/j.1756-8765.2010.01117.x","article-title":"Computational methods to extract meaning from text and advance theories of human cognition","volume":"3","author":"McNamara","year":"2011","journal-title":"Top. Cogn. Sci."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Banawan, M., Shin, J., Balyan, R., Leite, W.L., and McNamara, D.S. (2022, January 1\u20133). Math Discourse Linguistic Components (Cohesive Cues within a Math Discussion Board Discourse). Proceedings of the Ninth ACM Conference on Learning@ Scale, New York, NY, USA.","DOI":"10.1145\/3491140.3528320"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"3017","DOI":"10.1007\/s11192-020-03502-9","article-title":"Math-word embedding in math search and semantic extraction","volume":"125","author":"Youssef","year":"2020","journal-title":"Scientometrics"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Jo, H., Kang, D., Head, A., and Hearst, M.A. (2021, January 16\u201320). Modeling Mathematical Notation Semantics in Academic Papers. Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, Punta Cana, Dominican Republic.","DOI":"10.18653\/v1\/2021.findings-emnlp.266"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Ferreira, D., and Freitas, A. (2020, January 6\u20138). Premise selection in natural language mathematical texts. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online.","DOI":"10.18653\/v1\/2020.acl-main.657"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Patel, A., Bhattamishra, S., and Goyal, N. (2021). Are NLP Models really able to Solve Simple Math Word Problems?. arXiv.","DOI":"10.18653\/v1\/2021.naacl-main.168"},{"key":"ref_33","unstructured":"(2021, January 22). Algebra Nation. Available online: https:\/\/lastinger.center.ufl.edu\/mathematics\/algebra-nation\/."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"964","DOI":"10.1080\/10705511.2021.1919895","article-title":"Multilevel Mixture Modeling with Propensity Score Weights for Quasi-Experimental Evaluation of Virtual Learning Environments","volume":"28","author":"Leite","year":"2021","journal-title":"Struct. Equ. Model. A Multidiscip. J."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"569","DOI":"10.1111\/jcal.12360","article-title":"The relationship between Algebra Nation usage and high-stakes test performance for struggling students","volume":"35","author":"Leite","year":"2019","journal-title":"J. Comput. Assist. Learn."},{"key":"ref_36","first-page":"411","article-title":"spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing","volume":"7","author":"Honnibal","year":"2017","journal-title":"Appear"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., and Manning, C.D. (2014, January 25\u201329). Glove: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar.","DOI":"10.3115\/v1\/D14-1162"},{"key":"ref_38","unstructured":"(2022, October 28). Available online: https:\/\/github.com\/MartinoMensio\/spacy-universal-sentence-encoder-tfhub."},{"key":"ref_39","unstructured":"(2022, October 28). Available online: https:\/\/tfhub.dev\/google\/universal-sentence-encoder\/4."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Hutto, C., and Gilbert, E. (2014, January 1\u20134). Vader: A parsimonious rule-based model for sentiment analysis of social media text. Proceedings of the International AAAI Conference on Web and Social Media, Ann Arbor, MI, USA.","DOI":"10.1609\/icwsm.v8i1.14550"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1191","DOI":"10.3758\/s13428-012-0314-x","article-title":"Norms of valence, arousal, and dominance for 13,915 English lemmas","volume":"45","author":"Warriner","year":"2013","journal-title":"Behav. Res. Methods"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1111\/j.1467-8640.2012.00460.x","article-title":"Crowdsourcing a word\u2013emotion association lexicon","volume":"29","author":"Mohammad","year":"2013","journal-title":"Comput. Intell."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1164","DOI":"10.3389\/fpsyg.2017.01164","article-title":"Vocabulary knowledge predicts lexical processing: Evidence from a group of participants with diverse educational backgrounds","volume":"8","author":"Mainz","year":"2017","journal-title":"Front. Psychol."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1037\/a0024177","article-title":"Individual differences in visual word recognition: Insights from the English Lexicon Project","volume":"38","author":"Yap","year":"2012","journal-title":"J. Exp. Psychol. Hum. Percept. Perform."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Ribeiro, M.T., Singh, S., and Guestrin, C. (2016, January 13\u201317). \u201cWhy should I trust you?\u201d Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939778"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1016\/j.datak.2007.01.002","article-title":"Investigating diversity of clustering methods: An empirical comparison","volume":"63","author":"Gelbard","year":"2007","journal-title":"Data Knowl. Eng."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1085","DOI":"10.3389\/fpsyg.2020.01085","article-title":"Using two-step cluster analysis and latent class cluster analysis to classify the cognitive heterogeneity of cross-diagnostic psychiatric inpatients","volume":"11","author":"Benassi","year":"2020","journal-title":"Front. Psychol."},{"key":"ref_48","unstructured":"Paxton, A., Roche, J.M., Ibarra, A., and Tanenhaus, M.K. (2014, January 23\u201326). Failure to (mis) communicate: Linguistic convergence, lexical choice, and communicative success in dyadic problem solving. Proceedings of the Annual Meeting of the Cognitive Science Society, Quebec City, QC, Canada."},{"key":"ref_49","unstructured":"Tosi, A. (2017). Adjusting Linguistically to Others: The Role of Social Context in Lexical Choices and Spatial Language. [Ph.D. Thesis, The University of Edinburgh]."},{"key":"ref_50","first-page":"59","article-title":"Discourse devices used to establish community, increase coherence, and negotiate agreement in an online university course","volume":"21","author":"Lapadat","year":"2007","journal-title":"Int. J. E-Learn. Distance Educ. Rev. Int. E-Learn. Form. \u00c0 Distance"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1016\/j.sbspro.2011.03.085","article-title":"The effect of cooperative learning on mathematics anxiety and help seeking behavior","volume":"15","author":"Lavasani","year":"2011","journal-title":"Procedia-Soc. Behav. Sci."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1186\/s41239-018-0100-7","article-title":"Student help-seeking attitudes and behaviors in a digital era","volume":"15","author":"Qayyum","year":"2018","journal-title":"Int. J. Educ. Technol. High. Educ."},{"key":"ref_53","unstructured":"Dadure, P., Pakray, P., and Bandyopadhyay, S. (2021). Deep Natural Language Processing and AI Applications for Industry 5.0, IGI Global."}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/12\/3\/53\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:43:47Z","timestamp":1760121827000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/12\/3\/53"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,27]]},"references-count":53,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,3]]}},"alternative-id":["computers12030053"],"URL":"https:\/\/doi.org\/10.3390\/computers12030053","relation":{},"ISSN":["2073-431X"],"issn-type":[{"value":"2073-431X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,27]]}}}