{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T00:15:44Z","timestamp":1777853744430,"version":"3.51.4"},"reference-count":31,"publisher":"SAGE Publications","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["JID"],"published-print":{"date-parts":[[2023,10,17]]},"abstract":"<jats:p>Qualitative data analysis is produced frequently in healthcare settings, which is a time-consuming and skilled analytic task. The use of qualitative research findings in clinical settings takes years, which is sometimes obsolete knowledge as the health context is dynamic. Artificial Intelligence (AI)-based qualitative data analysis might present with rapid analysis of text-based data in real-time, thereby empowering qualitative researchers to expedite their analysis and facilitate timely use of the research findings. We tested an AI-based method to complement the manual analysis of text-based data from the verbatim transcripts of seven mall managers\u2019 interviews. First, we prepared text data into a machine-calculable format and employed BERT model to extract sentence-level features in our case. Second, we implement TF-IDF-based keywords mining techniques to extract the main candidate themes from the interview transcripts to support text-based analysis, including: 1) primary cluster detection algorithm, and 2) keyword extraction algorithm. The extracted core themes provide qualitative researchers with a more comprehensive overview of the qualitative data. Most of the sentences clustered in meaningful short topics or sentences carrying independent and clear information. The extracted topics and clustered sentences reduced qualitative researchers\u2019 workload by condensing and identifying meaningful concepts and naming them. 
This method combining contextualized word embeddings, unsupervised clustering, and keyword extraction techniques can significantly reduce the overall workload and time consumed in qualitative research using conventional methods.<\/jats:p>","DOI":"10.3233\/jid-220013","type":"journal-article","created":{"date-parts":[[2022,10,14]],"date-time":"2022-10-14T11:34:03Z","timestamp":1665747243000},"page":"41-58","source":"Crossref","is-referenced-by-count":17,"title":["Natural language processing (NLP) aided qualitative method in health research"],"prefix":"10.1177","volume":"27","author":[{"given":"Cheligeer","family":"Cheligeer","sequence":"first","affiliation":[{"name":"Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada"},{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada"}]},{"given":"Lin","family":"Yang","sequence":"additional","affiliation":[{"name":"Department of Cancer Epidemiology and Prevention Research, Alberta Health Services, Calgary, AB, Canada"},{"name":"Department of Cancer, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada"},{"name":"Department of Community Health Science, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada"}]},{"given":"Tannistha","family":"Nandi","sequence":"additional","affiliation":[{"name":"Department of Information Technologies, Research Computing Services, University of Calgary, Calgary, AB, Canada"}]},{"given":"Chelsea","family":"Doktorchik","sequence":"additional","affiliation":[{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada"},{"name":"Department of Community Health Science, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada"}]},{"given":"Hude","family":"Quan","sequence":"additional","affiliation":[{"name":"Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, 
Canada"},{"name":"Department of Community Health Science, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada"}]},{"given":"Yong","family":"Zeng","sequence":"additional","affiliation":[{"name":"Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada"}]},{"given":"Shaminder","family":"Singh","sequence":"additional","affiliation":[{"name":"Department of Community Health Science, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada"},{"name":"School of Nursing and Midwifery, Faculty of Health, Community and Education, Mount Royal University, Calgary, AB, Canada"}]}],"member":"179","reference":[{"issue":"4-5","key":"10.3233\/JID-220013_ref3","first-page":"993","article-title":"Latent Dirichlet allocation","volume":"3","author":"Blei,","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"10.3233\/JID-220013_ref4","doi-asserted-by":"crossref","unstructured":"Cath, C. (2018) Governing artificial intelligence: Ethical, legal and technical opportunities and challenges. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 376(2133). https:\/\/doi.org\/10.1098\/rsta.2018.0080","DOI":"10.1098\/rsta.2018.0080"},{"issue":"3","key":"10.3233\/JID-220013_ref6","doi-asserted-by":"crossref","first-page":"201","DOI":"10.4997\/jrcpe.2015.305","article-title":"Qualitative research in healthcare: An introduction to grounded theory using thematic analysis","volume":"45","author":"Chapman,","year":"2015","journal-title":"Journal of the Royal College of Physicians of Edinburgh"},{"key":"10.3233\/JID-220013_ref8","doi-asserted-by":"crossref","unstructured":"Chen, N.C , Drouhard, M , Kocielnik, R , Suh, J , Aragon, C.R. (2018) Using machine learning to support qualitative coding in social science: Shifting the focus to ambiguity. ACM Transactions on Interactive Intelligent Systems, 8(2). 
https:\/\/doi.org\/10.1145\/3185515","DOI":"10.1145\/3185515"},{"key":"10.3233\/JID-220013_ref10","doi-asserted-by":"crossref","first-page":"205031211882292","DOI":"10.1177\/2050312118822927","article-title":"Grounded theory research: A design framework for novice researchers","volume":"7","author":"Chun Tie,","year":"2019","journal-title":"SAGE Open Medicine"},{"issue":"December 2018","key":"10.3233\/JID-220013_ref13","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/j.ijmedinf.2019.02.008","article-title":"A systematic review of natural language processing and text mining of symptoms from electronic patient-authored text data","volume":"125","author":"Dreisbach,","year":"2019","journal-title":"International Journal of Medical Informatics"},{"key":"10.3233\/JID-220013_ref14","first-page":"4052","article-title":"Pre-trained language model representations for language generation","volume":"1","author":"Edunov,","year":"2019","journal-title":"NAACL HLT 2019 \u2013 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies \u2013 Proceedings of the Conference"},{"key":"10.3233\/JID-220013_ref17","doi-asserted-by":"crossref","unstructured":"Gu, Y , Tinn, R , Cheng, H , Lucas, M , Usuyama, N , Liu, X , Naumann, T , Gao, J , Poon, H. (2022) Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing. ACM Transactions on Computing for Healthcare, 3(1). 
https:\/\/doi.org\/10.1145\/3458754","DOI":"10.1145\/3458754"},{"issue":"7","key":"10.3233\/JID-220013_ref19","doi-asserted-by":"crossref","first-page":"1007","DOI":"10.1007\/s11121-015-0561-z","article-title":"Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples","volume":"16","author":"Henry,","year":"2015","journal-title":"Prevention Science"},{"key":"10.3233\/JID-220013_ref20","first-page":"328","article-title":"Universal language model fine-tuning for text classification","volume":"1","author":"Howard,","year":"2018","journal-title":"ACL 2018 \u2013 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)"},{"key":"10.3233\/JID-220013_ref21","doi-asserted-by":"crossref","first-page":"BII.S11661","DOI":"10.4137\/BII.S11661","article-title":"Using Conversation Topics for Predicting Therapy Outcomes in Schizophrenia","volume":"6s1","author":"Howes,","year":"2013","journal-title":"Biomedical Informatics Insights"},{"key":"10.3233\/JID-220013_ref22","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1016\/j.jbi.2013.09.003","article-title":"Discovery of clinical pathway patterns from event logs using probabilistic topic models","volume":"47","author":"Huang,","year":"2014","journal-title":"Journal of Biomedical Informatics"},{"issue":"3","key":"10.3233\/JID-220013_ref23","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1145\/331499.331504","article-title":"Data clustering: A Review","volume":"31","author":"Jain,","year":"1999","journal-title":"ACM Computing Surveys (CSUR)"},{"key":"10.3233\/JID-220013_ref24","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2016.35","article-title":"MIMIC-III, a freely accessible critical care database","volume":"3","author":"Johnson,","year":"2016","journal-title":"Scientific Data"},{"key":"10.3233\/JID-220013_ref25","doi-asserted-by":"crossref","unstructured":"Jurafsky, D , Martin, J.H. 
(2018) Speech and Language Processing. 1. https:\/\/doi.org\/10.1162\/089120100750105975","DOI":"10.1162\/089120100750105975"},{"key":"10.3233\/JID-220013_ref26","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: a pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee,","year":"2020","journal-title":"Bioinformatics"},{"key":"10.3233\/JID-220013_ref27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1177\/1609406919887021","article-title":"Natural Language Processing (NLP) in Qualitative Public Health Research: A Proof of Concept Study","volume":"18","author":"Leeson,","year":"2019","journal-title":"International Journal of Qualitative Methods"},{"key":"10.3233\/JID-220013_ref28","unstructured":"Liu, Y , Ott, M , Goyal, N , Du, J , Joshi, M , Chen, D , Levy, O , Lewis, M , Zettlemoyer, L , Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. 
1."},{"key":"10.3233\/JID-220013_ref30","first-page":"1744","volume":"iii","author":"Michalopoulos,","year":"2021","journal-title":"UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus"},{"issue":"1","key":"10.3233\/JID-220013_ref32","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1177\/0049124117729703","article-title":"Computational Grounded Theory: A Methodological Framework","volume":"49","author":"Nelson,","year":"2020","journal-title":"Sociological Methods and Research"},{"key":"10.3233\/JID-220013_ref33","first-page":"2227","article-title":"Deep contextualized word representations","volume":"1","author":"Peters,","year":"2018","journal-title":"NAACL HLT 2018 \u2013 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies \u2013 Proceedings of the Conference"},{"issue":"10","key":"10.3233\/JID-220013_ref34","doi-asserted-by":"crossref","first-page":"1872","DOI":"10.1007\/s11431-020-1647-3","article-title":"Pre-trained models for natural language processing: A survey","volume":"63","author":"Qiu,","year":"2020","journal-title":"Science China Technological Sciences"},{"key":"10.3233\/JID-220013_ref36","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1162\/tacl_a_00313","article-title":"Leveraging Pre-trained Checkpoints for Sequence Generation Tasks","volume":"8","author":"Rothe,","year":"2020","journal-title":"Transactions of the Association for Computational Linguistics"},{"issue":"1","key":"10.3233\/JID-220013_ref37","doi-asserted-by":"crossref","first-page":"75","DOI":"10.25139\/jsk.v4i1.2180","article-title":"Communication pattern between nurses and elderly patients through a neuro-linguistic programming approach","volume":"4","author":"Rustan,","year":"2020","journal-title":"Jurnal Studi Komunikasi (Indonesian Journal of Communications 
Studies)"},{"issue":"1","key":"10.3233\/JID-220013_ref38","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1177\/1754073919898526","article-title":"A Review on Five Recent and Near-Future Developments in Computational Processing of Emotion in the Human Voice","volume":"13","author":"Schuller,","year":"2021","journal-title":"Emotion Review"},{"key":"10.3233\/JID-220013_ref39","first-page":"0","article-title":"Machine Learning and Grounded Theory: New Opportunities for Mixed-Design Research","volume":"2020","author":"Singh,","year":"2020","journal-title":"Americas Conference on Information Systems (AMCIS)"},{"key":"10.3233\/JID-220013_ref40","doi-asserted-by":"crossref","unstructured":"Singh, S , Estefan, A. (2018) Selecting a Grounded Theory Approach for Nursing Research. Global Qualitative Nursing Research, 5. https:\/\/doi.org\/10.1177\/2333393618799571","DOI":"10.1177\/2333393618799571"},{"key":"10.3233\/JID-220013_ref41","doi-asserted-by":"crossref","unstructured":"Smith, J.G , Tissing, R. (2018) Using computational text classification for qualitative research and evaluation in extension. 
Journal of Extension, 56(2).","DOI":"10.34068\/joe.56.02.04"},{"issue":"10","key":"10.3233\/JID-220013_ref42","doi-asserted-by":"crossref","first-page":"1372","DOI":"10.1177\/1049732307307031","article-title":"Choose your method: A comparison of phenomenology, discourse analysis, and grounded theory","volume":"17","author":"Starks,","year":"2007","journal-title":"Qualitative Health Research"},{"key":"10.3233\/JID-220013_ref43","first-page":"1105","article-title":"Evaluation methods for topic models","volume":"4","author":"Wallach,","year":"2009","journal-title":"Proceedings of the 26th International Conference On Machine Learning, ICML 2009"},{"key":"10.3233\/JID-220013_ref45","first-page":"662","article-title":"Neural question generation from text: A preliminary study","volume":"10619 LNAI","author":"Zhou,","year":"2018","journal-title":"Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)"}],"container-title":["Journal of Integrated Design and Process Science"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/JID-220013","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T22:56:33Z","timestamp":1777503393000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/JID-220013"}},"subtitle":[],"editor":[{"given":"Varadraj P.","family":"Gurupur","sequence":"additional","affiliation":[]},{"given":"Thomas T.H.","family":"Wan","sequence":"additional","affiliation":[]},{"given":"Rama Raju","family":"Rudraraju","sequence":"additional","affiliation":[]},{"given":"Shrirang 
A.","family":"Kulkarni","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,10,17]]},"references-count":31,"journal-issue":{"issue":"1"},"URL":"https:\/\/doi.org\/10.3233\/jid-220013","relation":{},"ISSN":["1092-0617","1875-8959"],"issn-type":[{"value":"1092-0617","type":"print"},{"value":"1875-8959","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,17]]}}}