{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T06:07:50Z","timestamp":1772604470333,"version":"3.50.1"},"reference-count":66,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T00:00:00Z","timestamp":1672531200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000051","name":"National Human Genome Research Institute","doi-asserted-by":"publisher","award":["K01HG011341"],"award-info":[{"award-number":["K01HG011341"]}],"id":[{"id":"10.13039\/100000051","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000069","name":"National Institute of Arthritis and Musculoskeletal and Skin Diseases","doi-asserted-by":"publisher","award":["K24AR075060"],"award-info":[{"award-number":["K24AR075060"]}],"id":[{"id":"10.13039\/100000069","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"publisher","award":["KL2TR001083"],"award-info":[{"award-number":["KL2TR001083"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"publisher","award":["UL1TR001085"],"award-info":[{"award-number":["UL1TR001085"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"publisher","award":["UL1TR003142"],"award-info":[{"award-number":["UL1TR003142"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000054","name":"National Cancer 
Institute","doi-asserted-by":"publisher","award":["DP2CA225433"],"award-info":[{"award-number":["DP2CA225433"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Big Data &amp; Society"],"published-print":{"date-parts":[[2023,1]]},"abstract":"<jats:p> Differences between computationally generated and human-generated themes in unstructured text are important to understand yet difficult to assess formally. In this study, we bridge these approaches through two contributions. First, we formally compare a primarily computational approach, topic modeling, to a primarily human-driven approach, qualitative thematic coding, in an impactful context: physician mothers\u2019 experience of workplace discrimination. Second, we compare our chosen topic model to a principled alternative topic model to make explicit study design decisions meriting consideration in future research. By formally contrasting computationally generated (i.e. topic modeling) and human-generated (i.e. thematic coding) knowledge, we shed light on issues of interest to several audiences, notably computational social scientists who wish to understand study design tradeoffs, and qualitative researchers who may wish to leverage computational methods to improve the speed and reproducibility of labor-intensive coding. Although useful in other domains, we highlight the value of fast, reproducible methods to better understand experiences of workplace discrimination. 
<\/jats:p>","DOI":"10.1177\/20539517221149106","type":"journal-article","created":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T05:51:40Z","timestamp":1674625900000},"update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":20,"title":["Formally comparing topic models and human-generated qualitative coding of physician mothers\u2019 experiences of workplace discrimination"],"prefix":"10.1177","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5125-4735","authenticated-orcid":false,"given":"Adam S","family":"Miner","sequence":"first","affiliation":[{"name":"Department of Psychiatry and Behavioral Sciences, Stanford University, Palo Alto, California, USA"},{"name":"Department of Epidemiology and Population Health, Stanford University, Palo Alto, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5789-8390","authenticated-orcid":false,"given":"Sheridan A","family":"Stewart","sequence":"additional","affiliation":[{"name":"Department of Sociology, Stanford University, Stanford, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5031-9840","authenticated-orcid":false,"given":"Meghan C","family":"Halley","sequence":"additional","affiliation":[{"name":"Center for Biomedical Ethics, Stanford University, Stanford, California, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8948-300X","authenticated-orcid":false,"given":"Laura K","family":"Nelson","sequence":"additional","affiliation":[{"name":"Department of Sociology, University of British Columbia, Vancouver, British Columbia, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5856-6301","authenticated-orcid":false,"given":"Eleni","family":"Linos","sequence":"additional","affiliation":[{"name":"Department of Epidemiology and Population Health, Stanford University, Palo Alto, California, USA"},{"name":"Department of Dermatology, Stanford University, Stanford, California, 
USA"}]}],"member":"179","published-online":{"date-parts":[[2023,1,24]]},"reference":[{"key":"bibr1-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1001\/jamainternmed.2017.1394"},{"key":"bibr2-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1353\/sof.0.0252"},{"key":"bibr3-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1016\/j.cmpb.2015.10.014"},{"key":"bibr4-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1002\/asi.23786"},{"key":"bibr5-20539517221149106","volume-title":"Natural Language Processing with Python","author":"Bird S","year":"2009"},{"key":"bibr6-20539517221149106","unstructured":"Bischof J, Airoldi EM (2012) Summarizing topical content with word frequency and exclusivity. In: Proceedings of the 29th International Conference on Machine Learning, pp.201\u2013208."},{"key":"bibr7-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1145\/2133806.2133826"},{"key":"bibr8-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143859"},{"key":"bibr9-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1214\/07-AOAS114"},{"key":"bibr10-20539517221149106","first-page":"993","volume":"3","author":"Blei DM","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"bibr11-20539517221149106","first-page":"225","volume-title":"Handbook of Mixed Membership Models and Their Applications","author":"Boyd-Graber J","year":"2014"},{"key":"bibr12-20539517221149106","first-page":"288","volume":"22","author":"Chang J","year":"2009","journal-title":"Advances in Neural Information Processing 
Systems"},{"key":"bibr13-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1086\/511799"},{"key":"bibr14-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1057\/9781137533296"},{"key":"bibr15-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1017\/pan.2017.44"},{"key":"bibr16-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1177\/2053951715602908"},{"key":"bibr17-20539517221149106","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v025.i05"},{"key":"bibr18-20539517221149106","unstructured":"Friedman D (2022) topicdoc: Topic-specific diagnostics for LDA and CTM topic models. https:\/\/cran.r-project.org\/web\/packages\/topicdoc\/index.html."},{"key":"bibr19-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1109\/JBHI.2015.2503985"},{"key":"bibr20-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1126\/sciadv.aaq1360"},{"key":"bibr21-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-019-0112-6"},{"key":"bibr22-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1093\/pan\/mps028"},{"key":"bibr23-20539517221149106","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v040.i13"},{"key":"bibr24-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1136\/bmj.k4926"},{"key":"bibr25-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-020-2649-2"},{"key":"bibr26-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1177\/1094428120971683"},{"key":"bibr27-20539517221149106","unstructured":"Honnibal M, Montani I, Van Landeghem S, et al. (2020) spaCy: Industrial-strength natural language processing in Python. 
https:\/\/spacy.io\/."},{"key":"bibr28-20539517221149106","first-page":"1","volume":"34","author":"Hoyle A","year":"2021","journal-title":"Advances in Neural Information Processing Systems"},{"key":"bibr29-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1177\/20539517211020332"},{"key":"bibr30-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1001\/jama.2015.10680"},{"key":"bibr31-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1001\/jamainternmed.2016.3284"},{"key":"bibr32-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1109\/IISA.2016.7785373"},{"key":"bibr33-20539517221149106","doi-asserted-by":"crossref","unstructured":"Lau JH, Newman D, Baldwin T (2014) Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pp.530\u2013539.","DOI":"10.3115\/v1\/E14-1056"},{"key":"bibr34-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1057\/ajcs.2014.13"},{"key":"bibr35-20539517221149106","unstructured":"Lehman LW, Saeed M, Long W, et al. (2012) Risk stratification of ICU patients using topic models inferred from unstructured progress notes. In: AMIA Annual Symposium Proceedings 2012, pp.505\u2013511."},{"key":"bibr36-20539517221149106","doi-asserted-by":"publisher","DOI":"10.3389\/fpsyg.2021.712111"},{"key":"bibr37-20539517221149106","unstructured":"May C, Cotterell R, Van Durme B (2019) An analysis of lemmatization on topic models of morphologically rich language. arXiv:1608.03995v2."},{"issue":"28","key":"bibr38-20539517221149106","first-page":"1","volume":"5","author":"Melville S","year":"2019","journal-title":"Frontiers in Digital Humanities"},{"key":"bibr39-20539517221149106","unstructured":"Daniel F, Microsoft Corporation, Weston S, Tenenbaum D (2022) doParallel: Foreach parallel adaptor for the \u2018parallel\u2019 package. 
https:\/\/cran.r-project.org\/web\/packages\/doParallel\/index.html."},{"key":"bibr40-20539517221149106","unstructured":"Mimno D, Wallach HM, Talley E, et al. (2011) Optimizing semantic coherence in topic models. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pp.262\u2013272."},{"key":"bibr41-20539517221149106","doi-asserted-by":"publisher","DOI":"10.3102\/0002831219860511"},{"key":"bibr42-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1177\/0049124117729703"},{"key":"bibr43-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1177\/0049124118769114"},{"key":"bibr44-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0103408"},{"issue":"85","key":"bibr45-20539517221149106","first-page":"2825","volume":"12","author":"Pedregosa F","year":"2011","journal-title":"Journal of Machine Learning Research"},{"key":"bibr46-20539517221149106","doi-asserted-by":"publisher","DOI":"10.3115\/1699510.1699543"},{"key":"bibr47-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020481"},{"key":"bibr48-20539517221149106","unstructured":"\u0158eh\u016f\u0159ek R, Sojka P (2010) Software framework for topic modelling with large corpora. 
In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pp.45\u201350."},{"issue":"1","key":"bibr49-20539517221149106","volume":"2","author":"Rhody LM","year":"2012","journal-title":"Journal of Digital Humanities"},{"key":"bibr50-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780199755776.001.0001"},{"key":"bibr51-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781316257340.004"},{"key":"bibr52-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1111\/ajps.12103"},{"key":"bibr53-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1145\/2684822.2685324"},{"key":"bibr54-20539517221149106","first-page":"33","volume":"20","author":"Sahlgren M","year":"2008","journal-title":"Italian Journal of Linguistics"},{"key":"bibr55-20539517221149106","unstructured":"Schiebinger L (2021) Analyzing Research Priorities and Potential Outcomes, Gendered Innovations. Available at: https:\/\/genderedinnovations.stanford.edu\/methods\/priorities.html (accessed 5 October 2021)."},{"key":"bibr56-20539517221149106","unstructured":"Schofield A, Magnusson M, Thompson L, et al. (2017) Understanding text pre-processing for latent Dirichlet allocation. ACL Workshop for Women in NLP."},{"key":"bibr57-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00099"},{"key":"bibr58-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1177\/20539517211021437"},{"key":"bibr59-20539517221149106","doi-asserted-by":"crossref","unstructured":"Sievert C, Shirley KE (2014) LDAvis: A method for visualizing and interpreting topics. 
In: Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, pp.63\u201370.","DOI":"10.3115\/v1\/W14-3110"},{"key":"bibr60-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-019-1657-6"},{"key":"bibr61-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1198\/tast.2009.08210"},{"key":"bibr62-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-019-0686-2"},{"key":"bibr63-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pbio.2006930"},{"key":"bibr64-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1177\/17427665211023984"},{"key":"bibr65-20539517221149106","doi-asserted-by":"publisher","DOI":"10.1177\/0003122417712729"},{"key":"bibr66-20539517221149106","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2021\/638"}],"container-title":["Big Data &amp; Society"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/20539517221149106","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/20539517221149106","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/20539517221149106","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,2]],"date-time":"2025-03-02T23:00:14Z","timestamp":1740956414000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/20539517221149106"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1]]},"references-count":66,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,1]]}},"alternative-id":["10.1177\/20539517221149106"],"URL":"https:\/\/doi.org\/10.1177\/20539517221149106","relation":{},"ISSN":["2053-9517","2053-9517"],"issn-type":[{"va
lue":"2053-9517","type":"print"},{"value":"2053-9517","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1]]},"article-number":"20539517221149106"}}