{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,15]],"date-time":"2026-05-15T23:18:29Z","timestamp":1778887109657,"version":"3.51.4"},"reference-count":53,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2016,7,10]],"date-time":"2016-07-10T00:00:00Z","timestamp":1468108800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2017,2]]},"abstract":"<jats:p>Qualitative studies, such as sociological research, opinion analysis and media studies, can benefit greatly from automated topic mining provided by topic models such as latent Dirichlet allocation (LDA). However, examples of qualitative studies that employ topic modelling as a tool are currently few and far between. In this work, we identify two important problems along the way to using topic models in qualitative studies: lack of a good quality metric that closely matches human judgement in understanding topics and the need to indicate specific subtopics that a specific qualitative study may be most interested in mining. For the first problem, we propose a new quality metric, tf-idf coherence, that reflects human judgement more accurately than regular coherence, and conduct an experiment to verify this claim. For the second problem, we propose an interval semi-supervised approach (ISLDA) where certain predefined sets of keywords (that define the topics researchers are interested in) are restricted to specific intervals of topic assignments. Our experiments show that ISLDA is better for topic extraction than LDA in terms of tf-idf coherence, number of topics identified to predefined keywords and topic stability. We also present a case study on a Russian LiveJournal dataset aimed at ethnicity discourse analysis.<\/jats:p>","DOI":"10.1177\/0165551515617393","type":"journal-article","created":{"date-parts":[[2015,12,11]],"date-time":"2015-12-11T21:29:52Z","timestamp":1449869392000},"page":"88-102","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":268,"title":["Topic modelling for qualitative studies"],"prefix":"10.1177","volume":"43","author":[{"given":"Sergey I.","family":"Nikolenko","sequence":"first","affiliation":[{"name":"National Research University Higher School of Economics and Steklov Mathematical Institute at St Petersburg, Russia"}]},{"given":"Sergei","family":"Koltcov","sequence":"additional","affiliation":[{"name":"National Research University Higher School of Economics, Russia"}]},{"given":"Olessia","family":"Koltsova","sequence":"additional","affiliation":[{"name":"National Research University Higher School of Economics, Russia"}]}],"member":"179","published-online":{"date-parts":[[2016,7,10]]},"reference":[{"key":"bibr1-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007617005950"},{"issue":"4","key":"bibr2-0165551515617393","first-page":"993","volume":"3","author":"Blei DM","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"bibr3-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0307752101"},{"key":"bibr4-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/2380718.2380750"},{"key":"bibr5-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150450"},{"key":"bibr6-0165551515617393","first-page":"859","volume-title":"SIAM International Conference on Data Mining (SDM09)","author":"Gohr A"},{"key":"bibr7-0165551515617393","first-page":"20","author":"Wang X","year":"2007","journal-title":"Advances in Neural Information Processing Systems"},{"key":"bibr8-0165551515617393","first-page":"349","volume-title":"Proceedings of the 11th annual international ACM\/IEEE joint conference on digital libraries (JCDL\u201911)","volume":"2011","author":"Pan CC"},{"key":"bibr9-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1201\/9781420059458.ch2"},{"key":"bibr10-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-87481-2_2"},{"key":"bibr11-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1177\/0165551514540565"},{"key":"bibr12-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1177\/0165551512457893"},{"key":"bibr13-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646076"},{"key":"bibr14-0165551515617393","first-page":"18","author":"Blei DM","year":"2006","journal-title":"Advances in Neural Information Processing Systems"},{"key":"bibr15-0165551515617393","volume-title":"Markov random field modelling in image analysis. Advances in pattern recognition","author":"Li SZ","year":"2009"},{"key":"bibr16-0165551515617393","first-page":"185","volume-title":"Proceedings of the 2008 NIPS conference","author":"Boyd-Graber JL","year":"2008"},{"key":"bibr17-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1214\/09-AOAS309"},{"key":"bibr18-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143859"},{"key":"bibr19-0165551515617393","first-page":"579","volume-title":"Proceedings of the 24th conference on uncertainty in artificial intelligence","author":"Wang C","year":"2008"},{"key":"bibr20-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1177\/0165551512473066"},{"key":"bibr21-0165551515617393","first-page":"22","author":"Blei DM","year":"2007","journal-title":"Advances in Neural Information Processing Systems"},{"key":"bibr22-0165551515617393","first-page":"487","volume-title":"Proceedings of the 20th conference on uncertainty in artificial intelligence","author":"Rosen-Zvi M","year":"2004"},{"key":"bibr23-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/1658377.1658381"},{"key":"bibr24-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1177\/0165551514538744"},{"key":"bibr25-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1198\/016214506000000302"},{"key":"bibr26-0165551515617393","first-page":"17","volume":"16","author":"Blei DM","year":"2003","journal-title":"Advances in Neural Information Processing Systems"},{"key":"bibr27-0165551515617393","first-page":"1385","volume":"17","author":"Teh YW","year":"2004","journal-title":"Advances in Neural Information Processing Systems"},{"key":"bibr28-0165551515617393","first-page":"1151","volume-title":"Proceedings of the 27th international conference on machine learning","author":"Williamson S","year":"2010"},{"key":"bibr29-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/2339530.2339549"},{"key":"bibr30-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553378"},{"key":"bibr31-0165551515617393","doi-asserted-by":"publisher","DOI":"10.3115\/1621829.1621835"},{"key":"bibr32-0165551515617393","unstructured":"Wayne XZ, Jing J, Hongfei Y, Xiaoming L. In: Hang L, Luis M (eds), Proceedings of the 2010 conference on empirical methods in natural language processing. Stroudsburg, PA: Association for Computational Linguistics, 2010, pp. 56\u201365."},{"key":"bibr33-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1109\/ICDMW.2011.125"},{"key":"bibr34-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2011.48"},{"key":"bibr35-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/1935826.1935932"},{"key":"bibr36-0165551515617393","first-page":"204","volume-title":"Proceedings of the 13th conference of the European Chapter of the Association for Computational Linguistics","author":"Jagarlamudi J","year":"2012"},{"key":"bibr37-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/2396761.2398556"},{"key":"bibr38-0165551515617393","first-page":"288","volume":"20","author":"Chang J","year":"2009","journal-title":"Advances in Neural Information Processing Systems"},{"key":"bibr39-0165551515617393","first-page":"27","volume-title":"Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence","author":"Asuncion A","year":"2000"},{"key":"bibr40-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553515"},{"key":"bibr41-0165551515617393","first-page":"227","volume-title":"Proceedings of the 2011 conference on empirical methods in natural language processing","author":"Mimno D","year":"2011"},{"key":"bibr42-0165551515617393","first-page":"262","volume-title":"Proceedings of the 2011 conference on empirical methods in natural language processing","author":"Mimno D","year":"2011"},{"key":"bibr43-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-04180-8_22"},{"key":"bibr44-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1177\/0165551514524678"},{"key":"bibr45-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(88)90021-0"},{"key":"bibr46-0165551515617393","volume-title":"Proceedings of the 31st international conference on machine learning","author":"Tang J","year":"2014"},{"key":"bibr47-0165551515617393","unstructured":"Wallach HM. Structured topic models for language. Thesis submitted for the degree of Doctor of Philosophy, University of Cambridge, 2008."},{"key":"bibr48-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1002\/1944-2866.POI331"},{"key":"bibr49-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1198\/016214503000000666"},{"key":"bibr50-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1145\/2615569.2615680"},{"key":"bibr51-0165551515617393","first-page":"519","volume-title":"Proceedings of the 18th international joint conference on artificial intelligence","author":"Ling CX","year":"2003"},{"key":"bibr52-0165551515617393","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010920819831"},{"key":"bibr53-0165551515617393","first-page":"496","volume":"24","author":"Newman D","year":"2011","journal-title":"Advances in Neural Information Processing Systems"}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551515617393","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0165551515617393","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551515617393","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:09:23Z","timestamp":1777504163000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0165551515617393"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,7,10]]},"references-count":53,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2017,2]]}},"alternative-id":["10.1177\/0165551515617393"],"URL":"https:\/\/doi.org\/10.1177\/0165551515617393","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,7,10]]}}}