{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T22:58:51Z","timestamp":1767999531562,"version":"3.49.0"},"reference-count":40,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2022,6,21]],"date-time":"2022-06-21T00:00:00Z","timestamp":1655769600000},"content-version":"vor","delay-in-days":171,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,6,17]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Researchers in the social sciences are often interested in the relationship between text and an outcome of interest, where the goal is to both uncover latent patterns in the text and predict outcomes for unseen texts. To this end, this paper develops the heterogeneous supervised topic model (HSTM), a probabilistic approach to text analysis and prediction. HSTMs posit a joint model of text and outcomes to find heterogeneous patterns that help with both text analysis and prediction. The main benefit of HSTMs is that they capture heterogeneity in the relationship between text and the outcome across latent topics. To fit HSTMs, we develop a variational inference algorithm based on the auto-encoding variational Bayes framework. We study the performance of HSTMs on eight datasets and find that they consistently outperform related methods, including fine-tuned black-box models. Finally, we apply HSTMs to analyze news articles labeled with pro- or anti-tone. 
We find evidence of differing language used to signal a pro- and anti-tone.<\/jats:p>","DOI":"10.1162\/tacl_a_00487","type":"journal-article","created":{"date-parts":[[2022,6,21]],"date-time":"2022-06-21T18:09:04Z","timestamp":1655834944000},"page":"732-745","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":10,"title":["Heterogeneous Supervised Topic Models"],"prefix":"10.1162","volume":"10","author":[{"given":"Dhanya","family":"Sridhar","sequence":"first","affiliation":[{"name":"Universit\u00e9 de Montr\u00e9al and Mila-Quebec AI Institute, Canada. dhanya.sridhar@mila.quebec"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"suffix":"III","given":"Hal","family":"Daum\u00e9","sequence":"additional","affiliation":[{"name":"University of Maryland and Microsoft Research, USA. hal3@umd.edu"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Blei","sequence":"additional","affiliation":[{"name":"Columbia University, USA. 
david.blei@columbia.edu"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","published-online":{"date-parts":[[2022,6,17]]},"reference":[{"key":"2022062118084949900_bib1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1034","article-title":"Deep Dirichlet multinomial regression","volume-title":"Proceedings of NAACL-HLT","author":"Benton","year":"2018"},{"key":"2022062118084949900_bib2","doi-asserted-by":"crossref","DOI":"10.1145\/1143844.1143859","article-title":"Dynamic topic models","volume-title":"Proceedings of ICML","author":"Blei","year":"2006"},{"issue":"Jan","key":"2022062118084949900_bib3","first-page":"993","article-title":"Latent Dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"2022062118084949900_bib4","article-title":"Understanding disentangling in \u03b2-vae","author":"Burgess","year":"2018","journal-title":"arXiv preprint arXiv:1804.03599"},{"key":"2022062118084949900_bib5","doi-asserted-by":"crossref","DOI":"10.1609\/aaai.v29i1.9499","article-title":"A novel neural topic model and its supervised extension","volume-title":"Proceedings of AAAI","author":"Cao","year":"2015"},{"key":"2022062118084949900_bib6","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-2072","article-title":"The media frames corpus: Annotations of frames across issues","volume-title":"Proceedings of ACL","author":"Card","year":"2015"},{"key":"2022062118084949900_bib7","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1189","article-title":"Neural models for documents with metadata","volume-title":"Proceedings of ACL","author":"Card","year":"2017"},{"key":"2022062118084949900_bib8","article-title":"Reading tea leaves: How humans interpret topic models","volume-title":"Proceedings of NeurIPS","author":"Chang","year":"2009"},{"key":"2022062118084949900_bib9","doi-asserted-by":"crossref","DOI":"10.3115\/v1\/P15-1077","article-title":"Gaussian LDA for topic models with 
word embeddings","volume-title":"Proceedings of ACL","author":"Das","year":"2015"},{"key":"2022062118084949900_bib10","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1304","article-title":"Analyzing polarization in social media: Method and application to tweets on 21 mass shootings","volume-title":"Proceedings of NAACL-HLT","author":"Demszky","year":"2019"},{"key":"2022062118084949900_bib11","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of NAACL-HLT","author":"Devlin","year":"2018"},{"key":"2022062118084949900_bib12","doi-asserted-by":"publisher","first-page":"439","DOI":"10.1162\/tacl_a_00325","article-title":"Topic modeling in embedding spaces","volume":"8","author":"Dieng","year":"2020","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2022062118084949900_bib13","article-title":"Sparse additive generative models of text","volume-title":"Proceedings of ICML","author":"Eisenstein","year":"2011"},{"key":"2022062118084949900_bib14","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1407","article-title":"Pathologies of neural models make interpretations difficult","volume-title":"Proceedings of EMNLP","author":"Feng","year":"2018"},{"issue":"3","key":"2022062118084949900_bib15","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1093\/pan\/mps028","article-title":"Text as data: The promise and pitfalls of automatic content analysis methods for political texts","volume":"21","author":"Grimmer","year":"2013","journal-title":"Political Analysis"},{"key":"2022062118084949900_bib16","doi-asserted-by":"crossref","DOI":"10.1145\/3097983.3098074","article-title":"Efficient correlated topic modeling with topic embedding","volume-title":"Proceedings of 
KDD","author":"He","year":"2017"},{"issue":"8","key":"2022062118084949900_bib17","doi-asserted-by":"publisher","first-page":"1771","DOI":"10.1162\/089976602760128018","article-title":"Training products of experts by minimizing contrastive divergence","volume":"14","author":"Hinton","year":"2002","journal-title":"Neural Computation"},{"key":"2022062118084949900_bib18","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.386","article-title":"Towards faithfully interpretable nlp systems: How should we define and evaluate faithfulness?","volume-title":"Proceedings of ACL","author":"Jacovi","year":"2020"},{"key":"2022062118084949900_bib19","article-title":"Attention is not explanation","volume-title":"Proceedings of NAACL-HLT","author":"Jain","year":"2019"},{"key":"2022062118084949900_bib20","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1149","article-title":"A dataset of peer reviews (peerread): Collection, insights and nlp applications","volume-title":"Proceedings of NAACL-HLT","author":"Kang","year":"2018"},{"key":"2022062118084949900_bib21","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1007\/978-3-030-28954-6_14","article-title":"The (un) reliability of saliency methods","volume-title":"Explainable AI: Interpreting, Explaining and Visualizing Deep Learning","author":"Kindermans","year":"2019"},{"key":"2022062118084949900_bib22","article-title":"Adam: A method for stochastic optimization","volume-title":"Proceedings of ICLR","author":"Kingma","year":"2015"},{"key":"2022062118084949900_bib23","article-title":"Auto-encoding variational Bayes","author":"Kingma","year":"2013","journal-title":"arXiv preprint arXiv:1312.6114"},{"key":"2022062118084949900_bib24","doi-asserted-by":"publisher","DOI":"10.4135\/9781071878781","volume-title":"Content Analysis: An Introduction to its Methodology","author":"Krippendorff","year":"2018"},{"key":"2022062118084949900_bib25","article-title":"DiscLDA: Discriminative learning for dimensionality 
reduction and classification","volume-title":"Proceedings of NeurIPS","author":"Lacoste-Julien","year":"2008"},{"key":"2022062118084949900_bib26","article-title":"A neural autoregressive topic model","volume-title":"Proceedings of NeurIPS","author":"Larochelle","year":"2012"},{"key":"2022062118084949900_bib27","article-title":"Topically driven neural language model","author":"Lau","year":"2017","journal-title":"Proceedings of ACL"},{"key":"2022062118084949900_bib28","article-title":"Supervised topic models","volume-title":"Proceedings of NeurIPS","author":"McAuliffe","year":"2008"},{"key":"2022062118084949900_bib29","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4899-3242-6","volume-title":"Generalized Linear Models","author":"McCullough","year":"1989"},{"key":"2022062118084949900_bib30","article-title":"Neural variational inference for text processing","volume-title":"Proceedings of ICML","author":"Miao","year":"2016"},{"key":"2022062118084949900_bib31","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1139","article-title":"Tea party in the house: A hierarchical ideal point topic model and its application to republican legislators in the 112th Congress","volume-title":"Proceedings of ACL","author":"Nguyen","year":"2015"},{"key":"2022062118084949900_bib32","article-title":"Lexical and hierarchical topic regression","volume-title":"Proceedings of NeurIPS","author":"Nguyen","year":"2013"},{"key":"2022062118084949900_bib33","doi-asserted-by":"publisher","DOI":"10.3115\/1699510.1699543","article-title":"Labeled lda: A supervised topic model for credit attribution in multi-labeled corpora","volume-title":"Proceedings of EMNLP","author":"Ramage","year":"2009"},{"issue":"4","key":"2022062118084949900_bib34","doi-asserted-by":"publisher","first-page":"1064","DOI":"10.1111\/ajps.12103","article-title":"Structural topic models for open-ended survey responses","volume":"58","author":"Roberts","year":"2014","journal-title":"American Journal of Political 
Science"},{"key":"2022062118084949900_bib35","article-title":"The author-topic model for authors and documents","author":"Rosen-Zvi","year":"2012","journal-title":"Proceedings of UAI"},{"issue":"5","key":"2022062118084949900_bib36","doi-asserted-by":"publisher","first-page":"206","DOI":"10.1038\/s42256-019-0048-x","article-title":"Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead","volume":"1","author":"Rudin","year":"2019","journal-title":"Nature Machine Intelligence"},{"key":"2022062118084949900_bib37","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/P19-1282","article-title":"Is attention interpretable?","volume-title":"Proceedings of ACL","author":"Serrano","year":"2019"},{"key":"2022062118084949900_bib38","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1282","article-title":"Autoencoding variational inference for topic models","author":"Srivastava","year":"2017","journal-title":"Proceedings of ICLR"},{"issue":"503","key":"2022062118084949900_bib39","doi-asserted-by":"publisher","first-page":"755","DOI":"10.1080\/01621459.2012.734168","article-title":"Multinomial inverse regression for text analysis","volume":"108","author":"Taddy","year":"2013","journal-title":"Journal of the American Statistical Association"},{"key":"2022062118084949900_bib40","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.acl-main.475","article-title":"Text-based ideal points","volume-title":"Proceedings of ACL","author":"Vafa","year":"2020"}],"container-title":["Transactions of the Association for Computational 
Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00487\/2030688\/tacl_a_00487.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00487\/2030688\/tacl_a_00487.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,8]],"date-time":"2023-02-08T23:21:54Z","timestamp":1675898514000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00487\/111727\/Heterogeneous-Supervised-Topic-Models"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022]]},"references-count":40,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00487","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022]]},"published":{"date-parts":[[2022]]}}}