{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,22]],"date-time":"2025-07-22T11:09:07Z","timestamp":1753182547028,"version":"3.41.0"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2016,7,20]],"date-time":"2016-07-20T00:00:00Z","timestamp":1468972800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Interact. Intell. Syst."],"published-print":{"date-parts":[[2016,8,3]]},"abstract":"<jats:p>Statistical topic models have become a useful and ubiquitous tool for analyzing large text corpora. One common application of statistical topic models is to support topic-centric navigation and exploration of document collections. Existing work on topic modeling focuses on the inference of model parameters so the resulting model fits the input data. Since the exact inference is intractable, statistical inference methods, such as Gibbs Sampling, are commonly used to solve the problem. However, most of the existing work ignores an important aspect that is closely related to the end user experience: topic model stability. When the model is either re-trained with the same input data or updated with new documents, the topic previously assigned to a document may change under the new model, which may result in a disruption of end users\u2019 mental maps about the relations between documents and topics, thus undermining the usability of the applications. In this article, we propose a novel user-directed non-disruptive topic model update method that balances the tradeoff between finding the model that fits the data and maintaining the stability of the model from end users\u2019 perspective. 
It employs a novel constrained LDA algorithm to incorporate pairwise document constraints, which are converted from user feedback about topics, to achieve topic model stability. Evaluation results demonstrate the advantages of our approach over previous methods.<\/jats:p>","DOI":"10.1145\/2954002","type":"journal-article","created":{"date-parts":[[2016,7,21]],"date-time":"2016-07-21T15:13:24Z","timestamp":1469114004000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["The Stability and Usability of Statistical Topic Models"],"prefix":"10.1145","volume":"6","author":[{"given":"Yi","family":"Yang","sequence":"first","affiliation":[{"name":"University of Illinois at Urbana-Champaign, Champaign, IL"}]},{"given":"Shimei","family":"Pan","sequence":"additional","affiliation":[{"name":"University of Maryland, Baltimore County (UMBC), Baltimore MD"}]},{"given":"Jie","family":"Lu","sequence":"additional","affiliation":[{"name":"IBM T. J. Watson Research Center, Yorktown Heights, NY"}]},{"given":"Mercan","family":"Topkara","sequence":"additional","affiliation":[{"name":"JW Player, New York, NY"}]},{"given":"Yangqiu","family":"Song","sequence":"additional","affiliation":[{"name":"West Virginia University, Morgantown, WV"}]}],"member":"320","published-online":{"date-parts":[[2016,7,20]]},"reference":[{"doi-asserted-by":"publisher","key":"e_1_2_1_1_1","DOI":"10.1145\/1553374.1553378"},{"doi-asserted-by":"publisher","key":"e_1_2_1_2_1","DOI":"10.5555\/2283516.2283593"},{"doi-asserted-by":"publisher","key":"e_1_2_1_3_1","DOI":"10.1137\/1.9781611972771.40"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of Advances in Neural Information Processing Systems. 121--128","author":"Blei David","year":"2008","unstructured":"David Blei and Jon McAuliffe. 2008. 
Supervised topic models. In Proceedings of Advances in Neural Information Processing Systems. 121--128."},{"doi-asserted-by":"publisher","key":"e_1_2_1_5_1","DOI":"10.5555\/944919.944937"},{"key":"e_1_2_1_6_1","volume-title":"Griffiths","author":"Canini Kevin R.","year":"2009","unstructured":"Kevin R. Canini, Lei Shi, and Thomas L. Griffiths. 2009. Online inference of topics with latent Dirichlet allocation. In Proceedings of Artificial Intelligence and Statistics."},{"doi-asserted-by":"publisher","key":"e_1_2_1_7_1","DOI":"10.1109\/TVCG.2013.212"},{"doi-asserted-by":"publisher","key":"e_1_2_1_8_1","DOI":"10.1111\/j.0007-1013.2004.00390.x"},{"unstructured":"A. Dix J. Finlay G. Abowd and R. Beale. 1998. Human-Computer Interaction. Prentice Hall Upper Saddle River NY.","key":"e_1_2_1_9_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_10_1","DOI":"10.1073\/pnas.0307752101"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of Advances in Neural Information Processing Systems.","author":"Hoffman Matthew D.","year":"2010","unstructured":"Matthew D. Hoffman, David M. Blei, and Francis Bach. 2010. Online learning for latent Dirichlet allocation. In Proceedings of Advances in Neural Information Processing Systems."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the Association for Computational Linguistics. 248--257","author":"Hu Yuening","year":"2011","unstructured":"Yuening Hu, Jordan Boyd-Graber, and Brianna Satinoff. 2011. Interactive topic modeling. 
In Proceedings of the Association for Computational Linguistics. 248--257."},{"doi-asserted-by":"publisher","key":"e_1_2_1_13_1","DOI":"10.1145\/2557500.2557539"},{"doi-asserted-by":"publisher","key":"e_1_2_1_14_1","DOI":"10.1145\/2623330.2623756"},{"doi-asserted-by":"publisher","key":"e_1_2_1_15_1","DOI":"10.1145\/2089094.2089101"},{"doi-asserted-by":"publisher","key":"e_1_2_1_16_1","DOI":"10.1145\/1553374.1553460"},{"doi-asserted-by":"publisher","key":"e_1_2_1_17_1","DOI":"10.1145\/1961189.1961198"},{"doi-asserted-by":"publisher","key":"e_1_2_1_18_1","DOI":"10.1145\/1367497.1367512"},{"doi-asserted-by":"publisher","key":"e_1_2_1_19_1","DOI":"10.5555\/2145432.2145462"},{"doi-asserted-by":"publisher","key":"e_1_2_1_20_1","DOI":"10.1145\/1401890.1401957"},{"doi-asserted-by":"publisher","key":"e_1_2_1_21_1","DOI":"10.5555\/1577069.1755845"},{"volume-title":"Usability Engineering","author":"Nielsen J.","unstructured":"J. Nielsen. 1993. Usability Engineering. Academic Press, San Diego, CA.","key":"e_1_2_1_22_1"},{"key":"e_1_2_1_23_1","volume-title":"Norman and Jakob Nielsen","author":"Donald","year":"2013","unstructured":"Donald A. Norman and Jakob Nielsen. 2013. 10 Heuristics for User Interface Design. http:\/\/www.nngroup.com\/articles\/ten-usability-heuristics\/. (2013)."},{"doi-asserted-by":"publisher","key":"e_1_2_1_24_1","DOI":"10.1145\/2449396.2449441"},{"key":"e_1_2_1_25_1","volume-title":"Manning","author":"Ramage Daniel","year":"2009","unstructured":"Daniel Ramage, David Hall, Ramesh Nallapati, and Christopher D. Manning. 2009. 
Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora. In Proceedings of Empirical Methods in Natural Language Processing. 248--256."},{"doi-asserted-by":"publisher","key":"e_1_2_1_26_1","DOI":"10.1145\/1081870.1081925"},{"volume-title":"Constrained coclustering for textual documents","author":"Song Yangqiu","unstructured":"Yangqiu Song, Shimei Pan, Shixia Liu, Furu Wei, Michelle Zhou, and Weihong Qian. 2010. Constrained coclustering for textual documents. In Association for the Advancement of Artificial Intelligence.","key":"e_1_2_1_27_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_28_1","DOI":"10.1145\/1645953.1646223"},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the National Academy of Sciences.","author":"Stephen Elena Erosheva","year":"2004","unstructured":"Elena Erosheva Stephen, Stephen Fienberg, and John Lafferty. 2004. Mixed membership models of scientific publications. In Proceedings of the National Academy of Sciences. 2004."},{"doi-asserted-by":"publisher","key":"e_1_2_1_30_1","DOI":"10.1007\/BF00993473"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the International Conference of Machine Learning. 577--584","author":"Wagstaff Kiri","year":"2001","unstructured":"Kiri Wagstaff, Claire Cardie, Seth Rogers, and Stefan Schr\u00f6dl. 2001. 
Constrained k-means clustering with background knowledge. In Proceedings of the International Conference of Machine Learning. 577--584."},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of Advances in Neural Information Processing Systems. 1973--1981","author":"Wallach Hanna M.","year":"2009","unstructured":"Hanna M. Wallach, David M. Mimno, and Andrew McCallum. 2009. Rethinking LDA: Why priors matter. In Proceedings of Advances in Neural Information Processing Systems. 1973--1981."},{"key":"e_1_2_1_33_1","volume-title":"Peacock: Learning long-tail topic features for industrial applications.","author":"Wang Yi","year":"2014","unstructured":"Yi Wang, Xuemin Zhao, Zhenlong Sun, Hao Yan, Lifeng Wang, Zhihui Jin, Liubin Wang, Yang Gao, Ching Law, and Jia Zeng. 2014. Peacock: Learning long-tail topic features for industrial applications. (2014)."},{"doi-asserted-by":"publisher","key":"e_1_2_1_34_1","DOI":"10.18653\/v1\/D15-1037"},{"doi-asserted-by":"publisher","key":"e_1_2_1_35_1","DOI":"10.1145\/1557019.1557121"},{"unstructured":"Jinhui Yuan Fei Gao Qirong Ho Wei Dai Jinliang Wei Xun Zheng Eric P. Xing Tie-Yan Liu and Wei-Ying Ma. 2014. LightLDA: Big topic models on modest compute clusters. 
(2014).","key":"e_1_2_1_36_1"},{"volume-title":"Proceedings of the International Conference of Machine Learning. 561--569","author":"Zhai Ke","unstructured":"Ke Zhai and Jordan L. Boyd-Graber. 2013. Online latent Dirichlet allocation with infinite vocabulary. In Proceedings of the International Conference of Machine Learning. 561--569.","key":"e_1_2_1_37_1"}],"container-title":["ACM Transactions on Interactive Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2954002","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2954002","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:56:22Z","timestamp":1750222582000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2954002"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,7,20]]},"references-count":37,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2016,8,3]]}},"alternative-id":["10.1145\/2954002"],"URL":"https:\/\/doi.org\/10.1145\/2954002","relation":{},"ISSN":["2160-6455","2160-6463"],"issn-type":[{"type":"print","value":"2160-6455"},{"type":"electronic","value":"2160-6463"}],"subject":[],"published":{"date-parts":[[2016,7,20]]},"assertion":[{"value":"2015-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2016-07-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}