{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,15]],"date-time":"2026-05-15T18:20:00Z","timestamp":1778869200402,"version":"3.51.4"},"reference-count":13,"publisher":"MIT Press - Journals","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["TACL"],"published-print":{"date-parts":[[2016,12]]},"abstract":"<jats:p> Rule-based stemmers such as the Porter stemmer are frequently used to preprocess English corpora for topic modeling. In this work, we train and evaluate topic models on a variety of corpora using several different stemming algorithms. We examine several different quantitative measures of the resulting models, including likelihood, coherence, model stability, and entropy. Despite their frequent use in topic modeling, we find that stemmers produce no meaningful improvement in likelihood and coherence and in fact can degrade topic stability. <\/jats:p>","DOI":"10.1162\/tacl_a_00099","type":"journal-article","created":{"date-parts":[[2018,12,28]],"date-time":"2018-12-28T15:44:07Z","timestamp":1546011847000},"page":"287-300","source":"Crossref","is-referenced-by-count":137,"title":["Comparing Apples to Apple: The Effects of Stemmers on Topic Models"],"prefix":"10.1162","volume":"4","author":[{"given":"Alexandra","family":"Schofield","sequence":"first","affiliation":[{"name":"Cornell University, Ithaca, NY 14853,"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Mimno","sequence":"additional","affiliation":[{"name":"Cornell University, Ithaca, NY 14853,"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","reference":[{"issue":"2","key":"p_1","first-page":"350","volume":"37","author":"Bhamidipati Narayan L","year":"2007","journal-title":"IEEE Transactions on"},{"key":"p_3","first-page":"993","volume":"3","author":"Blei David M","year":"2003","journal-title":"The Journal of Machine Learning Research"},{"issue":"7","key":"p_8","doi-asserted-by":"crossref","first-page":"2643","DOI":"10.1073\/pnas.1018067108","volume":"108","author":"Grimmer Justin","year":"2011","journal-title":"PNAS"},{"issue":"1","key":"p_10","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1002\/(SICI)1097-4571(199101)42:1<7::AID-ASI2>3.0.CO;2-P","volume":"42","author":"Harman Donna","year":"1991","journal-title":"Journal of the American Society for Information Science"},{"issue":"1","key":"p_11","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1080\/21670811.2015.1093271","volume":"4","author":"Jacobi Carina","year":"2016","journal-title":"Digital Journalism"},{"issue":"6","key":"p_12","first-page":"1930","volume":"2","author":"Jivani Anjali Ganesh","year":"2011","journal-title":"International Journal of Computer Technology and Applications"},{"issue":"6","key":"p_13","doi-asserted-by":"crossref","first-page":"750","DOI":"10.1016\/j.poetic.2013.08.005","volume":"41","author":"Jockers Matthew L","year":"2013","journal-title":"Poetics"},{"key":"p_19","first-page":"22","volume":"11","author":"Lovins Julie B","year":"1968","journal-title":"Mechanical Translation and Computational Linguistics"},{"issue":"4","key":"p_20","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1145\/1281485.1281489","volume":"25","author":"Majumder Prasenjit","year":"2007","journal-title":"ACM Transactions on Information Systems (TOIS)"},{"issue":"3","key":"p_26","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1145\/101306.101310","volume":"24","author":"Paice Chris D","year":"1990","journal-title":"ACM SIGIR Forum"},{"issue":"3","key":"p_27","doi-asserted-by":"crossref","first-page":"130","DOI":"10.1108\/eb046814","volume":"14","author":"Porter Martin F","year":"1980","journal-title":"Program"},{"issue":"3","key":"p_29","first-page":"165","volume":"4","author":"Ruba Rani SP","year":"2015","journal-title":"International Journal of Computer Science and Mobile Computing"},{"issue":"2","key":"p_31","doi-asserted-by":"crossref","first-page":"113","DOI":"10.3233\/KES-130267","volume":"17","author":"Stankov Ivan","year":"2013","journal-title":"International Journal of Knowledge-based and Intelligent Engineering Systems"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/tacl_a_00099","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:38:25Z","timestamp":1615585105000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/43370"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,12]]},"references-count":13,"alternative-id":["10.1162\/tacl_a_00099"],"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00099","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,12]]}}}