{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,9,9]],"date-time":"2022-09-09T06:40:45Z","timestamp":1662705645000},"reference-count":28,"publisher":"MIT Press","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["TACL"],"published-print":{"date-parts":[[2016,12]]},"abstract":"<jats:p>We present Sparse Non-negative Matrix (SNM) estimation, a novel probability estimation technique for language modeling that can efficiently incorporate arbitrary features. We evaluate SNM language models on two corpora: the One Billion Word Benchmark and a subset of the LDC English Gigaword corpus. Results show that SNM language models trained with n-gram features are a close match for the well-established Kneser-Ney models. The addition of skip-gram features yields a model that is in the same league as the state-of-the-art recurrent neural network language models, as well as complementary: combining the two modeling techniques yields the best known result on the One Billion Word Benchmark. On the Gigaword corpus further improvements are observed using features that cross sentence boundaries. The computational advantages of SNM estimation over both maximum entropy and neural network estimation are probably its main strength, promising an approach that has large flexibility in combining arbitrary features and yet scales gracefully to large amounts of data.<\/jats:p>","DOI":"10.1162\/tacl_a_00102","type":"journal-article","created":{"date-parts":[[2018,12,28]],"date-time":"2018-12-28T15:44:07Z","timestamp":1546011847000},"page":"329-342","source":"Crossref","is-referenced-by-count":1,"title":["Sparse Non-negative Matrix Language Modeling"],"prefix":"10.1162","volume":"4","author":[{"given":"Joris","family":"Pelemans","sequence":"first","affiliation":[{"name":"Google Inc., ESAT, KU Leuven,"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Noam","family":"Shazeer","sequence":"additional","affiliation":[{"name":"Google Inc.,"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ciprian","family":"Chelba","sequence":"additional","affiliation":[{"name":"Google Inc.,"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","reference":[{"issue":"8","key":"p_1","doi-asserted-by":"crossref","first-page":"1279","DOI":"10.1109\/5.880084","volume":"88","author":"Bellegarda Jerome","year":"2000","journal-title":"Proceedings of the IEEE"},{"key":"p_2","first-page":"1137","volume":"3","author":"Bengio Yoshua","year":"2003","journal-title":"Journal of Machine Learning Research"},{"issue":"4","key":"p_4","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1006\/csla.2000.0147","volume":"14","author":"Chelba Ciprian","year":"2000","journal-title":"Computer Speech and Language"},{"key":"p_5","first-page":"2635","author":"Chelba Ciprian","year":"2014","journal-title":"Proceedings of Interspeech"},{"key":"p_7","first-page":"681","author":"Chen Stanley F.","year":"1998","journal-title":"Proceedings of ICASSP"},{"key":"p_8","first-page":"5411","author":"Chen Xie","year":"2015","journal-title":"Proceedings of ICASSP"},{"key":"p_9","first-page":"2121","volume":"12","author":"Duchi John","year":"2011","journal-title":"Journal of Machine Learning Research"},{"issue":"2","key":"p_10","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1207\/s15516709cog1402_1","volume":"14","author":"Elman Jeffrey L.","year":"1990","journal-title":"Cognitive Science"},{"key":"p_13","first-page":"561","author":"Goodman Joshua T.","year":"2001","journal-title":"Proceedings of ICASSP"},{"issue":"1","key":"p_14","first-page":"307","volume":"13","author":"Gutmann Michael","year":"2012","journal-title":"Journal of Machine Learning Research"},{"key":"p_15","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1006\/csla.1993.1007","volume":"2","author":"Huang Xuedong","year":"1993","journal-title":"Computer Speech and Language"},{"issue":"3","key":"p_16","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1109\/TASSP.1987.1165125","volume":"35","author":"Katz Slava M.","year":"1987","journal-title":"IEEE Transactions on Acoustics, Speech and Signal Processing"},{"key":"p_17","first-page":"1695","author":"Klakow Dietrich","year":"1998","journal-title":"Proceedings of ICSLP"},{"key":"p_18","first-page":"181","author":"Kneser Reinhard","year":"1995","journal-title":"Proceedings of ICASSP"},{"key":"p_21","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1109\/TASL.2012.2215599","volume":"21","author":"Le Hai-Son","year":"2013","journal-title":"IEEE Transactions on Audio, Speech & Language Processing"},{"key":"p_22","first-page":"489","author":"Chatalbashev Vassil","year":"2007","journal-title":"Proceedings of ICML"},{"key":"p_23","first-page":"196","author":"Mikolov Tom\u00e1\u0161","year":"2011","journal-title":"Proceedings of ASRU"},{"key":"p_25","first-page":"246","author":"Morin Frederic","year":"2005","journal-title":"Proceedings of AISTATS"},{"key":"p_26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1006\/csla.1994.1001","volume":"8","author":"Ney Hermann","year":"1994","journal-title":"Computer Speech and Language"},{"key":"p_27","first-page":"1145","author":"Pickhardt Rene","year":"2014","journal-title":"Proceedings of ACL"},{"key":"p_29","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1006\/csla.2000.0159","volume":"15","author":"Rosenfeld Ronald","year":"2001","journal-title":"Computer Speech and Language"},{"key":"p_30","first-page":"1215","author":"Schwenk Holger","year":"2004","journal-title":"Proceedings of ICSLP"},{"key":"p_31","first-page":"201","author":"Schwenk Holger","year":"2005","journal-title":"Proceedings of EMNLP"},{"key":"p_32","doi-asserted-by":"crossref","first-page":"492","DOI":"10.1016\/j.csl.2006.09.003","volume":"21","author":"Schwenk Holger","year":"2007","journal-title":"Computer Speech and Language"},{"key":"p_34","doi-asserted-by":"crossref","first-page":"194","DOI":"10.21437\/Interspeech.2012-65","author":"Sundermeyer Martin","year":"2012","journal-title":"Proceedings of Interspeech"},{"key":"p_35","doi-asserted-by":"publisher","DOI":"10.1162\/COLI_a_00107"},{"key":"p_36","first-page":"1128","author":"Xu Puyang","year":"2011","journal-title":"Proceedings of EMNLP"},{"key":"p_37","first-page":"5391","author":"Williams Will","year":"2015","journal-title":"Proceedings of ICASSP"}],"container-title":["Transactions of the Association for Computational Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/tacl_a_00102","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,9,9]],"date-time":"2022-09-09T06:18:40Z","timestamp":1662704320000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/43367"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,12]]},"references-count":28,"alternative-id":["10.1162\/tacl_a_00102"],"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00102","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,12]]}}}