{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,2]],"date-time":"2026-01-02T07:34:12Z","timestamp":1767339252703,"version":"3.40.5"},"reference-count":47,"publisher":"Cambridge University Press (CUP)","issue":"2","license":[{"start":{"date-parts":[[2022,2,4]],"date-time":"2022-02-04T00:00:00Z","timestamp":1643932800000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2023,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Large-scale transformer-based language models (LMs) demonstrate impressive capabilities in open-text generation. However, controlling the generated text\u2019s properties such as the topic, style, and sentiment is challenging and often requires significant changes to the model architecture or retraining and fine-tuning the model on new supervised data. This paper presents a novel approach for topical language generation (TLG) by combining a pre-trained LM with topic modeling information. We cast the problem using Bayesian probability formulation with topic probabilities as a prior, LM probabilities as the likelihood, and TLG probability as the posterior. In learning the model, we derive the topic probability distribution from the user-provided document\u2019s natural structure. Furthermore, we extend our model by introducing new parameters and functions to influence the quantity of the topical features presented in the generated text. This feature would allow us to easily control the topical properties of the generated text. Our experimental results demonstrate that our model outperforms the state-of-the-art results on coherency, diversity, and fluency while being faster in decoding.<\/jats:p>","DOI":"10.1017\/s1351324922000031","type":"journal-article","created":{"date-parts":[[2022,2,4]],"date-time":"2022-02-04T12:17:50Z","timestamp":1643977070000},"page":"337-359","update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":7,"title":["Topical language generation using transformers"],"prefix":"10.1017","volume":"29","author":[{"given":"Rohola","family":"Zandie","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8923-4660","authenticated-orcid":false,"given":"Mohammad H.","family":"Mahoor","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2022,2,4]]},"reference":[{"key":"S1351324922000031_ref4","doi-asserted-by":"crossref","unstructured":"Bowman, S.R. , Vilnis, L. , Vinyals, O. , Dai, A.M. , Jozefowicz, R. and Bengio, S. (2015). Generating sentences from a continuous space. In SIGNLL Conference on Computational Natural Language Learning (CONLL).","DOI":"10.18653\/v1\/K16-1002"},{"key":"S1351324922000031_ref19","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1152"},{"key":"S1351324922000031_ref26","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.428"},{"key":"S1351324922000031_ref40","doi-asserted-by":"publisher","DOI":"10.1007\/BF01016429"},{"key":"S1351324922000031_ref17","unstructured":"Hoffman, M. , Bach, F.R. and Blei, D.M. (2010). Online learning for latent dirichlet allocation. In Advances in Neural Information Processing Systems, pp. 856\u2013864."},{"key":"S1351324922000031_ref45","doi-asserted-by":"crossref","unstructured":"Yu, L. , Zhang, W. , Wang, J. and Yu, Y. (2017). Seqgan: Sequence generative adversarial nets with policy gradient. In Thirty-First AAAI Conference on Artificial Intelligence.","DOI":"10.1609\/aaai.v31i1.10804"},{"key":"S1351324922000031_ref1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1431"},{"key":"S1351324922000031_ref25","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1169"},{"key":"S1351324922000031_ref20","unstructured":"Hu, Z. , Yang, Z. , Liang, X. , Salakhutdinov, R. and Xing, E.P. (2017). Toward controlled generation of text. In Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR. org, pp. 1587\u20131596."},{"key":"S1351324922000031_ref30","unstructured":"Mikolov, T. , Sutskever, I. , Chen, K. , Corrado, G.S. and Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pp. 3111\u20133119."},{"key":"S1351324922000031_ref39","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W18-6321"},{"key":"S1351324922000031_ref8","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324913000375"},{"key":"S1351324922000031_ref35","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1410"},{"key":"S1351324922000031_ref46","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1138"},{"key":"S1351324922000031_ref47","unstructured":"Zhao, Y. , Bi, V.W. , Cai, D. , Liu, X. , Tu, K. and Shi, S. (2018). Language style transfer from non-parallel text with arbitrary styles. In International Conference on Learning Representations. rejected."},{"key":"S1351324922000031_ref24","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1033"},{"key":"S1351324922000031_ref33","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1080"},{"key":"S1351324922000031_ref34","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI Blog"},{"key":"S1351324922000031_ref5","unstructured":"Brown, T.B. , Mann, B. , Ryder, N. , Subbiah, M. , Kaplan, J. , Dhariwal, P. , Neelakantan, A. , Shyam, P. , Sastry, G. , Askell, A. , Agarwal, S. , Herbert-Voss, A. , Krueger, G. , Henighan, T. , Child, R. , Ramesh, A. , Ziegler, D. , Wu, J. , Winter, C. , Hesse, C. , Chen, M. , Sigler, E. , Litwin, M. , Gray, S. , Chess, B. , Clark, J. , Berner, C. , McCandlish, S. , Radford, A. , Sutskever, I. and Amodei, D. (2020). Language models are few-shot learners. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020)."},{"key":"S1351324922000031_ref23","unstructured":"Keskar, N.S. , McCann, B. , Varshney, L.R. , Xiong, C. and Socher, R. (2019). Ctrl: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858."},{"key":"S1351324922000031_ref16","doi-asserted-by":"publisher","DOI":"10.1137\/090771806"},{"key":"S1351324922000031_ref37","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1170"},{"key":"S1351324922000031_ref44","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1090"},{"key":"S1351324922000031_ref11","first-page":"23","article-title":"A new algorithm for data compression","volume":"12","author":"Gage","year":"1994","journal-title":"C Users Journal"},{"key":"S1351324922000031_ref36","doi-asserted-by":"publisher","DOI":"10.1145\/2684822.2685324"},{"key":"S1351324922000031_ref18","unstructured":"Holtzman, A. , Buys, J. , Du, L. , Forbes, M. and Choi, Y. (2020). The curious case of neural text degeneration. In International Conference on Learning Representations."},{"key":"S1351324922000031_ref12","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-4008"},{"key":"S1351324922000031_ref15","doi-asserted-by":"crossref","unstructured":"Guo, J. , Lu, S. , Cai, H. , Zhang, W. , Yu, Y. and Wang, J. (2018). Long text generation via adversarial training with leaked information. In Thirty-Second AAAI Conference on Artificial Intelligence.","DOI":"10.1609\/aaai.v32i1.11957"},{"key":"S1351324922000031_ref9","doi-asserted-by":"crossref","unstructured":"Dziri, N. , Kamalloo, E. , Mathewson, K.W. and Zaiane, O. (2018). Augmenting neural response generation with context-aware topical attention. In Proceedings of the First Workshop on NLP for Conversational AI.","DOI":"10.18653\/v1\/W19-4103"},{"key":"S1351324922000031_ref27","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-5609"},{"key":"S1351324922000031_ref29","unstructured":"Martins, A. and Astudillo, R. (2016). From softmax to sparsemax: A sparse model of attention and multi-label classification. In International Conference on Machine Learning, pp. 1614\u20131623."},{"key":"S1351324922000031_ref21","unstructured":"Huang, J. (2005). Maximum likelihood estimation of dirichlet distribution parameters. CMU Technique Report."},{"key":"S1351324922000031_ref2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2009.12.007"},{"key":"S1351324922000031_ref7","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9"},{"key":"S1351324922000031_ref10","doi-asserted-by":"crossref","unstructured":"Fu, Z. , Tan, X. , Peng, N. , Zhao, D. and Yan, R. (2018). Style transfer in text: Exploration and evaluation. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32.","DOI":"10.1609\/aaai.v32i1.11330"},{"key":"S1351324922000031_ref22","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939801"},{"key":"S1351324922000031_ref3","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"S1351324922000031_ref28","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-5609"},{"key":"S1351324922000031_ref38","unstructured":"Singh, A. and Palod, R. (2018). Sentiment transfer using seq2seq adversarial autoencoders. arXiv preprint arXiv:1804.04003."},{"key":"S1351324922000031_ref6","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1223"},{"key":"S1351324922000031_ref13","unstructured":"Goodfellow, I. (2016). Nips 2016 tutorial: Generative adversarial networks. arXiv preprint arXiv:1701.00160."},{"key":"S1351324922000031_ref32","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1250"},{"key":"S1351324922000031_ref14","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-3079"},{"key":"S1351324922000031_ref31","unstructured":"Mueller, J. , Gifford, D. and Jaakkola, T. (2017). Sequence to better sequence: Continuous revision of combinatorial structures. In International Conference on Machine Learning, pp. 2536\u20132544."},{"key":"S1351324922000031_ref41","unstructured":"Vaswani, A. , Shazeer, N. , Parmar, N. , Uszkoreit, J. , Jones, L. , Gomez, A.N. , Kaiser, \u0141. and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems, pp. 5998\u20136008."},{"key":"S1351324922000031_ref42","unstructured":"Welleck, S. , Kulikov, I. , Roller, S. , Dinan, E. , Cho, K. and Weston, J. (2020). Neural text generation with unlikelihood training. In International Conference on Learning Representations."},{"key":"S1351324922000031_ref43","doi-asserted-by":"crossref","unstructured":"Xing, C. , Wu, W. , Wu, Y. , Liu, J. , Huang, Y. , Zhou, M. and Ma, W.-Y. (2017). Topic aware neural response generation. In Thirty-First AAAI Conference on Artificial Intelligence.","DOI":"10.1609\/aaai.v31i1.10981"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324922000031","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,13]],"date-time":"2023-03-13T04:19:44Z","timestamp":1678681184000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324922000031\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,4]]},"references-count":47,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,3]]}},"alternative-id":["S1351324922000031"],"URL":"https:\/\/doi.org\/10.1017\/s1351324922000031","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"type":"print","value":"1351-3249"},{"type":"electronic","value":"1469-8110"}],"subject":[],"published":{"date-parts":[[2022,2,4]]},"assertion":[{"value":"\u00a9 The Author(s), 2022. Published by Cambridge University Press","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}}]}}