{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,28]],"date-time":"2025-10-28T15:10:17Z","timestamp":1761664217828,"version":"3.41.0"},"reference-count":54,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2021,1,31]],"date-time":"2021-01-31T00:00:00Z","timestamp":1612051200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2021,1,31]]},"abstract":"<jats:p>A topic model is one of the best stochastic models for summarizing an extensive collection of text. It has accomplished an inordinate achievement in text analysis as well as text summarization. It can be employed to the set of documents that are represented as a bag-of-words, without considering grammar and order of the words. We modeled the topics for Gujarati news articles corpus. As the Gujarati language has a diverse morphological structure and inflectionally rich, Gujarati text processing finds more complexity. The size of the vocabulary plays an important role in the inference process and quality of topics. As the vocabulary size increases, the inference process becomes slower and topic semantic coherence decreases. If the vocabulary size is diminished, then the topic inference process can be accelerated. It may also improve the quality of topics. In this work, the list of suffixes has been prepared that encounters too frequently with words in Gujarati text. The inflectional forms have been reduced to the root words concerning the suffixes in the list. Moreover, Gujarati single-letter words have been eliminated for faster inference and better quality of topics. 
Experimentally, it has been proved that if inflectional forms are reduced to their root words, then vocabulary length is shrunk to a significant extent. It also caused the topic formation process quicker. Moreover, the inflectional forms reduction and single-letter word removal enhanced the interpretability of topics. The interpretability of topics has been assessed on semantic coherence, word length, and topic size. The experimental results showed improvements in the topical semantic coherence score. Also, the topic size grew notably as the number of tokens assigned to the topics increased.<\/jats:p>","DOI":"10.1145\/3447760","type":"journal-article","created":{"date-parts":[[2021,3,10]],"date-time":"2021-03-10T13:13:05Z","timestamp":1615381985000},"page":"1-18","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Improving Semantic Coherence of Gujarati Text Topic Model Using Inflectional Forms Reduction and Single-letter Words Removal"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4373-2132","authenticated-orcid":false,"given":"Uttam","family":"Chauhan","sequence":"first","affiliation":[{"name":"Vishwakarma Government Engineering College, Gujarat, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Apurva","family":"Shah","sequence":"additional","affiliation":[{"name":"The Maharaja Sayajirao University of Baroda, Gujarat, India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,3,10]]},"reference":[{"volume-title":"Fienberg","year":"2014","author":"Airoldi Edoardo M.","key":"e_1_2_1_1_1"},{"volume-title":"Proceedings of the 10th International Conference on Computational Semantics (IWCS\u201913)","year":"2013","author":"Aletras Nikolaos","key":"e_1_2_1_2_1"},{"key":"e_1_2_1_3_1","unstructured":"Juhi Ameta Nisheeth Joshi and Iti Mathur. 2012. A lightweight stemmer for Gujarati. 
arXiv:1210.5486). Retrieved from https:\/\/arxiv.org\/abs\/1210.5486."},{"volume-title":"Proceedings of the International Conference on Language Resources and Evaluation (LREC\u201910)","author":"Aswani Niraj","key":"e_1_2_1_4_1"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2133806.2133826"},{"volume-title":"Proceedings of the 23rd International Conference on Machine Learning. ACM, 113--120","author":"David","key":"e_1_2_1_6_1"},{"volume-title":"Jordan","year":"2003","author":"Blei David M.","key":"e_1_2_1_7_1"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-011-9171-y"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2009.4959927"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0087555"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P15-1077"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9"},{"volume-title":"Arabic named entity recognition using topic modeling. Context 230","year":"2017","author":"Bazi Ismail El","key":"e_1_2_1_13_1"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1162\/089120101750300490"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1017\/S1351324905004055"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0307752101"},{"volume-title":"Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence. 
Morgan Kaufmann Publishers Inc., 289--296","year":"1999","author":"Hofmann Thomas","key":"e_1_2_1_17_1"},{"key":"e_1_2_1_18_1","first-page":"1","article-title":"Information retrieval from historical newspaper collections in highly inflectional languages: A query expansion approach","volume":"67","author":"J\u00e4rvelin Anni","year":"2015","journal-title":"J. Assoc. Inf. Sci. Technol."},{"volume-title":"Proceedings of the 2nd Workshop on South and Southeast Asian Natural Language Processing (WSSANLP\u201911)","year":"2011","author":"Dipti Jiandani Kartik Suba","key":"e_1_2_1_19_1"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2956235"},{"volume-title":"Proceedings of the International Conference on Language Resources and Evaluation (LREC\u201916)","year":"2016","author":"Kanojia Diptesh","key":"e_1_2_1_21_1"},{"volume-title":"Probabilistic Graphical Models: Principles and Techniques","author":"Koller Daphne","key":"e_1_2_1_22_1"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/E14-1056"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3091108"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2699939"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.22628"},{"key":"e_1_2_1_27_1","first-page":"304","article-title":"An LDA and synonym lexicon based approach to product feature extraction from online consumer product reviews","volume":"14","author":"Ma Baizhang","year":"2013","journal-title":"J. Electr. 
Commerce Res."},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 14th Australasian Database Conference","volume":"17","author":"Ma Liping","year":"2003"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242596"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/1699571.1699627"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-015-0020-5"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the IJCAI-99 Workshop on Machine Learning for Information Filtering","volume":"1","author":"Nigam Kamal","year":"1999"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-40087-2_4"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0103408"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.5121\/csit.2013.3408"},{"key":"e_1_2_1_36_1","first-page":"1921","article-title":"Word features for latent dirichlet allocation","volume":"1","author":"Petterson James","year":"2010","journal-title":"Adv. Neur. Inf. Process. Syst."},{"volume-title":"Proceedings of the 23rd International Conference on Computational Linguistics. 51","year":"2010","author":"Kashyap Popat Pratikkumar Patel","key":"e_1_2_1_37_1"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2016.03.004"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2016.06.040"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.5555\/1036843.1036902"},{"volume-title":"Fennig","year":"2017","author":"Simons Gary F.","key":"e_1_2_1_41_1"},{"key":"e_1_2_1_42_1","first-page":"1008","article-title":"Complexity of inference in latent dirichlet allocation","volume":"1","author":"Sontag David","year":"2011","journal-title":"Adv. Neur. Inf. Process. Syst."},{"key":"e_1_2_1_43_1","first-page":"424","article-title":"Probabilistic topic models","volume":"427","author":"Steyvers Mark","year":"2007","journal-title":"Handbook Latent Semant. 
Anal."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1038\/nmeth.1619"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2014.08.003"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143967"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553515"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2015.10.012"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2007.09.001"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2007.09.013"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1080\/13873950802431992"},{"volume-title":"Int. Joint Conf. Artif. Intell. 1","year":"2013","author":"Zhang Tao","key":"e_1_2_1_52_1"},{"volume-title":"Xing","year":"2008","author":"Zhao Bing","key":"e_1_2_1_53_1"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.5555\/3225641.3225823"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information 
Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3447760","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3447760","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:41:10Z","timestamp":1750200070000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3447760"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,31]]},"references-count":54,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1,31]]}},"alternative-id":["10.1145\/3447760"],"URL":"https:\/\/doi.org\/10.1145\/3447760","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2021,1,31]]},"assertion":[{"value":"2018-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-03-10","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}