{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T12:22:44Z","timestamp":1767183764800,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":43,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,12,14]],"date-time":"2021-12-14T00:00:00Z","timestamp":1639440000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,12,14]]},"DOI":"10.1145\/3486622.3493952","type":"proceedings-article","created":{"date-parts":[[2022,4,14]],"date-time":"2022-04-14T01:18:53Z","timestamp":1649899133000},"page":"162-169","source":"Crossref","is-referenced-by-count":2,"title":["Improving Topic Modeling Performance through N-gram Removal"],"prefix":"10.1145","author":[{"given":"Mohamad","family":"Almgerbi","sequence":"first","affiliation":[{"name":"University of Florence, Italy"}]},{"given":"Andrea","family":"De Mauro","sequence":"additional","affiliation":[{"name":"University of Rome Tor Vergata, Italy"}]},{"given":"Adham","family":"Kahlawi","sequence":"additional","affiliation":[{"name":"University of Florence, Italy"}]},{"given":"Valentina","family":"Poggioni","sequence":"additional","affiliation":[{"name":"University of Perugia, Italy"}]}],"member":"320","published-online":{"date-parts":[[2022,4,13]]},"reference":[{"key":"e_1_3_2_1_1_1","first-page":"4102","article-title":"Review of data preprocessing techniques in data mining","volume":"12","author":"Alasadi A","year":"2017","unstructured":"Suad\u00a0 A Alasadi and Wesam\u00a0 S Bhaya . 2017 . Review of data preprocessing techniques in data mining . Journal of Engineering and Applied Sciences 12 , 16 (2017), 4102 \u2013 4107 . Suad\u00a0A Alasadi and Wesam\u00a0S Bhaya. 2017. 
Review of data preprocessing techniques in data mining. Journal of Engineering and Applied Sciences 12, 16 (2017), 4102\u20134107.","journal-title":"Journal of Engineering and Applied Sciences"},{"key":"e_1_3_2_1_2_1","unstructured":"Mohamad Almgerbi Andrea De\u00a0Mauro Adham Kahlawi and Valentina Poggioni. 2021. A Systematic Review of Data Analytics Job Requirements and Online-Courses. Journal of Computer Information Systems(2021) 1\u201313.  Mohamad Almgerbi Andrea De\u00a0Mauro Adham Kahlawi and Valentina Poggioni. 2021. A Systematic Review of Data Analytics Job Requirements and Online-Courses. Journal of Computer Information Systems(2021) 1\u201313."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1165"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cobeha.2019.01.020"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/s13278-019-0568-8"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-15-1420-3_89"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2914731"},{"key":"e_1_3_2_1_8_1","volume-title":"8th Symposium on Languages, Applications and Technologies (SLATE","author":"Ferreira Jo\u00e3o","year":"2019","unstructured":"Jo\u00e3o Ferreira , Hugo Gon\u00e7alo\u00a0Oliveira , and Ricardo Rodrigues . 2019 . Improving NLTK for processing Portuguese . In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. Jo\u00e3o Ferreira, Hugo Gon\u00e7alo\u00a0Oliveira, and Ricardo Rodrigues. 2019. Improving NLTK for processing Portuguese. In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). 
Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1186\/s41044-016-0014-0"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1257\/jel.20181020"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/IC4.2015.7375527"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-33-4673-4_27"},{"key":"e_1_3_2_1_13_1","volume-title":"Journal of Physics: Conference Series, Vol.\u00a01163","author":"Ignatenko Vera","year":"2019","unstructured":"Vera Ignatenko , Sergej Koltcov , Steffen Staab , and Zeyd Boukhers . 2019. Fractal approach for determining the optimal number of topics in the field of topic modeling .. In Journal of Physics: Conference Series, Vol.\u00a01163 . IOP Publishing , 01 2025 . Vera Ignatenko, Sergej Koltcov, Steffen Staab, and Zeyd Boukhers. 2019. Fractal approach for determining the optimal number of topics in the field of topic modeling.. In Journal of Physics: Conference Series, Vol.\u00a01163. IOP Publishing, 012025."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1111\/psj.12343"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-018-6894-4"},{"key":"e_1_3_2_1_16_1","unstructured":"Jashanjot Kaur and Preetpal\u00a0Kaur Buttar. 2018. STOPWORDS REMOVAL AND ITS ALGORITHMS BASED ON DIFFERENT METHODS.International Journal of Advanced Research in Computer Science 10 5(2018).  Jashanjot Kaur and Preetpal\u00a0Kaur Buttar. 2018. STOPWORDS REMOVAL AND ITS ALGORITHMS BASED ON DIFFERENT METHODS.International Journal of Advanced Research in Computer Science 10 5(2018)."},{"key":"e_1_3_2_1_17_1","first-page":"207","article-title":"A systematic review on stopword removal algorithms","volume":"4","author":"Kaur Jashanjot","year":"2018","unstructured":"Jashanjot Kaur and Preetpal\u00a0Kaur Buttar . 2018 . A systematic review on stopword removal algorithms . 
International Journal on Future Revolution in Computer Science & Communication Engineering 4 , 4 (2018), 207 \u2013 210 . Jashanjot Kaur and Preetpal\u00a0Kaur Buttar. 2018. A systematic review on stopword removal algorithms. International Journal on Future Revolution in Computer Science & Communication Engineering 4, 4 (2018), 207\u2013210.","journal-title":"International Journal on Future Revolution in Computer Science & Communication Engineering"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-1075"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.23919\/FRUCT.2017.8250181"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.3390\/make1010025"},{"key":"e_1_3_2_1_21_1","unstructured":"Vlad Krotov and Leiser Silva. 2018. Legality and ethics of web scraping. (2018).  Vlad Krotov and Leiser Silva. 2018. Legality and ethics of web scraping. (2018)."},{"volume-title":"Natural language processing recipes","author":"Kulkarni Akshay","key":"e_1_3_2_1_22_1","unstructured":"Akshay Kulkarni and Adarsha Shivananda . 2019. Deep learning for NLP . In Natural language processing recipes . Springer , 185\u2013227. Akshay Kulkarni and Adarsha Shivananda. 2019. Deep learning for NLP. In Natural language processing recipes. Springer, 185\u2013227."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/E14-1056"},{"key":"e_1_3_2_1_24_1","unstructured":"Baojun Ma Hua Yuan Yan Wan Yu Qian Nan Zhang and Qiongwei Ye. 2016. Public opinion analysis based on probabilistic topic modeling and deep learning. (2016).  Baojun Ma Hua Yuan Yan Wan Yu Qian Nan Zhang and Qiongwei Ye. 2016. Public opinion analysis based on probabilistic topic modeling and deep learning. 
(2016)."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.im.2018.05.003"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-57529-2_29"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1080\/08839514.2019.1661576"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.4018\/IJNCR.2020040102"},{"key":"e_1_3_2_1_29_1","unstructured":"Serhad Sarica and Jianxi Luo. 2020. Stopwords in technical language processing. arXiv preprint arXiv:2006.02633(2020).  Serhad Sarica and Jianxi Luo. 2020. Stopwords in technical language processing. arXiv preprint arXiv:2006.02633(2020)."},{"key":"e_1_3_2_1_30_1","volume-title":"Topic modeling, long texts and the best number of topics. Some Problems and solutions.Quality & Quantity 54, 4","author":"Sbalchiero Stefano","year":"2020","unstructured":"Stefano Sbalchiero and Maciej Eder . 2020. Topic modeling, long texts and the best number of topics. Some Problems and solutions.Quality & Quantity 54, 4 ( 2020 ). Stefano Sbalchiero and Maciej Eder. 2020. Topic modeling, long texts and the best number of topics. 
Some Problems and solutions.Quality & Quantity 54, 4 (2020)."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/E17-2069"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D17-1290"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/Eco-friendly.2016.7893249"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-017-0942-0"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.5555\/2390948.2391052"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2018.06.022"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICECA.2019.8822022"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2016.05.029"},{"key":"e_1_3_2_1_39_1","first-page":"257","article-title":"Optimization of Topic Recognition Model for News Texts Based on LDA","volume":"17","author":"Wang Hongbin","year":"2019","unstructured":"Hongbin Wang , Jianxiong Wang , Yafei Zhang , Meng Wang , and Cunli Mao . 2019 . Optimization of Topic Recognition Model for News Texts Based on LDA . J. Digit. Inf. Manag. 17 , 5 (2019), 257 . Hongbin Wang, Jianxiong Wang, Yafei Zhang, Meng Wang, and Cunli Mao. 2019. Optimization of Topic Recognition Model for News Texts Based on LDA.J. Digit. Inf. Manag. 17, 5 (2019), 257.","journal-title":"J. Digit. Inf. Manag."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148204"},{"key":"e_1_3_2_1_41_1","first-page":"1","article-title":"Tuning Latent Dirichlet Allocation parameters using ant colony optimization","volume":"10","author":"Yarnguy Thanakorn","year":"2018","unstructured":"Thanakorn Yarnguy and Wanida Kanarkard . 2018 . Tuning Latent Dirichlet Allocation parameters using ant colony optimization . Journal of Telecommunication, Electronic and Computer Engineering (JTEC) 10 , 1 - 9 (2018), 21\u201324. Thanakorn Yarnguy and Wanida Kanarkard. 2018. 
Tuning Latent Dirichlet Allocation parameters using ant colony optimization. Journal of Telecommunication, Electronic and Computer Engineering (JTEC) 10, 1-9(2018), 21\u201324.","journal-title":"Journal of Telecommunication, Electronic and Computer Engineering (JTEC)"},{"volume-title":"Encyclopedia of Big Data. Encyclopedia of Big Data(2020), 3\u20135","author":"Zhao Bo","key":"e_1_3_2_1_42_1","unstructured":"Bo Zhao . 2020. Encyclopedia of Big Data. Encyclopedia of Big Data(2020), 3\u20135 . Bo Zhao. 2020. Encyclopedia of Big Data. Encyclopedia of Big Data(2020), 3\u20135."},{"volume-title":"BMC bioinformatics, Vol.\u00a016","author":"Zhao Weizhong","key":"e_1_3_2_1_43_1","unstructured":"Weizhong Zhao , James\u00a0 J Chen , Roger Perkins , Zhichao Liu , Weigong Ge , Yijun Ding , and Wen Zou . 2015. A heuristic approach to determine an appropriate number of topics in topic modeling . In BMC bioinformatics, Vol.\u00a016 . Springer , 1\u201310. Weizhong Zhao, James\u00a0J Chen, Roger Perkins, Zhichao Liu, Weigong Ge, Yijun Ding, and Wen Zou. 2015. A heuristic approach to determine an appropriate number of topics in topic modeling. In BMC bioinformatics, Vol.\u00a016. 
Springer, 1\u201310."}],"event":{"name":"WI-IAT '21: IEEE\/WIC\/ACM International Conference on Web Intelligence","sponsor":["SIGAI ACM Special Interest Group on Artificial Intelligence"],"location":"ESSENDON VIC Australia","acronym":"WI-IAT '21"},"container-title":["IEEE\/WIC\/ACM International Conference on Web Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3486622.3493952","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3486622.3493952","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:12:05Z","timestamp":1750191125000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3486622.3493952"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,14]]},"references-count":43,"alternative-id":["10.1145\/3486622.3493952","10.1145\/3486622"],"URL":"https:\/\/doi.org\/10.1145\/3486622.3493952","relation":{},"subject":[],"published":{"date-parts":[[2021,12,14]]}}}