{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T14:22:48Z","timestamp":1753885368847,"version":"3.41.2"},"reference-count":20,"publisher":"World Scientific Pub Co Pte Ltd","issue":"04","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Info. Know. Mgmt."],"published-print":{"date-parts":[[2023,8]]},"abstract":"<jats:p> In this era, news is not only generated continuously with high speed but also growing in its amount by different web sources like talent hunt, news agencies, and so on. To predict the exact class of news depending on its topic, GepH (Grouped entity predictor for Hindi) is proposed using entity extraction and grouping. Entity extraction is popular for English corpus. Hindi is a national language due to its resource scarceness not being explored so much by researchers. More than 1,270 news are processed to apply entity extraction, clustering, and classification using the vector space model for Hindi (VSMH), Synset vector space model for Hindi (SVSMH), and grouped entity document matrix for Hindi (GEDMH). Synset-based dimension reduction techniques are used to get improved accuracy. Evaluation of HAC using three matrices shows the best performance of GEDMH for varied datasets. Thus labelled corpus obtained after applying HAC (Hierarchical agglomerative clustering) to GEDMH is used as a training dataset and predictions are done using random forest and Na\u00efve Bayes. The Na\u00efve Bayes classifier implemented using the proposed GEDMH performs the best. GepH shows 0.8 purity, 0.4 entropy, and 0.3 as error rate for 1,273 Hindi news. <\/jats:p>","DOI":"10.1142\/s0219649223500168","type":"journal-article","created":{"date-parts":[[2023,4,3]],"date-time":"2023-04-03T16:17:57Z","timestamp":1680538677000},"source":"Crossref","is-referenced-by-count":0,"title":["GepH: Entity Predictor for Hindi News"],"prefix":"10.1142","volume":"22","author":[{"given":"Prafulla B.","family":"Bafna","sequence":"first","affiliation":[{"name":"Symbiosis Institute of Computer Studies and Research, Symbiosis International (Deemed) University, Pune, India"}]}],"member":"219","published-online":{"date-parts":[[2023,3,31]]},"reference":[{"key":"S0219649223500168BIB001","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4614-3223-4","volume-title":"Mining Text Data","author":"Aggarwal CC","year":"2012"},{"key":"S0219649223500168BIB002","first-page":"385","volume-title":"Advances in Information Communication Technology and Computing","author":"Avasthi S","year":"2020"},{"key":"S0219649223500168BIB003","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1109\/ICEEOT.2016.7754750","volume-title":"2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT)","author":"Bafna PB","year":"2016"},{"issue":"4","key":"S0219649223500168BIB004","doi-asserted-by":"crossref","first-page":"2020","DOI":"10.14569\/IJACSA.2020.0110419","volume":"11","author":"Bafna PB","year":"2020","journal-title":"International Journal of Advanced Computer Science and Applications"},{"issue":"2","key":"S0219649223500168BIB005","doi-asserted-by":"crossref","first-page":"81","DOI":"10.14569\/IJACSA.2020.0110224","volume":"11","author":"Bafna PB","year":"2020","journal-title":"International Journal of Advanced Computer Science and Applications"},{"issue":"2","key":"S0219649223500168BIB006","doi-asserted-by":"crossref","first-page":"95","DOI":"10.4018\/IJSSMET.2020040106","volume":"11","author":"Deena G","year":"2020","journal-title":"International Journal of Service Science, Management, Engineering, and Technology"},{"issue":"4","key":"S0219649223500168BIB007","first-page":"395","volume":"7","author":"Garg NK","year":"2013","journal-title":"International Journal of Image Processing"},{"key":"S0219649223500168BIB008","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1007\/978-981-13-8253-6_22","volume-title":"Performance Management of Integrated Systems and its Applications in Software Engineering","author":"Gulati AN","year":"2020"},{"issue":"11","key":"S0219649223500168BIB009","first-page":"54","volume":"3","author":"Hanumanthappa M","year":"2014","journal-title":"International Journal of Computer Science and Mobile Computing"},{"key":"S0219649223500168BIB010","doi-asserted-by":"crossref","first-page":"771","DOI":"10.1007\/s10055-020-00426-w","author":"Jain P","year":"2020","journal-title":"Virtual Reality"},{"key":"S0219649223500168BIB011","doi-asserted-by":"crossref","first-page":"2341","DOI":"10.1007\/s13042-020-01122-6","volume":"11","author":"Kim G","year":"2020","journal-title":"International Journal of Machine Learning and Cybernetics"},{"issue":"5","key":"S0219649223500168BIB012","first-page":"1","volume":"25","author":"Meyer D","year":"2008","journal-title":"Journal of Statistical Software"},{"issue":"19","key":"S0219649223500168BIB013","doi-asserted-by":"crossref","first-page":"23","DOI":"10.5120\/21176-4185","volume":"119","author":"Pandey P","year":"2015","journal-title":"International Journal of Computer Applications"},{"key":"S0219649223500168BIB014","doi-asserted-by":"crossref","first-page":"15","DOI":"10.5121\/ijnlc.2016.5102","volume":"5","author":"Patil N","year":"2016","journal-title":"International Journal on Natural Language Computing"},{"issue":"1","key":"S0219649223500168BIB015","first-page":"15","volume":"1","author":"Rodzuan NAS","year":"2020","journal-title":"Academia of Intelligence Computing"},{"key":"S0219649223500168BIB016","doi-asserted-by":"crossref","first-page":"322","DOI":"10.1016\/j.knosys.2011.09.015","volume":"27","author":"Saha SK","year":"2012","journal-title":"Knowledge-Based Systems"},{"key":"S0219649223500168BIB017","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1007\/s10772-020-09730-x","volume":"23","author":"Shrestha H","year":"2020","journal-title":"International Journal of Speech Technology"},{"key":"S0219649223500168BIB018","doi-asserted-by":"crossref","first-page":"27","DOI":"10.18653\/v1\/W18-2405","volume-title":"Proceedings of the Seventh Named Entities Workshop","author":"Singh V","year":"2018"},{"issue":"1","key":"S0219649223500168BIB019","first-page":"10","volume":"2","author":"Srivastava S","year":"2011","journal-title":"International Journal of Computational Linguistics"},{"issue":"1","key":"S0219649223500168BIB020","doi-asserted-by":"crossref","first-page":"258","DOI":"10.37398\/JSR.2020.640149","volume":"64","author":"Verma P","year":"2020","journal-title":"Journal of Scientific Research"}],"container-title":["Journal of Information &amp; Knowledge Management"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0219649223500168","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,5]],"date-time":"2023-09-05T05:52:49Z","timestamp":1693893169000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S0219649223500168"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,31]]},"references-count":20,"journal-issue":{"issue":"04","published-print":{"date-parts":[[2023,8]]}},"alternative-id":["10.1142\/S0219649223500168"],"URL":"https:\/\/doi.org\/10.1142\/s0219649223500168","relation":{},"ISSN":["0219-6492","1793-6926"],"issn-type":[{"type":"print","value":"0219-6492"},{"type":"electronic","value":"1793-6926"}],"subject":[],"published":{"date-parts":[[2023,3,31]]},"article-number":"2350016"}}