{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T21:13:23Z","timestamp":1768166003744,"version":"3.49.0"},"reference-count":31,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2020,3,17]],"date-time":"2020-03-17T00:00:00Z","timestamp":1584403200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>YouTube is a boon, and through it people can educate, entertain, and express themselves about various topics. YouTube India currently has millions of active users. As there are millions of active users it can be understood that the data present on the YouTube will be large. With India being a very diverse country, many people are multilingual. People express their opinions in a code-mix form. Code-mix form is the mixing of two or more languages. It has become a necessity to perform Sentiment Analysis on the code-mix languages as there is not much research on Indian code-mix language data. In this paper, Sentiment Analysis (SA) is carried out on the Marglish (Marathi + English) as well as Devanagari Marathi comments which are extracted from the YouTube API from top Marathi channels. Several machine-learning models are applied on the dataset along with 3 different vectorizing techniques. Multilayer Perceptron (MLP) with Count vectorizer provides the best accuracy of 62.68% on the Marglish dataset and Bernoulli Na\u00efve Bayes along with the Count vectorizer, which gives accuracy of 60.60% on the Devanagari dataset. Multilayer Perceptron and Bernoulli Na\u00efve Bayes are considered to be the best performing algorithms. 
10-fold cross-validation and statistical testing was also carried out on the dataset to confirm the results.<\/jats:p>","DOI":"10.3390\/bdcc4010003","type":"journal-article","created":{"date-parts":[[2020,3,17]],"date-time":"2020-03-17T09:27:41Z","timestamp":1584437261000},"page":"3","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":24,"title":["Opinion-Mining on Marglish and Devanagari Comments of YouTube Cookery Channels Using Parametric and Non-Parametric Learning Models"],"prefix":"10.3390","volume":"4","author":[{"given":"Sonali Rajesh","family":"Shah","sequence":"first","affiliation":[{"name":"School of Computing, Dublin Business School, D02 WC04 Dublin, Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3329-1807","authenticated-orcid":false,"given":"Abhishek","family":"Kaushik","sequence":"additional","affiliation":[{"name":"ADAPT Centre, School of Computing, Dublin City University, D09 W6Y4 Dublin, Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4098-6313","authenticated-orcid":false,"given":"Shubham","family":"Sharma","sequence":"additional","affiliation":[{"name":"School of Food Science and Environment Health, TU Dublin, D01 HV58 Dublin, Ireland"}]},{"given":"Janice","family":"Shah","sequence":"additional","affiliation":[{"name":"Department of Information Technology, Sardar Patel Institute of Technology, 400058 Mumbai, India"}]}],"member":"1968","published-online":{"date-parts":[[2020,3,17]]},"reference":[{"key":"ref_1","unstructured":"(2020, January 06). List of Countries by Number of Internet Users. Available online: https:\/\/en.wikipedia.org\/wiki\/List_of_countries_by_number_of_Internet_users."},{"key":"ref_2","unstructured":"Diwanji, S. (2020, January 06). Most Popular Smartphone Activities in India as of January 2018. Available online: https:\/\/www.statista.com\/statistics\/309867\/mobile-phone-activities-india\/."},{"key":"ref_3","unstructured":"Mitter, S. (2020, January 06). 
How YouTube India Spurred a Thriving Content Economy Cutting across Genres, Languages, Demographics. Available online: https:\/\/yourstory.com\/2019\/03\/youtube-india-thriving-content-economy-yi37d6xy4u."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"978","DOI":"10.1016\/j.im.2016.04.005","article-title":"Social emotion classification of short text via topic-level maximum entropy model","volume":"53","author":"Rao","year":"2016","journal-title":"Inf. Manag."},{"key":"ref_5","unstructured":"Nagarajan, R. (2020, January 06). 52% of India\u2019s Urban Youth Are Now Bilingual, 18% Speak Three Languages. Available online: https:\/\/timesofindia.indiatimes.com\/india\/52-of-indias-urban-youth-are-now-bilingual-18-speak-three-languages\/articleshow\/66530958.cms."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Pundlik, S., Dasare, P., Kasbekar, P., Gawade, A., Gaikwad, G., and Pundlik, P. (2016, January 21\u201324). Multiclass classification and class based sentiment analysis for Hindi language. Proceedings of the IEEE 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Jaipur, India.","DOI":"10.1109\/ICACCI.2016.7732097"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Shah, S.R., and Kaushik, A. (2019). Sentiment Analysis On Indian Indigenous Languages: A Review On Multilingual Opinion Mining. arXiv.","DOI":"10.20944\/preprints201911.0338.v1"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Yadav, M., and Bhojane, V. (2019, January 10\u201311). Semi-Supervised Mix-Hindi Sentiment Analysis using Neural Network. 
Proceedings of the IEEE 2019 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India.","DOI":"10.1109\/CONFLUENCE.2019.8776943"},{"key":"ref_9","first-page":"14","article-title":"Sentiment Analysis of Mixed Code for the Transliterated Hindi and Marathi Texts","volume":"7","author":"Ansari","year":"2018","journal-title":"Int. J. Nat. Lang. Comput."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Jha, V., Manjunath, N., Shenoy, P.D., Venugopal, K., and Patnaik, L.M. (2015, January 9\u201311). Homs: Hindi opinion mining system. Proceedings of the 2015 IEEE 2nd International Conference on Recent Trends in Information Systems (ReTIS), Kolkata, India.","DOI":"10.1109\/ReTIS.2015.7232906"},{"key":"ref_11","unstructured":"Afli, H., Maguire, S., and Way, A. (2017, January 17\u201321). Sentiment translation for low resourced languages: Experiments on irish general election tweets. Proceedings of the 18th International Conference on Computational Linguistics and Intelligent Text Processing, Budapest, Hungary."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Naidu, R., Bharti, S.K., Babu, K.S., and Mohapatra, R.K. (2017, January 22\u201324). Sentiment analysis using Telugu sentiwordnet. Proceedings of the 2017 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), Chennai, India.","DOI":"10.1109\/WiSPNET.2017.8299844"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Nanda, C., Dua, M., and Nanda, G. (2018, January 3\u20135). Sentiment Analysis of Movie Reviews in Hindi Language Using Machine Learning. Proceedings of the IEEE 2018 International Conference on Communication and Signal Processing (ICCSP), Chennai, India.","DOI":"10.1109\/ICCSP.2018.8524223"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Pandey, P., and Govilkar, S. (2015). A framework for sentiment analysis in Hindi using HSWN. Int. J. Comput. 
Appl., 119.","DOI":"10.5120\/21176-4185"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Bhargava, R., Sharma, Y., and Sharma, S. (2016, January 21\u201324). Sentiment analysis for mixed script indic sentences. Proceedings of the IEEE 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Jaipur, India.","DOI":"10.1109\/ICACCI.2016.7732099"},{"key":"ref_16","unstructured":"Kaur, H., Mangat, V., and Krail, N. (2017). Dictionary based sentiment analysis of hinglish text. Int. J. Adv. Res. Comput. Sci., 8."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Sun, B., Tian, F., and Liang, L. (2018, January 16\u201317). Tibetan Micro-Blog Sentiment Analysis Based on Mixed Deep Learning. Proceedings of the IEEE 2018 International Conference on Audio, Language and Image Processing (ICALIP), Shanghai, China.","DOI":"10.1109\/ICALIP.2018.8455328"},{"key":"ref_18","first-page":"1487","article-title":"An approach to sentiment analysis on Gujarati tweets","volume":"10","author":"Joshi","year":"2017","journal-title":"Adv. Comput. Sci. Technol."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Sharma, P., and Moh, T.S. (2016, January 5\u20138). Prediction of indian election using sentiment analysis on hindi twitter. Proceedings of the 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, USA.","DOI":"10.1109\/BigData.2016.7840818"},{"key":"ref_20","unstructured":"Phani, S., Lahiri, S., and Biswas, A. (2016, January 11\u201316). Sentiment analysis of tweets in three Indian languages. 
Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016), Osaka, Japan."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1007\/s40012-016-0117-9","article-title":"Sentiment analysis for Odia language using supervised classifier: An information retrieval in Indian language initiative","volume":"4","author":"Sahu","year":"2016","journal-title":"CSI Trans. ICT"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Guthier, B., Ho, K., and El Saddik, A. (2017, January 5\u20138). Language-independent data set annotation for machine learning-based sentiment analysis. Proceedings of the IEEE 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Banff, AB, Canada.","DOI":"10.1109\/SMC.2017.8122930"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Kaur, G., Kaushik, A., and Sharma, S. (2019). Cooking Is Creating Emotion: A Study on Hinglish Sentiments of Youtube Cookery Channels Using Semi-Supervised Approach. Big Data Cogn. Comput., 3.","DOI":"10.3390\/bdcc3030037"},{"key":"ref_24","unstructured":"Akhtar, M.S., Ekbal, A., and Bhattacharyya, P. (2016, January 23\u201328). Aspect based sentiment analysis in Hindi: Resource creation and evaluation. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC\u201916), Portoro\u017e, Slovenia."},{"key":"ref_25","unstructured":"Ray, P., and Chakrabarti, A. (2019). A Mixed approach of Deep Learning method and Rule-Based method to improve Aspect Level Sentiment Analysis. Appl. Comput. Inform."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Khatua, A., Cambria, E., Ghosh, K., Chaki, N., and Khatua, A. (2019, January 3\u20135). Tweeting in Support of LGBT? A Deep Learning Approach. 
Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, Swissotel, Kolkata, India.","DOI":"10.1145\/3297001.3297057"},{"key":"ref_27","unstructured":"Godino, I.G., and DHaro, L.F. (2019, January 24). Gth-upm at tass 2019: Sentiment analysis of tweets for spanish variants. Proceedings of the TASS workshop at SEPLN (Spanish Society for Natural Language Processing), Bilbao, Spain. Available online: http:\/\/ceur-ws.org\/Vol-2421\/TASSoverview.pdf."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"3305","DOI":"10.1007\/s13369-018-3500-z","article-title":"Deep Learning Based Sentiment Analysis Using Convolution Neural Network","volume":"44","author":"Rani","year":"2019","journal-title":"Arab. J. Sci. Eng."},{"key":"ref_29","unstructured":"Hoang, M., Bihorac, O.A., and Rouces, J. (October, January 30). Aspect-Based Sentiment Analysis Using BERT. Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), Turku, Finland. Available online: http:\/\/www.sepln.org\/workshops\/tass\/."},{"key":"ref_30","unstructured":"(2020, January 06). stopwords-iso\/stopwords-mr. Available online: https:\/\/github.com\/stopwords-iso\/stopwords-mr\/blob\/master\/stopwords-mr.txt."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"El Naqa, I., and Murphy, M.J. (2015). What is machine learning?. 
Machine Learning in Radiation Oncology, Springer.","DOI":"10.1007\/978-3-319-18305-3"}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/4\/1\/3\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:07:26Z","timestamp":1760173646000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/4\/1\/3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3,17]]},"references-count":31,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2020,3]]}},"alternative-id":["bdcc4010003"],"URL":"https:\/\/doi.org\/10.3390\/bdcc4010003","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,3,17]]}}}