{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T16:30:38Z","timestamp":1774456238484,"version":"3.50.1"},"reference-count":58,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T00:00:00Z","timestamp":1764201600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Turk Telecom R&amp;D","award":["No grant but the company supports open access publications."],"award-info":[{"award-number":["No grant but the company supports open access publications."]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Sentiment analysis is essential for understanding consumer opinions, yet selecting the optimal models and embedding methods remains challenging, especially when handling ambiguous expressions, slang, or mismatched sentiment\u2013rating pairs. This study provides a comprehensive comparative evaluation of sentiment classification models across three paradigms: traditional machine learning, pre-transformer deep learning, and transformer-based models. Using the Amazon Magazine Subscriptions 2023 dataset, we evaluate a range of embedding techniques, including static embeddings (GloVe, FastText) and contextual transformer embeddings (BERT, DistilBERT, etc.). To capture predictive confidence and model uncertainty, we include categorical cross-entropy as a key evaluation metric alongside accuracy, precision, recall, and F1-score. In addition to detailed quantitative comparisons, we conduct a systematic qualitative analysis of misclassified samples to reveal model-specific patterns of uncertainty. Our findings show that FastText consistently outperforms GloVe in both traditional and LSTM-based models, particularly in recall, due to its subword-level semantic richness. Transformer-based models demonstrate superior contextual understanding and achieve the highest accuracy (92%) and lowest cross-entropy loss (0.25) with DistilBERT, indicating well-calibrated predictions. To validate the generalisability of our results, we replicated our experiments on the Amazon Gift Card Reviews dataset, where similar trends were observed. We also adopt a resource-aware approach by reducing the dataset size from 25 K to 20 K to reflect real-world hardware constraints. This study contributes to both sentiment analysis and sustainable AI by offering a scalable, entropy-aware evaluation framework that supports informed, context-sensitive model selection for practical applications.<\/jats:p>","DOI":"10.3390\/e27121202","type":"journal-article","created":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T11:50:13Z","timestamp":1764244213000},"page":"1202","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Transformer and Pre-Transformer Model-Based Sentiment Prediction with Various Embeddings: A Case Study on Amazon Reviews"],"prefix":"10.3390","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4005-4818","authenticated-orcid":false,"given":"Ismail","family":"Duru","sequence":"first","affiliation":[{"name":"R&D Department, T\u00fcrk Telekom, Ankara 06103, Turkey"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0836-5616","authenticated-orcid":false,"given":"Ay\u015fe Saliha","family":"Sunar","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK"},{"name":"Department of Computer Engineering, Bitlis Eren University, Bitlis 13000, Turkey"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/1500000011","article-title":"Opinion mining and sentiment analysis","volume":"2","author":"Pang","year":"2008","journal-title":"Found. Trends\u00ae Inf. Retr."},{"key":"ref_2","unstructured":"Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., and Manning, C.D. (2014, January 25\u201329). Glove: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar.","DOI":"10.3115\/v1\/D14-1162"},{"key":"ref_4","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3505244","article-title":"Transformers in vision: A survey","volume":"54","author":"Khan","year":"2022","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1016\/j.future.2020.06.050","article-title":"Transformer based deep intelligent contextual embedding for twitter sentiment analysis","volume":"113","author":"Naseem","year":"2020","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_7","unstructured":"Zhang, X., Zhao, J., and LeCun, Y. (2015, January 7\u201312). Character-level convolutional networks for text classification. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Hu, M., and Liu, B. (2004, January 22\u201325). Mining and summarizing customer reviews. Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, USA.","DOI":"10.1145\/1014052.1014073"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Sep\u00falveda-Fontaine, S.A., and Amig\u00f3, J.M. (2024). Applications of entropy in data analysis and machine learning: A review. Entropy, 26.","DOI":"10.20944\/preprints202410.0213.v1"},{"key":"ref_10","unstructured":"Hou, Y., Li, J., He, Z., Yan, A., Chen, X., and McAuley, J. (2024). Bridging Language and Items for Retrieval and Recommendation. arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/s13042-010-0001-0","article-title":"Understanding bag-of-words model: A statistical framework","volume":"1","author":"Zhang","year":"2010","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1016\/S0306-4573(02)00021-3","article-title":"An information-theoretic perspective of tf\u2013idf measures","volume":"39","author":"Aizawa","year":"2003","journal-title":"Inf. Process. Manag."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1108\/EUM0000000007161","article-title":"Applications of n-grams in textual information systems","volume":"54","author":"Robertson","year":"1998","journal-title":"J. Doc."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Mishra, R.K., and Urolagin, S. (2019, January 11\u201312). A Sentiment analysis-based hotel recommendation using TF-IDF Approach. Proceedings of the 2019 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE), Dubai, United Arab Emirates.","DOI":"10.1109\/ICCIKE47802.2019.9004385"},{"key":"ref_15","first-page":"49","article-title":"Influence of pre-processing strategies on the performance of ML classifiers exploiting TF-IDF and BOW features","volume":"9","author":"Pimpalkar","year":"2020","journal-title":"Adcaij Adv. Distrib. Comput. Artif. Intell. J."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1016\/j.eswa.2016.03.028","article-title":"Classification of sentiment reviews using n-gram machine learning approach","volume":"57","author":"Tripathy","year":"2016","journal-title":"Expert Syst. Appl."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Hasan, T., and Matin, A. (2020, January 20\u201321). Extract Sentiment from Customer Reviews: A Better Approach of TF-IDF and BOW-Based Text Classification Using N-Gram Technique. Proceedings of the International Joint Conference on Advances in Computational Intelligence: IJCACI, Daffodil International University, Birulia, Bangladesh.","DOI":"10.1007\/978-981-16-0586-4_19"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"20071","DOI":"10.1109\/ACCESS.2025.3534261","article-title":"Enhancing Sentiment Analysis and Rating Prediction Using the Review Text Granularity (RTG) Model","volume":"13","author":"Garapati","year":"2025","journal-title":"IEEE Access"},{"key":"ref_19","unstructured":"Joulin, A., Grave, E., Bojanowski, P., Douze, M., J\u00e9gou, H., and Mikolov, T. (2016). Fasttext. zip: Compressing text classification models. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Edizel, B., Piktus, A., Bojanowski, P., Ferreira, R., Grave, E., and Silvestri, F. (2019). Misspelling oblivious word embeddings. arXiv.","DOI":"10.18653\/v1\/N19-1326"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"108911","DOI":"10.1016\/j.knosys.2022.108911","article-title":"Multi-level out-of-vocabulary words handling approach","volume":"251","author":"Lochter","year":"2022","journal-title":"Knowl.-Based Syst."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1016\/j.procs.2021.05.103","article-title":"Sentiment classification using fasttext embedding and deep learning model","volume":"189","author":"Khasanah","year":"2021","journal-title":"Procedia Comput. Sci."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Kaibi, I., and Satori, H. (2019, January 3\u20134). A comparative evaluation of word embeddings techniques for twitter sentiment analysis. Proceedings of the 2019 International Conference on Wireless Technologies, Embedded and Intelligent Systems (WITS), Fez, Morocco.","DOI":"10.1109\/WITS.2019.8723864"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"352","DOI":"10.29207\/resti.v6i3.3711","article-title":"The Accuracy Comparison Between Word2Vec and FastText On Sentiment Analysis of Hotel Reviews","volume":"6","author":"Khomsah","year":"2022","journal-title":"J. RESTI (Rekayasa Sist. Dan Teknol. Inf.)"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Du, C.H., Tsai, M.F., and Wang, C.J. (2019, January 12\u201317). Beyond word-level to sentence-level sentiment analysis for financial reports. Proceedings of the ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK.","DOI":"10.1109\/ICASSP.2019.8683085"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"e1253","DOI":"10.1002\/widm.1253","article-title":"Deep learning for sentiment analysis: A survey","volume":"8","author":"Zhang","year":"2018","journal-title":"Wiley Interdiscip. Rev. Data Min. Knowl. Discov."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"85616","DOI":"10.1109\/ACCESS.2020.2992013","article-title":"Deep learning-based sentiment classification: A comparative survey","volume":"8","author":"Mabrouk","year":"2020","journal-title":"IEEE Access"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Ruder, S., Ghaffari, P., and Breslin, J.G. (2016, January 1\u20134). A hierarchical model of reviews for aspect-based sentiment analysis. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistic, Austin, TX, USA.","DOI":"10.18653\/v1\/D16-1103"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Li, X., Sun, X., Xu, Z., and Zhou, Y. (2021, January 23\u201325). Explainable sentence-level sentiment analysis for amazon product reviews. Proceedings of the 2021 5th International Conference on Imaging, Signal Processing and Communications (ICISPC), Kumamoto, Japan.","DOI":"10.1109\/ICISPC53419.2021.00024"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1016\/j.ins.2018.10.030","article-title":"Three-way enhanced convolutional neural networks for sentence-level sentiment classification","volume":"477","author":"Zhang","year":"2019","journal-title":"Inf. Sci."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Madasu, A., and Rao, V.A. (2019). Sequential learning of convolutional features for effective text classification. arXiv.","DOI":"10.18653\/v1\/D19-1567"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Tammina, S., and Annareddy, S. (2020, January 22\u201324). Sentiment analysis on customer reviews using convolutional neural network. Proceedings of the 2020 International Conference on Computer Communication and Informatics (ICCCI), Coimbatore, India.","DOI":"10.1109\/ICCCI48352.2020.9104086"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Kumari, A., Bachan, A., Yang, T., Rathore, B., and Mishra, N. (2024, January 23\u201324). Sentiment Analysis of Amazon Product Reviews using Deep Predictive Model. Proceedings of the 2024 International Conference on Intelligent Algorithms for Computational Intelligence Systems (IACIS), Hassan, India.","DOI":"10.1109\/IACIS61494.2024.10721906"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1109\/EMR.2022.3208818","article-title":"Combination of GRU and CNN deep learning models for sentiment analysis on French customer reviews using XLNet model","volume":"51","author":"Habbat","year":"2022","journal-title":"IEEE Eng. Manag. Rev."},{"key":"ref_36","unstructured":"Bahdanau, D., Cho, K., and Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Nallapati, R., Zhou, B., Gulcehre, C., and Xiang, B. (2016). Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv.","DOI":"10.18653\/v1\/K16-1028"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Chan, W., Jaitly, N., Le, Q., and Vinyals, O. (2016, January 20\u201325). Listen, attend and spell: A neural network for large vocabulary conversational speech recognition. Proceedings of the 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP), Shanghai, China.","DOI":"10.1109\/ICASSP.2016.7472621"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Asadi, A., and Safabakhsh, R. (2020). The encoder-decoder framework and its applications. Deep Learning: Concepts and Architectures, Springer.","DOI":"10.1007\/978-3-030-31756-0_5"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"13371","DOI":"10.1007\/s00521-022-07366-3","article-title":"Attention mechanism in neural networks: Where it comes and where it goes","volume":"34","author":"Soydaner","year":"2022","journal-title":"Neural Comput. Appl."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"4291","DOI":"10.1109\/TNNLS.2020.3019893","article-title":"Attention in natural language processing","volume":"32","author":"Galassi","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"103354","DOI":"10.1016\/j.ipm.2023.103354","article-title":"Dual emotion based fake news detection: A deep attention-weight update approach","volume":"60","author":"Luvembe","year":"2023","journal-title":"Inf. Process. Manag."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1016\/j.neucom.2021.03.091","article-title":"A review on the attention mechanism of deep learning","volume":"452","author":"Niu","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_44","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"12405","DOI":"10.1007\/s11042-022-12410-4","article-title":"Sentiment analysis for user reviews using Bi-LSTM self-attention based CNN model","volume":"81","author":"Bhuvaneshwari","year":"2022","journal-title":"Multimed. Tools Appl."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Cao, Y., Sun, Z., Li, L., and Mo, W. (2022). A study of sentiment analysis algorithms for agricultural product reviews based on improved BERT model. Symmetry, 14.","DOI":"10.3390\/sym14081604"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"113603","DOI":"10.1016\/j.dss.2021.113603","article-title":"S2SAN: A sentence-to-sentence attention network for sentiment analysis of online reviews","volume":"149","author":"Wang","year":"2021","journal-title":"Decis. Support Syst."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"100157","DOI":"10.1016\/j.array.2022.100157","article-title":"Transformer-based deep learning models for the sentiment analysis of social media data","volume":"14","author":"Kokab","year":"2022","journal-title":"Array"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Ali, H., Hashmi, E., Yayilgan Yildirim, S., and Shaikh, S. (2024). Analyzing amazon products sentiment: A comparative study of machine and deep learning, and transformer-based techniques. Electronics, 13.","DOI":"10.3390\/electronics13071305"},{"key":"ref_50","first-page":"474","article-title":"Transformer based contextual model for sentiment analysis of customer reviews: A fine-tuned bert","volume":"12","author":"Durairaj","year":"2021","journal-title":"Int. J. Adv. Comput. Sci. Appl."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Mu, G., Chen, Y., Li, X., Dai, L., and Dai, J. (2025). Semantic enhancement and cross-modal interaction fusion for sentiment analysis in social media. PLoS ONE, 20.","DOI":"10.1371\/journal.pone.0321011"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Davoodi, L., Mezei, J., and Heikkil\u00e4, M. (2025). Aspect-based sentiment classification of user reviews to understand customer satisfaction of e-commerce platforms. Electron. Commer. Res., 1\u201343. Available online: https:\/\/link.springer.com\/article\/10.1007\/s10660-025-09948-4.","DOI":"10.1007\/s10660-025-09948-4"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Singh, U., Saraswat, A., Azad, H.K., Abhishek, K., and Shitharth, S. (2022). Towards improving e-commerce customer review analysis for sentiment detection. Sci. Rep., 12.","DOI":"10.1038\/s41598-022-26432-3"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Herrera-Poyatos, D., Pel\u00e1ez-Gonz\u00e1lez, C., Zuheros, C., Herrera-Poyatos, A., Tejedor, V., Herrera, F., and Montes, R. (2025). An overview of model uncertainty and variability in LLM-based sentiment analysis: Challenges, mitigation strategies, and the role of explainability. Front. Artif. Intell., 8.","DOI":"10.3389\/frai.2025.1609097"},{"key":"ref_55","first-page":"15","article-title":"Product sentiment analysis for amazon reviews","volume":"13","author":"AlQahtani","year":"2021","journal-title":"Int. J. Comput. Sci. Inf. Technol. (IJCSIT)"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"11029","DOI":"10.1007\/s11227-023-05094-6","article-title":"Text sentiment classification of Amazon reviews using word embeddings and convolutional neural networks","volume":"79","author":"Qorich","year":"2023","journal-title":"J. Supercomput."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Yenicelik, D., Schmidt, F., and Kilcher, Y. (2020, January 20). How does BERT capture semantics? A closer look at polysemous words. Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, Online.","DOI":"10.18653\/v1\/2020.blackboxnlp-1.15"},{"key":"ref_58","unstructured":"Sanh, V. (2019). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/27\/12\/1202\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,27]],"date-time":"2025-11-27T11:54:12Z","timestamp":1764244452000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/27\/12\/1202"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,27]]},"references-count":58,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["e27121202"],"URL":"https:\/\/doi.org\/10.3390\/e27121202","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,27]]}}}