{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T21:51:10Z","timestamp":1774129870656,"version":"3.50.1"},"reference-count":81,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T00:00:00Z","timestamp":1740096000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002322","name":"Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00edvel Superior\u2014Brasil (CAPES)","doi-asserted-by":"publisher","award":["001"],"award-info":[{"award-number":["001"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002322","name":"Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00edvel Superior\u2014Brasil (CAPES)","doi-asserted-by":"publisher","award":["UIDB\/50021\/2020"],"award-info":[{"award-number":["UIDB\/50021\/2020"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]},{"name":"FCT, Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","award":["001"],"award-info":[{"award-number":["001"]}]},{"name":"FCT, Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","award":["UIDB\/50021\/2020"],"award-info":[{"award-number":["UIDB\/50021\/2020"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>The complex and specialized terminology of financial language in Portuguese-speaking markets creates significant challenges for natural language processing (NLP) applications, which must capture nuanced linguistic and contextual information to support accurate analysis and decision-making. 
This paper presents DeB3RTa, a transformer-based model specifically developed through a mixed-domain pretraining strategy that combines extensive corpora from finance, politics, business management, and accounting to enable a nuanced understanding of financial language. DeB3RTa was evaluated against prominent models\u2014including BERTimbau, XLM-RoBERTa, SEC-BERT, BusinessBERT, and GPT-based variants\u2014and consistently achieved significant gains across key financial NLP benchmarks. To maximize adaptability and accuracy, DeB3RTa integrates advanced fine-tuning techniques such as layer reinitialization, mixout regularization, stochastic weight averaging, and layer-wise learning rate decay, which together enhance its performance across varied and high-stakes NLP tasks. These findings underscore the efficacy of mixed-domain pretraining in building high-performance language models for specialized applications. With its robust performance in complex analytical and classification tasks, DeB3RTa offers a powerful tool for advancing NLP in the financial sector and supporting nuanced language processing needs in Portuguese-speaking contexts.<\/jats:p>","DOI":"10.3390\/bdcc9030051","type":"journal-article","created":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T03:47:46Z","timestamp":1740109666000},"page":"51","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["DeB3RTa: A Transformer-Based Model for the Portuguese Financial Domain"],"prefix":"10.3390","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1868-7213","authenticated-orcid":false,"given":"Higo","family":"Pires","sequence":"first","affiliation":[{"name":"Education Department, Federal Institute of Maranh\u00e3o, Pinheiro 65200-000, MA, Brazil"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7950-7693","authenticated-orcid":false,"given":"Leonardo","family":"Paucar","sequence":"additional","affiliation":[{"name":"Department of 
Electrical Engineering, Federal University of Maranh\u00e3o, S\u00e3o Lu\u00eds 65080-805, MA, Brazil"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0005-8299","authenticated-orcid":false,"given":"Joao Paulo","family":"Carvalho","sequence":"additional","affiliation":[{"name":"Instituto de Engenharia de Sistemas e Computadores\u2013Investiga\u00e7\u00e3o e Desenvolvimento (INESC-ID)\/Instituto Superior T\u00e9cnico, Universidade de Lisboa, 1000-029 Lisbon, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2025,2,21]]},"reference":[{"key":"ref_1","unstructured":"Burstein, J., Doran, C., and Solorio, T. (2019, January 2\u20137). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA."},{"key":"ref_2","unstructured":"Aksoy, \u00c7., Ahmeto\u011flu, A., and G\u00fcng\u00f6r, T. (2020). Hierarchical Multitask Learning Approach for BERT. arXiv."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Wu, Y., Zhao, H., Li, Z., Zhang, S., Zhou, X., and Zhou, X. (2020). Semantics-aware BERT for Language Understanding. arXiv.","DOI":"10.1609\/aaai.v34i05.6510"},{"key":"ref_4","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"OpenAI Blog"},{"key":"ref_5","unstructured":"Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., and Askell, A. (2020, January 6\u201312). Language models are few-shot learners. Proceedings of the NIPS \u201920, Online."},{"key":"ref_6","unstructured":"OpenAI, Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., and Altman, S. (2024). GPT-4 Technical Report. 
arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Saravanan, S., and Sudha, K. (2022, January 8\u20139). GPT-3 Powered System for Content Generation and Transformation. Proceedings of the 2022 Fifth International Conference on Computational Intelligence and Communication Technologies (CCICT) 2022, Sonepat, India.","DOI":"10.1109\/CCiCT56684.2022.00096"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"e2218523120","DOI":"10.1073\/pnas.2218523120","article-title":"Using cognitive psychology to understand GPT-3","volume":"120","author":"Binz","year":"2023","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_9","unstructured":"Araci, D. (2019). FinBERT: Financial Sentiment Analysis with Pre-trained Language Models. arXiv."},{"key":"ref_10","unstructured":"Yang, Y., Uy, M.C.S., and Huang, A. (2020). FinBERT: A Pretrained Language Model for Financial Communications. arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Liu, Z., Huang, D., Huang, K., Li, Z., and Zhao, J. (2021, January 7\u201315). FinBERT: A pre-trained financial language representation model for financial text mining. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI\u201920, Yokohama, Japan.","DOI":"10.24963\/ijcai.2020\/622"},{"key":"ref_12","unstructured":"Muresan, S., Nakov, P., and Villavicencio, A. (2022, January 22\u201327). FiNER: Financial Numeric Entity Recognition for XBRL Tagging. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland."},{"key":"ref_13","unstructured":"Goldberg, Y., Kozareva, Z., and Zhang, Y. (2022, January 7\u201311). When FLUE Meets FLANG: Benchmarks and Large Pretrained Language Model for Financial Domain. 
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Delgadillo, J., Kinyua, J., and Mutigwe, C. (2024). FinSoSent: Advancing Financial Market Sentiment Analysis through Pretrained Large Language Models. Big Data Cogn. Comput., 8.","DOI":"10.3390\/bdcc8080087"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Cao, Y., Yang, L., Wei, C., and Wang, H. (2023, January 17\u201319). Financial Text Sentiment Classification Based on Baichuan2 Instruction Finetuning Model. Proceedings of the 2023 5th International Conference on Frontiers Technology of Information and Computer (ICFTIC), Qingdao, China.","DOI":"10.1109\/ICFTIC59930.2023.10454145"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"201","DOI":"10.3905\/jpm.2023.1.512","article-title":"From ELIZA to ChatGPT: The evolution of natural language processing and financial applications","volume":"49","author":"Lo","year":"2023","journal-title":"J. Portf. Manag."},{"key":"ref_17","unstructured":"Inserte, P.R., Nakhl\u00e9, M., Qader, R., Caillaut, G., and Liu, J. (2024). Large Language Model Adaptation for Financial Sentiment Analysis. arXiv."},{"key":"ref_18","unstructured":"Jurafsky, D., Chai, J., Schluter, N., and Tetreault, J. (2020, January 5\u201310). Unsupervised Cross-lingual Representation Learning at Scale. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"110901","DOI":"10.1016\/j.asoc.2023.110901","article-title":"BERT models for Brazilian Portuguese: Pretraining, evaluation and tokenization analysis","volume":"149","author":"Souza","year":"2023","journal-title":"Appl. 
Soft Comput."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1109\/JAS.2022.106004","article-title":"A Survey on Negative Transfer","volume":"10","author":"Zhang","year":"2023","journal-title":"IEEE\/CAA J. Autom. Sin."},{"key":"ref_21","unstructured":"Calzolari, N., Choukri, K., Cieri, C., Declerck, T., Goggi, S., Hasida, K., Isahara, H., Maegaard, B., Mariani, J., and Mazo, H. (2018, January 7\u201312). The brWaC Corpus: A New Open Resource for Brazilian Portuguese. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan."},{"key":"ref_22","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., and Polosukhin, I. (2023). Attention Is All You Need. arXiv."},{"key":"ref_23","unstructured":"Moens, M.F., Huang, X., Specia, L., and Yih, S.W.T. (2021, January 7\u201311). How Suitable Are Subword Segmentation Strategies for Translating Non-Concatenative Morphology?. Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, Punta Cana, Dominican Republic."},{"key":"ref_24","unstructured":"Muresan, S., Nakov, P., and Villavicencio, A. (2022, January 22\u201327). \u201cIs Whole Word Masking Always Better for Chinese BERT?\u201d: Probing on Chinese Grammatical Error Correction. Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland."},{"key":"ref_25","unstructured":"Levine, Y., Lenz, B., Lieber, O., Abend, O., Leyton-Brown, K., Tennenholtz, M., and Shoham, Y. (2020). PMI-Masking: Principled masking of correlated spans. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1162\/tacl_a_00300","article-title":"SpanBERT: Improving Pre-training by Representing and Predicting Spans","volume":"8","author":"Joshi","year":"2020","journal-title":"Trans. Assoc. Comput. 
Linguist."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"He, W., Dai, Y., Yang, M., Sun, J., Huang, F., Si, L., and Li, Y. (2022, January 11\u201315). Unified dialog model pre-training for task-oriented dialog understanding and generation. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022, Madrid, Spain.","DOI":"10.1145\/3477495.3532069"},{"key":"ref_28","unstructured":"Kingma, D.P., and Ba, J. (2017). Adam: A Method for Stochastic Optimization. arXiv."},{"key":"ref_29","unstructured":"Loshchilov, I., and Hutter, F. (2019). Decoupled Weight Decay Regularization. arXiv."},{"key":"ref_30","unstructured":"Heo, B., Chun, S., Oh, S.J., Han, D., Yun, S., Kim, G., Uh, Y., and Ha, J.W. (2021). AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights. arXiv."},{"key":"ref_31","unstructured":"Liu, L., Jiang, H., He, P., Chen, W., Liu, X., Gao, J., and Han, J. (2021). On the Variance of the Adaptive Learning Rate and Beyond. arXiv."},{"key":"ref_32","unstructured":"Defazio, A., and Jelassi, S. (2021). Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization. arXiv."},{"key":"ref_33","unstructured":"Dodge, J., Ilharco, G., Schwartz, R., Farhadi, A., Hajishirzi, H., and Smith, N. (2020). Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping. arXiv."},{"key":"ref_34","unstructured":"Yosinski, J., Clune, J., Bengio, Y., and Lipson, H. (2014, January 8\u201313). How transferable are features in deep neural networks?. Proceedings of the 27th International Conference on Neural Information Processing Systems\u2014Volume 2, NIPS\u201914, Montreal, QC, Canada."},{"key":"ref_35","unstructured":"Zhang, T., Wu, F., Katiyar, A., Weinberger, K.Q., and Artzi, Y. (2021). Revisiting Few-sample BERT Fine-tuning. arXiv."},{"key":"ref_36","unstructured":"Lee, C., Cho, K., and Kang, W. 
(2020). Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models. arXiv."},{"key":"ref_37","first-page":"1929","article-title":"Dropout: A Simple Way to Prevent Neural Networks from Overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_38","unstructured":"Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., and Fergus, R. (2013, January 17\u201319). Regularization of Neural Networks using DropConnect. Proceedings of the 30th International Conference on Machine Learning, Proceedings of Machine Learning Research, Atlanta, GA, USA."},{"key":"ref_39","unstructured":"Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., and Wilson, A.G. (2019). Averaging Weights Leads to Wider Optima and Better Generalization. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Guo, H., Jin, J., and Liu, B. (2023). Stochastic Weight Averaging Revisited. Appl. Sci., 13.","DOI":"10.3390\/app13052935"},{"key":"ref_41","unstructured":"Goldberg, Y., Kozareva, Z., and Zhang, Y. (2022, January 7\u201311). Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging. Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates."},{"key":"ref_42","unstructured":"Talman, A., Celikkanat, H., Virpioja, S., Heinonen, M., and Tiedemann, J. (2023, January 22\u201324). Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging. Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), T\u00f3rshavn, Faroe Islands."},{"key":"ref_43","unstructured":"Onal, E., Fl\u00f6ge, K., Caldwell, E., Sheverdin, A., and Fortuin, V. (2024). Gaussian Stochastic Weight Averaging for Bayesian Low-Rank Adaptation of Large Language Models. arXiv."},{"key":"ref_44","unstructured":"Gurevych, I., and Miyao, Y. (2018, January 15\u201320). Universal Language Model Fine-tuning for Text Classification. 
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia."},{"key":"ref_45","unstructured":"Clark, K., Luong, M.T., Le, Q.V., and Manning, C.D. (2020). ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. arXiv."},{"key":"ref_46","unstructured":"You, Y., Li, J., Reddi, S., Hseu, J., Kumar, S., Bhojanapalli, S., Song, X., Demmel, J., Keutzer, K., and Hsieh, C.J. (2020). Large Batch Optimization for Deep Learning: Training BERT in 76 minutes. arXiv."},{"key":"ref_47","unstructured":"Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., and Le, Q.V. (2019, January 8\u201314). XLNet: Generalized autoregressive pretraining for language understanding. Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"100488","DOI":"10.1016\/j.patter.2022.100488","article-title":"Quantifying the advantage of domain-specific pre-training on named entity recognition tasks in materials science","volume":"3","author":"Trewartha","year":"2022","journal-title":"Patterns"},{"key":"ref_49","unstructured":"Inui, K., Jiang, J., Ng, V., and Wan, X. (2019, January 3\u20137). SciBERT: A Pretrained Language Model for Scientific Text. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China."},{"key":"ref_50","unstructured":"Rumshisky, A., Roberts, K., Bethard, S., and Naumann, T. (2020, January 19). BioBERTpt\u2014A Portuguese Neural Language Model for Clinical Named Entity Recognition. 
Proceedings of the 3rd Clinical Natural Language Processing Workshop, Online."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"6633213","DOI":"10.1155\/2021\/6633213","article-title":"ABioNER: A BERT-Based Model for Arabic Biomedical Named-Entity Recognition","volume":"2021","author":"Boudjellal","year":"2021","journal-title":"Complexity"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"691","DOI":"10.1016\/j.ejor.2024.01.023","article-title":"Industry-sensitive language modeling for business","volume":"315","author":"Borchert","year":"2024","journal-title":"Eur. J. Oper. Res."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"104333","DOI":"10.1016\/j.frl.2023.104333","article-title":"GPT has become financially literate: Insights from financial literacy tests of GPT and a preliminary test of how people use it as a source of advice","volume":"58","author":"Niszczota","year":"2023","journal-title":"Financ. Res. Lett."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Chatzimina, M.E., Papadaki, H.A., Pontikoglou, C., and Tsiknakis, M. (2024). A Comparative Sentiment Analysis of Greek Clinical Conversations Using BERT, RoBERTa, GPT-2, and XLNet. Bioengineering, 11.","DOI":"10.3390\/bioengineering11060521"},{"key":"ref_55","unstructured":"He, P., Liu, X., Gao, J., and Chen, W. (2021). DeBERTa: Decoding-enhanced BERT with Disentangled Attention. arXiv."},{"key":"ref_56","unstructured":"Liang, W., and Liang, Y. (2024). BPDec: Unveiling the Potential of Masked Language Modeling Decoder in BERT pretraining. arXiv."},{"key":"ref_57","unstructured":"Broder, A. (1997, January 13). On the resemblance and containment of documents. Proceedings of the Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171), Salerno, Italy."},{"key":"ref_58","unstructured":"Muresan, S., Nakov, P., and Villavicencio, A. (2022, January 22\u201327). Deduplicating Training Data Makes Language Models Better. 
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland."},{"key":"ref_59","unstructured":"Sanh, V., Debut, L., Chaumond, J., and Wolf, T. (2020). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"100048","DOI":"10.1016\/j.nlp.2023.100048","article-title":"A survey of GPT-3 family large language models including ChatGPT and GPT-4","volume":"6","author":"Kalyan","year":"2024","journal-title":"Nat. Lang. Process. J."},{"key":"ref_61","first-page":"203","article-title":"An Economic Analysis of Hate Crime","volume":"28","author":"Gale","year":"2002","journal-title":"East. Econ. J."},{"key":"ref_62","unstructured":"Curthoys, A. (2013). Identifying the Effect of Unemployment on Hate Crime, Syracuse University. Ren\u00e9e Crown University Honors Thesis Projects\u2014All. 33."},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Dharmapala, D., and McAdams, R.H. (2002). Words that kill: An economic perspective on hate speech and hate crimes. SSRN Electron. J.","DOI":"10.2139\/ssrn.300695"},{"key":"ref_64","first-page":"93","article-title":"Hate in the Machine: Anti-Black and Anti-Muslim Social Media Posts as Predictors of Offline Racially and Religiously Aggravated Crime","volume":"60","author":"Williams","year":"2019","journal-title":"Br. J. Criminol."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"1146","DOI":"10.1126\/science.aap9559","article-title":"The spread of true and false news online","volume":"359","author":"Vosoughi","year":"2018","journal-title":"Science"},{"key":"ref_66","doi-asserted-by":"crossref","first-page":"9524705","DOI":"10.1155\/2021\/9524705","article-title":"Sentiment Classification for Financial Texts Based on Deep Learning","volume":"2021","author":"Dong","year":"2021","journal-title":"Comput. Intell. 
Neurosci."},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Singh, R., Sharma, V., Kashyap, R., and Manwal, M. (2024, January 14\u201315). Automated Multi-Page Document Classification and Information Extraction for Insurance Applications using Deep Learning Techniques. Proceedings of the 2024 11th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India.","DOI":"10.1109\/ICRITO61523.2024.10522111"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"de Pelle, R., and Moreira, V. (2016, January 5). Offensive Comments in the Brazilian Web: A dataset and baseline results. Proceedings of the Anais do VI Brazilian Workshop on Social Network Analysis and Mining, Porto Alegre, RS, Brazil.","DOI":"10.5753\/brasnam.2017.3260"},{"key":"ref_69","doi-asserted-by":"crossref","first-page":"113199","DOI":"10.1016\/j.eswa.2020.113199","article-title":"Towards automatically filtering fake news in Portuguese","volume":"146","author":"Silva","year":"2020","journal-title":"Expert Syst. Appl."},{"key":"ref_70","unstructured":"Carosia, A.E.d.O., Silva, A.E.A.d., and Coelho, G.P. (2022). Replication data for: Predicting the Brazilian Stock Market using Sentiment Analysis, Technical Indicators, and Stock Prices. Repos. Dados Pesqui. Unicamp."},{"key":"ref_71","unstructured":"Faria de Azevedo, R., Eduardo Muniz, T.H., Pimentel, C., Jose de Assis Foureaux, G., Caldeira Macedo, B., and Vasconcelos, D.d.L. (2024, January 20). BBRC: Brazilian Banking Regulation Corpora. Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing @ LREC-COLING 2024, Torino, Italy."},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Kudo, T., and Richardson, J. (2018, October 31\u2013November 4). 
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Brussels, Belgium.","DOI":"10.18653\/v1\/D18-2012"},{"key":"ref_73","unstructured":"Erk, K., and Smith, N.A. (2016, January 7\u201312). Neural Machine Translation of Rare Words with Subword Units. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany."},{"key":"ref_74","unstructured":"Gurevych, I., and Miyao, Y. (2018, January 15\u201320). Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia."},{"key":"ref_75","unstructured":"Moens, M.F., Huang, X., Specia, L., and Yih, S.W.T. (2021, January 7\u201311). How to Train BERT with an Academic Budget. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic."},{"key":"ref_76","unstructured":"Micikevicius, P., Narang, S., Alben, J., Diamos, G., Elsen, E., Garcia, D., Ginsburg, B., Houston, M., Kuchaiev, O., and Venkatesh, G. (2018). Mixed Precision Training. arXiv."},{"key":"ref_77","unstructured":"Zong, C., Xia, F., Li, W., and Navigli, R. (2021, January 2\u20135). Optimizing Deeper Transformers on Small Datasets. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online."},{"key":"ref_78","doi-asserted-by":"crossref","first-page":"943","DOI":"10.1017\/S1351324923000438","article-title":"Improving short text classification with augmented data using GPT-3","volume":"30","author":"Balkus","year":"2023","journal-title":"Nat. Lang. 
Eng."},{"key":"ref_79","unstructured":"Chen, C.C., Takamura, H., Mathur, P., Sawhney, R., Huang, H.H., and Chen, H.H. (2023, January 20). Breaking the Bank with ChatGPT: Few-Shot Text Classification for Finance. Proceedings of the Fifth Workshop on Financial Technology and Natural Language Processing and the Second Multimodal AI For Financial Forecasting, Macao, China."},{"key":"ref_80","doi-asserted-by":"crossref","first-page":"60805","DOI":"10.1109\/ACCESS.2022.3180830","article-title":"SciDeBERTa: Learning DeBERTa for Science Technology Documents and Fine-Tuning Information Extraction Tasks","volume":"10","author":"Jeong","year":"2022","journal-title":"IEEE Access"},{"key":"ref_81","unstructured":"Wortsman, M., Liu, P.J., Xiao, L., Everett, K., Alemi, A., Adlam, B., Co-Reyes, J.D., Gur, I., Kumar, A., and Novak, R. (2023). Small-scale proxies for large-scale Transformer training instabilities. arXiv."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/3\/51\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:39:19Z","timestamp":1760027959000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/3\/51"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,21]]},"references-count":81,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,3]]}},"alternative-id":["bdcc9030051"],"URL":"https:\/\/doi.org\/10.3390\/bdcc9030051","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,21]]}}}