{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:21:30Z","timestamp":1760239290678,"version":"build-2065373602"},"reference-count":40,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2020,10,15]],"date-time":"2020-10-15T00:00:00Z","timestamp":1602720000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["UID\/CEC\/50021\/ 2019 and INCoDe 2030"],"award-info":[{"award-number":["UID\/CEC\/50021\/ 2019 and INCoDe 2030"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Two sentences can be related in many different ways. Distinct tasks in natural language processing aim to identify different semantic relations between sentences. We developed several models for natural language inference and semantic textual similarity for the Portuguese language. We took advantage of pre-trained models (BERT); additionally, we studied the roles of lexical features. We tested our models in several datasets\u2014ASSIN, SICK-BR and ASSIN2\u2014and the best results were usually achieved with ptBERT-Large, trained in a Brazilian corpus and tuned in the latter datasets. Besides obtaining state-of-the-art results, this is, to the best of our knowledge, the most all-inclusive study about natural language inference and semantic textual similarity for the Portuguese language.<\/jats:p>","DOI":"10.3390\/info11100484","type":"journal-article","created":{"date-parts":[[2020,10,15]],"date-time":"2020-10-15T09:02:03Z","timestamp":1602752523000},"page":"484","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Benchmarking Natural Language Inference and Semantic Textual Similarity for Portuguese"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5507-5355","authenticated-orcid":false,"given":"Pedro","family":"Fialho","sequence":"first","affiliation":[{"name":"INESC-ID, Rua Alves Redol 9, 1000-029 Lisboa, Portugal"},{"name":"Departamento de Inform\u00e1tica, Universidade de \u00c9vora, Rua Rom\u00e3o Ramalho, 59 7000-671 \u00c9vora, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2456-5028","authenticated-orcid":false,"given":"Lu\u00edsa","family":"Coheur","sequence":"additional","affiliation":[{"name":"INESC-ID, Rua Alves Redol 9, 1000-029 Lisboa, Portugal"},{"name":"Instituto Superior T\u00e9cnico, Universidade de Lisboa, Av. Rovisco Pais, 1 1049-001 Lisboa, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5086-059X","authenticated-orcid":false,"given":"Paulo","family":"Quaresma","sequence":"additional","affiliation":[{"name":"INESC-ID, Rua Alves Redol 9, 1000-029 Lisboa, Portugal"},{"name":"Departamento de Inform\u00e1tica, Universidade de \u00c9vora, Rua Rom\u00e3o Ramalho, 59 7000-671 \u00c9vora, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2020,10,15]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"i","DOI":"10.1017\/S1351324909990209","article-title":"Recognizing textual entailment: Rational, evaluation and approaches","volume":"15","author":"Dagan","year":"2009","journal-title":"Nat. Lang. Eng."},{"key":"ref_2","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018, January 1\u20136). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), New Orleans, LA, USA."},{"key":"ref_3","unstructured":"Rodrigues, R., Couto, P., and Rodrigues, I. (2020, October 01). IPR: The Semantic Textual Similarity and Recognizing Textual Entailment Systems. Available online: http:\/\/ceur-ws.org\/Vol-2583\/4_IPR.pdf."},{"key":"ref_4","unstructured":"Cabezudo, M.A.S., In\u00e1cio, M., Rodrigues, A.C., Casanova, E., and de Sousa, R.F. (2020, October 01). NILC at ASSIN 2: Exploring Multilingual Approaches. Available online: http:\/\/ceur-ws.org\/Vol-2583\/5_NILC.pdf."},{"key":"ref_5","unstructured":"Pires, T., Schlinger, E., and Garrette, D. (August, January 28). How Multilingual is Multilingual BERT?. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy."},{"key":"ref_6","first-page":"409","article-title":"Benchmarking Applied Semantic Inference: The PASCAL Recognising Textual Entailment Challenges","volume":"Volume 8001","author":"Dershowitz","year":"2014","journal-title":"Language, Culture, Computation. Computing\u2014Theory and Technology\u2014Essays Dedicated to Yaacov Choueka on the Occasion of His 75th Birthday, Part I"},{"key":"ref_7","unstructured":"Agirre, E., Diab, M., Cer, D., and Gonzalez-Agirre, A. (2012, January 7\u20138). SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity. Proceedings of the First Joint Conference on Lexical and Computational Semantics\u2014Volume 1: Proceedings of the Main Conference and the Shared Task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation, Montr\u00e9al, QC, Canada."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Cer, D., Diab, M., Agirre, E., Lopez-Gazpio, I., and Specia, L. (2017, January 3\u20134). SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation. Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, BC, Canada.","DOI":"10.18653\/v1\/S17-2001"},{"key":"ref_9","first-page":"3","article-title":"Vis\u00e3o Geral da Avalia\u00e7\u00e3o de Similaridade Sem\u00e2ntica e Infer\u00eancia Textual","volume":"8","author":"Fonseca","year":"2016","journal-title":"Linguam\u00e1tica"},{"key":"ref_10","unstructured":"Marelli, M., Menini, S., Baroni, M., Bentivogli, L., Bernardi, R., and Zamparelli, R. (2014, January 26\u201331). A SICK cure for the evaluation of compositional distributional semantic models. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC-2014), Reykjavik, Iceland."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Villavicencio, A., Moreira, V., Abad, A., Caseli, H., Gamallo, P., Ramisch, C., Gon\u00e7alo Oliveira, H., and Paetzold, G.H. (2018). SICK-BR: A Portuguese Corpus for Inference. Computational Processing of the Portuguese Language, Springer International Publishing.","DOI":"10.1007\/978-3-319-99722-3"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Quaresma, P., Vieira, R., Alu\u00edsio, S., Moniz, H., Batista, F., and Gon\u00e7alves, T. (2020). The ASSIN 2 Shared Task: A Quick Overview. Computational Processing of the Portuguese Language, Springer International Publishing.","DOI":"10.1007\/978-3-030-41505-1"},{"key":"ref_13","first-page":"33","article-title":"INESC-ID@ASSIN: Medi\u00e7\u00e3o de Similaridade Sem\u00e2ntica e Reconhecimento de Infer\u00eancia Textual","volume":"8","author":"Fialho","year":"2016","journal-title":"Linguam\u00e1tica"},{"key":"ref_14","unstructured":"Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., and Weinberger, K.Q. (2013). Distributed Representations of Words and Phrases and their Compositionality. Advances in Neural Information Processing Systems 26, Curran Associates."},{"key":"ref_15","first-page":"15","article-title":"Blue Man Group no ASSIN: Usando Representa\u00e7\u00f5es Distribu\u00eddas para Similaridade Sem\u00e2ntica e Infer\u00eancia Textual","volume":"8","author":"Barbosa","year":"2016","journal-title":"Linguam\u00e1tica"},{"key":"ref_16","first-page":"59","article-title":"Solo Queue at ASSIN: Combinando Abordagens Tradicionais e Emergentes","volume":"8","author":"Hartmann","year":"2016","journal-title":"Linguam\u00e1tica"},{"key":"ref_17","first-page":"43","article-title":"ASAPP: Alinhamento Sem\u00e2ntico Autom\u00e1tico de Palavras aplicado ao Portugu\u00eas","volume":"8","author":"Rodrigues","year":"2016","journal-title":"Linguam\u00e1tica"},{"key":"ref_18","first-page":"23","article-title":"FlexSTS: Um Framework para Similaridade Sem\u00e2ntica Textual","volume":"8","author":"Freire","year":"2016","journal-title":"Linguam\u00e1tica"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Pinheiro, A., Ferreira, R., Ferreira, M.A.D., Rolim, V.B., and Ten\u00f3rio, J.V.S. (2017, January 2\u20135). Statistical and Semantic Features to Measure Sentence Similarity in Portuguese. Proceedings of the 2017 Brazilian Conference on Intelligent Systems (BRACIS), Uberl\u00e2ndia, Brazil.","DOI":"10.1109\/BRACIS.2017.40"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Rocha, G., and Cardoso, H.L. (2018). Recognizing Textual Entailment: Challenges in the Portuguese Language. Information, 9.","DOI":"10.3390\/info9040076"},{"key":"ref_21","first-page":"12:1","article-title":"ASAPP 2.0: Advancing the state-of-the-art of semantic textual similarity for Portuguese","volume":"Volume 62","author":"Henriques","year":"2018","journal-title":"Proceedings of the 7th Symposium on Languages, Applications and Technologies (SLATE 2018)"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., and Bowman, S. (2018, January 1). GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, Brussels, Belgium.","DOI":"10.18653\/v1\/W18-5446"},{"key":"ref_23","unstructured":"Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (2017). Attention is All you Need. Advances in Neural Information Processing Systems 30, Curran Associates."},{"key":"ref_24","unstructured":"Rodrigues, R.C., da Silva, J.R., de Castro, P.V.Q., da Silva, N.F.F., and da Silva Soares, A. (2020, October 10). Multilingual Transformer Ensembles for Portuguese Natural Language Tasks. Available online: http:\/\/ceur-ws.org\/Vol-2583\/3_DLB.pdf."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Papineni, K., Roukos, S., Ward, T., and Zhu, W.J. (2002, January 7\u201312). Bleu: A Method for Automatic Evaluation of Machine Translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA.","DOI":"10.3115\/1073083.1073135"},{"key":"ref_26","unstructured":"De Paiva, V., Rademaker, A., and de Melo, G. (2012, January 8\u201315). OpenWordNet-PT: An Open Brazilian Wordnet for Reasoning. Proceedings of the COLING 2012: Demonstration Papers, Mumbai, India."},{"key":"ref_27","unstructured":"Jawahar, G., Sagot, B., and Seddah, D. (August, January 28). What Does BERT Learn about the Structure of Language?. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy."},{"key":"ref_28","unstructured":"Tenney, I., Das, D., and Pavlick, E. (August, January 28). BERT Rediscovers the Classical NLP Pipeline. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Liu, N.F., Gardner, M., Belinkov, Y., Peters, M.E., and Smith, N.A. (2018, January 1\u20136). Linguistic Knowledge and Transferability of Contextual Representations. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), New Orleans, LA, USA.","DOI":"10.18653\/v1\/N19-1112"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1111\/j.1469-8137.1912.tb05611.x","article-title":"THE DISTRIBUTION OF THE FLORA IN THE ALPINE ZONE.1","volume":"11","author":"Jaccard","year":"1912","journal-title":"New Phytol."},{"key":"ref_31","unstructured":"Kingma, D.P., and Ba, J. (2015, January 7\u20139). Adam: A Method for Stochastic Optimization. Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA."},{"key":"ref_32","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zadrozny, B., and Elkan, C. (2002, January 23\u201326). Transforming Classifier Scores into Accurate Multiclass Probability Estimates. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, AB, Canada.","DOI":"10.1145\/775047.775151"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Smola, A., Bartlett, P., Schoelkopf, B., and Schuurmans, D. (2000). Probabilistic outputs for support vector machines and comparison to regularize likelihood methods. Advances in Large Margin Classifiers, MIT Press.","DOI":"10.7551\/mitpress\/1113.001.0001"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Marelli, M., Bentivogli, L., Baroni, M., Bernardi, R., Menini, S., and Zamparelli, R. (2014, January 23\u201324). SemEval-2014 Task 1: Evaluation of Compositional Distributional Semantic Models on Full Sentences through Semantic Relatedness and Textual Entailment. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin, Ireland.","DOI":"10.3115\/v1\/S14-2001"},{"key":"ref_36","unstructured":"De Souza, J.V.A., e Oliveira, L.E.S., Gumiel, Y.B., Carvalho, D.R., and Moro, C.M.C. (2020, October 01). Incorporating Multiple Feature Groups to a Siamese Neural Network for Semantic Textual Similarity Task in Portuguese Texts. Available online: http:\/\/ceur-ws.org\/Vol-2583\/6_PUCPR.pdf."},{"key":"ref_37","unstructured":"Santos, J., Alves, A., and Oliveira, H.G. (2020, October 01). ASAPPpy: A Python Framework for Portuguese STS. Available online: http:\/\/ceur-ws.org\/Vol-2583\/2_ASAPPpy.pdf."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"945","DOI":"10.3844\/jcssp.2018.945.956","article-title":"Enhancing Brazilian Portuguese Textual Entailment Recognition with a Hybrid Approach","volume":"14","author":"Silva","year":"2018","journal-title":"J. Comput. Sci."},{"key":"ref_39","unstructured":"Fonseca, E., and Alvarenga, J.P.R. (2019, January 15). Multilingual Transformer Ensembles for Portuguese Natural Language Tasks. Proceedings of the ASSIN 2 Shared Task: Evaluating Semantic Textual Similarity and Textual Entailment in Portuguese Co-Located with XII Symposium in Information and Human Language Technology (STIL 2019), Salvador, Brazil."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Kalouli, A.L., Buis, A., Real, L., Palmer, M., and de Paiva, V. (2019, January 1). Explaining Simple Natural Language Inference. Proceedings of the 13th Linguistic Annotation Workshop, Florence, Italy.","DOI":"10.18653\/v1\/W19-4016"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/11\/10\/484\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:21:51Z","timestamp":1760178111000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/11\/10\/484"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,15]]},"references-count":40,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2020,10]]}},"alternative-id":["info11100484"],"URL":"https:\/\/doi.org\/10.3390\/info11100484","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2020,10,15]]}}}