{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:57:12Z","timestamp":1760241432482,"version":"build-2065373602"},"reference-count":46,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2018,3,29]],"date-time":"2018-03-29T00:00:00Z","timestamp":1522281600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Recognizing textual entailment comprises the task of determining semantic entailment relations between text fragments. A text fragment entails another text fragment if, from the meaning of the former, one can infer the meaning of the latter. If such a relation is bidirectional, then we are in the presence of a paraphrase. Automatically recognizing textual entailment relations captures major semantic inference needs in several natural language processing (NLP) applications. As in many NLP tasks, textual entailment corpora for English abound, while the same is not true for more resource-scarce languages such as Portuguese. Exploiting what seems to be the only Portuguese corpus for textual entailment and paraphrases (the ASSIN corpus), in this paper, we address the task of automatically recognizing textual entailment (RTE) and paraphrases from text written in the Portuguese language, by employing supervised machine learning techniques. We employ lexical, syntactic and semantic features, and analyze the impact of using semantic-based approaches on the performance of the system. We then try to take advantage of the bi-dialect nature of ASSIN to compensate for its limited size. With the same aim, we explore modeling the task of recognizing textual entailment and paraphrases as a binary classification problem by considering the bidirectional nature of paraphrases as entailment relationships. 
Addressing the task as a multi-class classification problem, we achieve results in line with the winner of the ASSIN Challenge. In addition, we conclude that semantic-based approaches are promising in this task, and that combining data from European and Brazilian Portuguese is less straightforward than it may initially seem. The binary classification modeling of the problem does not seem to bring advantages to the original multi-class model, despite the outstanding results obtained by the binary classifier for recognizing textual entailments.<\/jats:p>","DOI":"10.3390\/info9040076","type":"journal-article","created":{"date-parts":[[2018,3,29]],"date-time":"2018-03-29T12:51:56Z","timestamp":1522327916000},"page":"76","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Recognizing Textual Entailment: Challenges in the Portuguese Language"],"prefix":"10.3390","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8252-7292","authenticated-orcid":false,"given":"Gil","family":"Rocha","sequence":"first","affiliation":[{"name":"LIACC\/DEI, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1252-7515","authenticated-orcid":false,"given":"Henrique","family":"Lopes Cardoso","sequence":"additional","affiliation":[{"name":"LIACC\/DEI, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2018,3,29]]},"reference":[{"key":"ref_1","unstructured":"Vorobej, M. (2009). 
A Theory of Argument, Cambridge University Press."},{"key":"ref_2","first-page":"1","article-title":"Consensus and objectivity of legal argumentation","volume":"Volume 423","author":"Araszkiewicz","year":"2012","journal-title":"Argumentation 2012: International Conference on Alternative Methods of Argumentation in Law: Conference Proceedings"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Dagan, I., Roth, D., Sammons, M., and Zanzotto, F.M. (2013). Recognizing Textual Entailment: Models and Applications, Morgan & Claypool Publishers. Synthesis Lectures on Human Language Technologies.","DOI":"10.1007\/978-3-031-02151-0"},{"key":"ref_4","first-page":"208","article-title":"Combining Textual Entailment and Argumentation Theory for Supporting Online Debates Interactions","volume":"Volume 2","author":"Cabrio","year":"2012","journal-title":"Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers"},{"key":"ref_5","unstructured":"Bikel, D.M., and Zitouni, I. (2012). Recognizing Textual Entailment. Multilingual Natural Language Applications: From Theory to Practice, Prentice Hall."},{"key":"ref_6","first-page":"135","article-title":"A Survey of Paraphrasing and Textual Entailment Methods","volume":"38","author":"Androutsopoulos","year":"2010","journal-title":"J. Artif. Int. Res."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Dagan, I., Glickman, O., and Magnini, B. (2006). The PASCAL Recognising Textual Entailment Challenge. Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Tectual Entailment, Springer. Lecture Notes in Computer Science.","DOI":"10.1007\/11736790_9"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1162\/coli.2007.33.1.41","article-title":"Question Answering in Restricted Domains: An Overview","volume":"33","author":"Vicedo","year":"2007","journal-title":"Comput. 
Linguist."},{"key":"ref_9","unstructured":"Moens, M.F. (2009). Information Extraction: Algorithms and Prospects in a Retrieval Context, Springer."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1162\/coli_a_00002","article-title":"Generating Phrasal and Sentential Paraphrases: A Survey of Data-driven Methods","volume":"36","author":"Madnani","year":"2010","journal-title":"Comput. Linguist."},{"key":"ref_11","first-page":"297","article-title":"Robust Machine Translation Evaluation with Entailment Features","volume":"Volume 1","author":"Galley","year":"2009","journal-title":"Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"10:1","DOI":"10.1145\/2850417","article-title":"Argumentation Mining: State of the Art and Emerging Trends","volume":"16","author":"Lippi","year":"2016","journal-title":"ACM Trans. Internet Technol."},{"key":"ref_13","unstructured":"Rocha, G., Lopes Cardoso, H., and Teixeira, J. (2016, January 13\u201315). ArgMine: A Framework for Argumentation Mining. Proceedings of the 12th International Conference on Computational Processing of the Portuguese Language, Student Research Workshop, Tomar, Portugal."},{"key":"ref_14","unstructured":"Fonseca, E., Santos, L., Criscuolo, M., and Alu\u00edsio, S. (2016, January 13\u201315). ASSIN: Avalia\u00e7\u00e3o de Similaridade Sem\u00e2ntica e Infer\u00eancia Textual. Proceedings of the 12th International Conference on Computational Processing of the Portuguese Language, Tomar, Portugal."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Fellbaum, C. (1998). WordNet: An Electronic Lexical Database, MIT Press. Language, Speech, and Communication.","DOI":"10.7551\/mitpress\/7287.001.0001"},{"key":"ref_16","unstructured":"Rockt\u00e4schel, T., Grefenstette, E., Hermann, K.M., Kocisk\u00fd, T., and Blunsom, P. 
(arXiv, 2015). Reasoning about Entailment with Neural Attention, arXiv."},{"key":"ref_17","unstructured":"De Marneffe, M.C., Rafferty, A.N., and Manning, C.D. (2008, January 15\u201320). Finding Contradictions in Text. Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, Columbus, OH, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Lai, A., and Hockenmaier, J. (2014, January 23\u201324). Illinois-LH: A Denotational and Distributional Approach to Semantics. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin, Ireland.","DOI":"10.3115\/v1\/S14-2055"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Bos, J., and Markert, K. (2005, January 6\u20138). Recognising Textual Entailment with Logical Inference. Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, Vancouver, BC, Canada.","DOI":"10.3115\/1220575.1220654"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"763","DOI":"10.1162\/COLI_a_00266","article-title":"Representing Meaning with a Combination of Logical and Distributional Models","volume":"42","author":"Beltagy","year":"2016","journal-title":"Comput. Linguist."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Pakray, P., Neogi, S., Bhaskar, P., Poria, S., Bandyopadhyay, S., and Gelbukh, A.F. (2011, January 14\u201315). A Textual Entailment System Using Anaphora Resolution. Proceedings of the Text Analysis Conference (TAC), Gaithersburg, MD, USA.","DOI":"10.1109\/ICACTE.2010.5579163"},{"key":"ref_22","unstructured":"Bentivogli, L., Dagan, I., Dang, H.T., Giampiccolo, D., and Magnini, B. (2009, January 16\u201317). Fifth PASCAL Recognizing Textual Entailment Challenge. Proceedings of the Text Analysis Conference, Gaithersburg, MD, USA."},{"key":"ref_23","unstructured":"Nakov, P., and Zesch, T. (2014, January 23\u201324). 
SemEval-2014 Task 1: Evaluation of Compositional Distributional Semantic Models on Full Sentences through Semantic Relatedness and Textual Entailment. Proceedings of the 8th International Workshop on Semantic Evaluation, COLING, Dublin, Ireland."},{"key":"ref_24","unstructured":"Chair, N.C.C., Choukri, K., Declerck, T., Loftsson, H., Maegaard, B., Mariani, J., Moreno, A., Odijk, J., and Piperidis, S. (2014, January 26\u201331). A SICK Cure for the Evaluation of Compositional Distributional Semantic Models. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC\u201914), Reykjavik, Iceland."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Nangia, N., Williams, A., Lazaridou, A., and Bowman, S.R. (arXiv, 2017). The RepEval 2017 Shared Task: Multi-Genre Natural Language Inference with Sentence Representations, arXiv.","DOI":"10.18653\/v1\/W17-5301"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Williams, A., Nangia, N., and Bowman, S.R. (arXiv, 2017). A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, arXiv.","DOI":"10.18653\/v1\/N18-1101"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Bowman, S.R., Angeli, G., Potts, C., and Manning, C.D. (arXiv, 2015). A large annotated corpus for learning natural language inference, arXiv.","DOI":"10.18653\/v1\/D15-1075"},{"key":"ref_28","unstructured":"Papineni, K., Roukos, S., Ward, T., and Zhu, W.J. (2012, January 7\u201312). BLEU: A Method for Automatic Evaluation of Machine Translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Magnini, B., Zanoli, R., Dagan, I., Eichler, K., Neumann, G., Noh, T., Pad\u00f3, S., Stern, A., and Levy, O. (2014, January 22\u201327). The Excitement Open Platform for Textual Inferences. 
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, MD, USA.","DOI":"10.3115\/v1\/P14-5008"},{"key":"ref_30","first-page":"59","article-title":"Solo Queue at ASSIN: Combinando Abordagens Tradicionais e Emergentes","volume":"8","author":"Hartmann","year":"2016","journal-title":"Linguam\u00e1tica"},{"key":"ref_31","unstructured":"Sparck Jones, K. (1988). Chapter A Statistical Interpretation of Term Specificity and Its Application in Retrieval. Document Retrieval Systems, Taylor Graham Publishing."},{"key":"ref_32","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G., and Dean, J. (2013, January 5\u201310). Distributed Representations of Words and Phrases and Their Compositionality. Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_33","first-page":"33","article-title":"INESC-ID@ASSIN: Medi\u00e7\u00e3o de Similaridade Sem\u00e2ntica e Reconhecimento de Infer\u00eancia Textual","volume":"8","author":"Fialho","year":"2016","journal-title":"Linguam\u00e1tica"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Lin, C.Y., and Och, F.J. (2004, January 21\u201326). Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-bigram Statistics. Proceedings of the 42nd Annual Meeting Association for Computational Linguistics, Barcelona, Spain.","DOI":"10.3115\/1218955.1219032"},{"key":"ref_35","first-page":"43","article-title":"ASAPP: Alinhamento Sem\u00e2ntico Autom\u00e1tico de Palavras aplicado ao Portugu\u00eas","volume":"8","author":"Rodrigues","year":"2016","journal-title":"Linguam\u00e1tica"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-Term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_37","unstructured":"Bowman, S.R. (2016). 
Modeling Natural Language Semantics in Learned Representations. [Ph.D. Thesis, Stanford University]."},{"key":"ref_38","unstructured":"Consortium, T.F., Cooper, R., Crouch, D., Eijck, J.V., Fox, C., Genabith, J.V., Jaspars, J., Kamp, H., Milward, D., and Pinkal, M. (2018, March 28). Using the Framework;. Available online: https:\/\/files.ifi.uzh.ch\/cl\/hess\/classes\/seminare\/interface\/framework.pdf."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1162\/tacl_a_00166","article-title":"From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions","volume":"2","author":"Young","year":"2014","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Levy, O., Dagan, I., and Goldberger, J. (2014, January 26\u201327). Focused Entailment Graphs for Open IE Propositions. Proceedings of the Eighteenth Conference on Computational Natural Language Learning, Baltimore, MD, USA.","DOI":"10.3115\/v1\/W14-1610"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Pavlick, E., Rastogi, P., Ganitkevitch, J., Durme, B.V., and Callison-Burch, C. (2015, January 26\u201331). PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Beijing, China.","DOI":"10.3115\/v1\/P15-2070"},{"key":"ref_42","first-page":"993","article-title":"Latent Dirichlet Allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1007\/978-3-319-27653-3_7","article-title":"Yet Another Suite of Multilingual NLP Tools","volume":"Volume 563","year":"2015","journal-title":"Languages, Applications and Technologies. 
Communications in Computer and Information Science"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Silva, J., Ribeiro, R., Quaresma, P., Adami, A., and Branco, A. (2016, January 13\u201315). CONTO.PT: Groundwork for the Automatic Creation of a Fuzzy Portuguese Wordnet. Proceedings of the 12th International Conference on Computational Processing of the Portuguese Language, Tomar, Portugal.","DOI":"10.1007\/978-3-319-41552-9"},{"key":"ref_45","unstructured":"Al-Rfou, R., Perozzi, B., and Skiena, S. (2013, January 8\u20139). Polyglot: Distributed Word Representations for Multilingual NLP. Proceedings of the Seventeenth Conference on Computational Natural Language Learning, Sofia, Bulgaria."},{"key":"ref_46","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/9\/4\/76\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T14:59:00Z","timestamp":1760194740000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/9\/4\/76"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,3,29]]},"references-count":46,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2018,4]]}},"alternative-id":["info9040076"],"URL":"https:\/\/doi.org\/10.3390\/info9040076","relation":{},"ISSN":["2078-2489"],"issn-type":[{"type":"electronic","value":"2078-2489"}],"subject":[],"published":{"date-parts":[[2018,3,29]]}}}