{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T18:33:58Z","timestamp":1775586838586,"version":"3.50.1"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2024,4,3]],"date-time":"2024-04-03T00:00:00Z","timestamp":1712102400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,4,3]],"date-time":"2024-04-03T00:00:00Z","timestamp":1712102400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004434","name":"Universit\u00e0 degli Studi di Firenze","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100004434","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Artif Intell Law"],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Most of the existing natural language processing systems for legal texts are developed for the English language. Nevertheless, there are several application domains where multiple versions of the same documents are provided in different languages, especially inside the European Union. One notable example is given by Terms of Service (ToS). In this paper, we compare different approaches to the task of detecting potential unfair clauses in ToS across multiple languages. 
In particular, after developing an annotated corpus and a machine learning classifier for English, we consider and compare several strategies to extend the system to other languages: building a novel corpus and training a novel machine learning system for each language, from scratch; projecting annotations across documents in different languages, to avoid the creation of novel corpora; translating training documents while keeping the original annotations; translating queries at prediction time and relying on the English system only. An extended experimental evaluation conducted on a large, original dataset indicates that the time-consuming task of re-building a novel annotated corpus for each language can often be avoided with no significant degradation in terms of performance.\n<\/jats:p>","DOI":"10.1007\/s10506-024-09398-7","type":"journal-article","created":{"date-parts":[[2024,4,3]],"date-time":"2024-04-03T09:03:06Z","timestamp":1712134986000},"page":"641-689","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Unfair clause detection in terms of service across multiple languages"],"prefix":"10.1007","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9711-7042","authenticated-orcid":false,"given":"Andrea","family":"Galassi","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7083-3487","authenticated-orcid":false,"given":"Francesca","family":"Lagioia","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9622-0651","authenticated-orcid":false,"given":"Agnieszka","family":"Jab\u0142onowska","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9663-1071","authenticated-orcid":false,"given":"Marco","family":"Lippi","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,4,3]]},"reference":[{"key":"9398_CR1","unstructured":"Ajani G (2007) Coherence 
of terminology and search functions. In: 25 years of European Law online: the event: 25 ann\u00e9es de Droit europ\u00e9en en ligne: l\u2019\u00e9v\u00e9nement, Oficina de Publicaciones Oficiales de las Comunidades Europeas, pp 129\u2013136"},{"key":"9398_CR2","doi-asserted-by":"crossref","unstructured":"Bender EM (2011) On achieving and evaluating language-independence in NLP. Linguist Issues Lang Technol 6(3):1\u201328","DOI":"10.33011\/lilt.v6i.1239"},{"key":"9398_CR3","doi-asserted-by":"publisher","unstructured":"Chalkidis I, Fergadiotis M, Malakasiotis P, Aletras N, Androutsopoulos I (2020) LEGAL-BERT: the muppets straight out of law school. In: Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, Online, pp 2898\u20132904. https:\/\/doi.org\/10.18653\/v1\/2020.findings-emnlp.261. https:\/\/aclanthology.org\/2020.findings-emnlp.261","DOI":"10.18653\/v1\/2020.findings-emnlp.261"},{"key":"9398_CR4","doi-asserted-by":"publisher","unstructured":"Chalkidis I, Fergadiotis M, Androutsopoulos I (2021) MultiEURLEX\u2014a multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer. In: Proceedings of the 2021 conference on empirical methods in natural language processing. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, pp 6974\u20136996. https:\/\/doi.org\/10.18653\/v1\/2021.emnlp-main.559. https:\/\/aclanthology.org\/2021.emnlp-main.559","DOI":"10.18653\/v1\/2021.emnlp-main.559"},{"key":"9398_CR5","doi-asserted-by":"publisher","unstructured":"Cotterell R, Heigold G (2017) Cross-lingual character-level neural morphological tagging. In: EMNLP, Copenhagen, Denmark, pp 748\u2013759. https:\/\/doi.org\/10.18653\/v1\/D17-1078","DOI":"10.18653\/v1\/D17-1078"},{"key":"9398_CR6","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: pre-training of deep bidirectional transformers for language understanding. 
arXiv preprint arXiv:1810.04805"},{"key":"9398_CR7","doi-asserted-by":"publisher","unstructured":"Drazewski K, Galassi A, Jab\u0142onowska A, Lagioia F, Lippi M, Micklitz HW, Sartor G, Tagiuri G, Torroni P (2021) A corpus for multilingual analysis of online terms of service. In: Proceedings of the natural legal language processing workshop 2021. Association for Computational Linguistics, Punta Cana, Dominican Republic, pp 1\u20138. https:\/\/doi.org\/10.18653\/v1\/2021.nllp-1.1. https:\/\/aclanthology.org\/2021.nllp-1.1","DOI":"10.18653\/v1\/2021.nllp-1.1"},{"key":"9398_CR8","unstructured":"Eger S, Daxenberger J, Stab C, Gurevych I (2018) Cross-lingual argumentation mining: machine translation (and a bit of projection) is all you need! In: Proceedings of the 27th international conference on computational linguistics. Association for Computational Linguistics, Santa Fe, New Mexico, USA, pp 831\u2013844. https:\/\/aclanthology.org\/C18-1071"},{"key":"9398_CR9","unstructured":"European\u00a0Parliament DGfT (2017) Translation services in the digital world\u2014a sneak peek into the (near) future: Dg trad conference 16\u201317 October 2017. https:\/\/data.europa.eu\/doi\/10.2861\/823102"},{"key":"9398_CR10","doi-asserted-by":"crossref","unstructured":"Feng F, Yang Y, Cer D, Arivazhagan N, Wang W (2022) Language-agnostic BERT sentence embedding. In: ACL (1). Association for Computational Linguistics, pp 878\u2013891","DOI":"10.18653\/v1\/2022.acl-long.62"},{"key":"9398_CR11","doi-asserted-by":"publisher","unstructured":"Galassi A, Drazewski K, Lippi M, Torroni P (2020) Cross-lingual annotation projection in legal texts. In: Proceedings of the 28th international conference on computational linguistics, pp 915\u2013926. https:\/\/doi.org\/10.18653\/v1\/2020.coling-main.79. 
https:\/\/aclanthology.org\/2020.coling-main.79","DOI":"10.18653\/v1\/2020.coling-main.79"},{"key":"9398_CR12","doi-asserted-by":"crossref","unstructured":"Guha N, Nyarko J, Ho DE, Re C, Chilton A, Narayana A, Chohlas-Wood A, Peters A, Waldon B, Rockmore D, Zambrano D, Talisman D, Hoque E, Surani F, Fagan F, Sarfaty G, Dickinson GM, Porat H, Hegland J, Wu J, Nudell J, Niklaus J, Nay JJ, Choi JH, Tobia K, Hagan M, Ma M, Livermore M, Rasumov-Rahe N, Holzenberger N, Kolt N, Henderson P, Rehaag S, Goel S, Gao S, Williams S, Gandhi S, Zur T, Iyer V, Li Z (2023) Legalbench: a collaboratively built benchmark for measuring legal reasoning in large language models. In: Thirty-seventh conference on neural information processing systems datasets and benchmarks track. https:\/\/openreview.net\/forum?id=WqSPQFxFRC","DOI":"10.2139\/ssrn.4583531"},{"key":"9398_CR13","unstructured":"Isbister T, Carlsson F, Sahlgren M (2021) Should we stop training more monolingual models, and simply use machine translation instead? In: Proceedings of the 23rd Nordic conference on computational linguistics (NoDaLiDa). Link\u00f6ping University Electronic Press, Sweden, Reykjavik, Iceland (Online), pp 385\u2013390. https:\/\/aclanthology.org\/2021.nodalida-main.42"},{"key":"9398_CR14","doi-asserted-by":"crossref","unstructured":"Jab\u0142onowska A, Lagioia F, Lippi M, Micklitz HW, Sartor G, Tagiuri G (2021) Assessing the cross-market generalization capability of the Claudette system. In: Legal knowledge and information systems. IOS Press, pp 62\u201367","DOI":"10.3233\/FAIA210318"},{"key":"9398_CR15","doi-asserted-by":"publisher","unstructured":"Kim JK, Kim YB, Sarikaya R, Fosler-Lussier E (2017) Cross-lingual transfer learning for POS tagging without cross-lingual resources. In: EMNLP, Copenhagen, Denmark, pp 2832\u20132838. 
https:\/\/doi.org\/10.18653\/v1\/D17-1302","DOI":"10.18653\/v1\/D17-1302"},{"key":"9398_CR16","unstructured":"Lample G, Conneau A, Ranzato M, Denoyer L, J\u00e9gou H (2018) Word translation without parallel data. In: 6th International conference on learning representations, ICLR 2018, Vancouver, BC, Canada, April 30\u2013May 3, 2018, conference track proceedings, OpenReview.net. https:\/\/openreview.net\/forum?id=H196sainb"},{"issue":"2","key":"9398_CR17","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1007\/s10506-019-09243-2","volume":"27","author":"M Lippi","year":"2019","unstructured":"Lippi M, Pa\u0142ka P, Contissa G, Lagioia F, Micklitz HW, Sartor G, Torroni P (2019) Claudette: an automated detector of potentially unfair clauses in online terms of service. Artif Intell Law 27(2):117\u2013139. https:\/\/doi.org\/10.1007\/s10506-019-09243-2","journal-title":"Artif Intell Law"},{"key":"9398_CR18","unstructured":"Loos MB (2017) Double Dutch-on the role of the transparency requirement with regard to the language in which standard contract terms for b2c-contracts must be drafted. J Eur Consum Mark Law 6(2). https:\/\/kluwerlawonline.com\/journalarticle\/Journal+of+European+Consumer+and+Market+Law\/6.2\/EuCML2017014"},{"key":"9398_CR19","doi-asserted-by":"publisher","unstructured":"Mielke SJ, Cotterell R, Gorman K, Roark B, Eisner J (2019) What kind of language is hard to language-model? In: Proceedings of the 57th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, pp 4975\u20134989. https:\/\/doi.org\/10.18653\/v1\/P19-1491. https:\/\/aclanthology.org\/P19-1491","DOI":"10.18653\/v1\/P19-1491"},{"key":"9398_CR20","doi-asserted-by":"crossref","unstructured":"Niklaus J, Matoshi V, Rani P, Galassi A, St\u00fcrmer M, Chalkidis I (2023) LEXTREME: a multi-lingual and multi-task benchmark for the legal domain. 
In: Bouamor H, Pino J, Bali K (eds) Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6\u201310, 2023. Association for Computational Linguistics, pp 3016\u20133054. https:\/\/aclanthology.org\/2023.findings-emnlp.200","DOI":"10.18653\/v1\/2023.findings-emnlp.200"},{"key":"9398_CR21","doi-asserted-by":"crossref","unstructured":"Per\u00e7in S, Galassi A, Lagioia F, Ruggeri F, Santin P, Sartor G, Torroni P (2022) Combining WordNet and word embeddings in data augmentation for legal texts. In: Proceedings of the natural legal language processing workshop 2022. Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Hybrid), pp 47\u201352. https:\/\/aclanthology.org\/2022.nllp-1.4","DOI":"10.18653\/v1\/2022.nllp-1.4"},{"key":"9398_CR22","doi-asserted-by":"publisher","unstructured":"Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: NAACL-HLT. Association for Computational Linguistics, New Orleans, Louisiana, pp 2227\u20132237. https:\/\/doi.org\/10.18653\/v1\/N18-1202. https:\/\/aclanthology.org\/N18-1202","DOI":"10.18653\/v1\/N18-1202"},{"key":"9398_CR23","doi-asserted-by":"publisher","first-page":"1185","DOI":"10.54648\/ERPL2012075","volume":"20","author":"B Pozzo","year":"2012","unstructured":"Pozzo B (2012) Multilingualism and the harmonization of European private law: problems and perspectives. Eur Rev Private L 20:1185","journal-title":"Eur Rev Private L"},{"key":"9398_CR24","doi-asserted-by":"crossref","unstructured":"Pozzo B (2016) The challenges of a multi-lingual approach. In: Research handbook on EU consumer and contract law, pp 138\u2013158","DOI":"10.4337\/9781782547372.00013"},{"key":"9398_CR25","unstructured":"Rivera\u00a0Pastor R, Tar\u00edn\u00a0Quir\u00f3s C, Villar\u00a0Garc\u00eda JP, Badia\u00a0Card\u00fas T, Melero Nogu\u00e9s M (2017) Language equality in the digital age: towards a human language project. 
www.europarl.europa.eu\/RegData\/etudes\/STUD\/2017\/598621\/EPRS_STU(2017)598621_EN.pdf"},{"key":"9398_CR26","doi-asserted-by":"publisher","unstructured":"Rocha G, Stab C, Lopes\u00a0Cardoso H, Gurevych I (2018) Cross-lingual argumentative relation identification: from English to Portuguese. In: Slonim N, Aharonov R (eds) Proceedings of the 5th workshop on argument mining. Association for Computational Linguistics, Brussels, Belgium, pp 144\u2013154. https:\/\/doi.org\/10.18653\/v1\/W18-5217. https:\/\/aclanthology.org\/W18-5217","DOI":"10.18653\/v1\/W18-5217"},{"key":"9398_CR27","doi-asserted-by":"publisher","DOI":"10.1007\/s10506-021-09288-2","author":"F Ruggeri","year":"2021","unstructured":"Ruggeri F, Lagioia F, Lippi M, Torroni P (2021) Detecting and explaining unfairness in consumer contracts through memory networks. Artif Intell Law. https:\/\/doi.org\/10.1007\/s10506-021-09288-2","journal-title":"Artif Intell Law"},{"key":"9398_CR28","unstructured":"Sakoe H (1971) Dynamic-programming approach to continuous speech recognition. In: 1971 Proceedings of the international congress of acoustics, Budapest, Budapest, Hungary"},{"key":"9398_CR29","volume-title":"Learning with kernels: support vector machines, regularization, optimization, and beyond","author":"B Sch\u00f6lkopf","year":"2002","unstructured":"Sch\u00f6lkopf B, Smola AJ, Bach F et al (2002) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, Cambridge"},{"key":"9398_CR30","doi-asserted-by":"publisher","first-page":"73","DOI":"10.4236\/blr.2012.33010","volume":"3","author":"D Tiscornia","year":"2012","unstructured":"Tiscornia D, Sagri MT (2012) Legal concepts and multilingual contexts in digital information. Beijing L Rev 3:73","journal-title":"Beijing L Rev"},{"key":"9398_CR31","unstructured":"Wei J, Tay Y, Bommasani R, Raffel C, Zoph B, Borgeaud S, Yogatama D, Bosma M, Zhou D, Metzler D, et\u00a0al (2022) Emergent abilities of large language models. 
arXiv preprint arXiv:2206.07682"},{"issue":"January 2000","key":"9398_CR32","first-page":"95","volume":"116","author":"S Whittaker","year":"2000","unstructured":"Whittaker S (2000) Unfair contract terms, public services and the construction of a European conception of contract. Law Q Rev 116(January 2000):95\u2013120","journal-title":"Law Q Rev"},{"key":"9398_CR33","doi-asserted-by":"publisher","unstructured":"Xu R, Yang Y, Otani N, Wu Y (2018) Unsupervised cross-lingual transfer of word embedding spaces. In: EMNLP, Brussels, Belgium, pp 2465\u20132474. https:\/\/doi.org\/10.18653\/v1\/D18-1268","DOI":"10.18653\/v1\/D18-1268"},{"key":"9398_CR34","doi-asserted-by":"publisher","unstructured":"Zhang Y, Gaddy D, Barzilay R, Jaakkola T (2016) Ten pairs to tag\u2014multilingual POS tagging via coarse mapping between embeddings. In: HLT-NAACL, San Diego, California, pp 1307\u20131317. https:\/\/doi.org\/10.18653\/v1\/N16-1156","DOI":"10.18653\/v1\/N16-1156"}],"container-title":["Artificial Intelligence and 
Law"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10506-024-09398-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10506-024-09398-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10506-024-09398-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,23]],"date-time":"2025-09-23T08:05:40Z","timestamp":1758614740000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10506-024-09398-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,3]]},"references-count":34,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,9]]}},"alternative-id":["9398"],"URL":"https:\/\/doi.org\/10.1007\/s10506-024-09398-7","relation":{},"ISSN":["0924-8463","1572-8382"],"issn-type":[{"value":"0924-8463","type":"print"},{"value":"1572-8382","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,3]]},"assertion":[{"value":"25 February 2024","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 April 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}