{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T23:38:06Z","timestamp":1776209886479,"version":"3.50.1"},"reference-count":66,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T00:00:00Z","timestamp":1687824000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T00:00:00Z","timestamp":1687824000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Artif Intell Law"],"published-print":{"date-parts":[[2024,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Legal documents, like contracts or laws, are subject to interpretation. Different people can have different interpretations of the very same document. Large parts of judicial branches all over the world are concerned with settling disagreements that arise, in part, from these different interpretations. In this context, it only seems natural that during the annotation of legal machine learning data sets, disagreement, how to report it, and how to handle it should play an important role. This article presents an analysis of the current state-of-the-art in the annotation of legal machine learning data sets. The results of the analysis show that all of the analysed data sets remove all traces of disagreement, instead of trying to utilise the information that might be contained in conflicting annotations. Additionally, the publications introducing the data sets often provide little information about the process that derives the \u201cgold standard\u201d from the initial annotations, often making it difficult to judge the reliability of the annotation process. 
Based on the state-of-the-art, the article provides easily implementable suggestions on how to improve the handling and reporting of disagreement in the annotation of legal machine learning data sets.<\/jats:p>","DOI":"10.1007\/s10506-023-09369-4","type":"journal-article","created":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T08:03:36Z","timestamp":1687853016000},"page":"839-862","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["I beg to differ: how disagreement is handled in the annotation of legal machine learning data sets"],"prefix":"10.1007","volume":"32","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8120-3368","authenticated-orcid":false,"given":"Daniel","family":"Braun","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,6,27]]},"reference":[{"key":"9369_CR1","doi-asserted-by":"publisher","unstructured":"Akhtar S, Basile V, Patti V (2020) Modeling annotator perspective and polarized opinions to improve hate speech detection. In: Proceedings of the AAAI conference on human computation and crowdsourcing, vol 8, no 1, pp 151\u2013154. https:\/\/doi.org\/10.1609\/hcomp.v8i1.7473","DOI":"10.1609\/hcomp.v8i1.7473"},{"key":"9369_CR2","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1007\/978-94-024-0881-2_11","volume-title":"Inter-annotator agreement","author":"R Artstein","year":"2017","unstructured":"Artstein R (2017) Inter-annotator agreement. Springer, Dordrecht, pp 297\u2013313. https:\/\/doi.org\/10.1007\/978-94-024-0881-2_11"},{"issue":"4","key":"9369_CR3","doi-asserted-by":"publisher","first-page":"555","DOI":"10.1162\/coli.07-034-R2","volume":"34","author":"R Artstein","year":"2008","unstructured":"Artstein R, Poesio M (2008) Inter-coder agreement for computational linguistics. Comput Linguist 34(4):555\u2013596. 
https:\/\/doi.org\/10.1162\/coli.07-034-R2","journal-title":"Comput Linguist"},{"key":"9369_CR4","unstructured":"Basile V, Cabitza F, Campagner A et\u00a0al. (2021) Toward a perspectivist turn in ground truthing for predictive computing. CoRR arxiv:2109.04270"},{"key":"9369_CR5","doi-asserted-by":"crossref","unstructured":"Beigman\u00a0Klebanov B, Beigman E, Diermeier D (2008) Analyzing disagreements. In: Coling 2008: proceedings of the workshop on human judgements in computational linguistics. Coling 2008 Organizing Committee, Manchester, UK, pp 2\u20137. https:\/\/aclanthology.org\/W08-1202","DOI":"10.3115\/1611628.1611630"},{"key":"9369_CR6","doi-asserted-by":"publisher","unstructured":"Borchmann \u0141, Wisniewski D, Gretkowski A et\u00a0al. (2020) Contract discovery: Dataset and a few-shot semantic retrieval challenge with competitive baselines. In: Findings of the association for computational linguistics: EMNLP 2020. Association for Computational Linguistics, Online, pp 4254\u20134268. https:\/\/doi.org\/10.18653\/v1\/2020.findings-emnlp.380","DOI":"10.18653\/v1\/2020.findings-emnlp.380"},{"key":"9369_CR7","doi-asserted-by":"publisher","unstructured":"Braun D, Matthes F (2021) NLP for consumer protection: battling illegal clauses in German terms and conditions in online shopping. In: Proceedings of the 1st workshop on NLP for positive impact. Association for Computational Linguistics, Online, pp 93\u201399. https:\/\/doi.org\/10.18653\/v1\/2021.nlp4posimpact-1.10","DOI":"10.18653\/v1\/2021.nlp4posimpact-1.10"},{"key":"9369_CR8","doi-asserted-by":"publisher","unstructured":"Braun D, Matthes F (2022) Clause topic classification in German and English standard form contracts. In: Proceedings of the fifth workshop on e-commerce and NLP (ECNLP 5). Association for Computational Linguistics, Dublin, Ireland, pp 199\u2013209. 
https:\/\/doi.org\/10.18653\/v1\/2022.ecnlp-1.23","DOI":"10.18653\/v1\/2022.ecnlp-1.23"},{"key":"9369_CR9","doi-asserted-by":"publisher","first-page":"771","DOI":"10.1016\/j.ins.2020.09.049","volume":"545","author":"A Campagner","year":"2021","unstructured":"Campagner A, Ciucci D, Svensson CM et\u00a0al. (2021) Ground truthing from multi-rater labeling with three-way decision and possibility theory. Inf Sci 545:771\u2013790. https:\/\/doi.org\/10.1016\/j.ins.2020.09.049","journal-title":"Inf Sci"},{"key":"9369_CR10","doi-asserted-by":"publisher","unstructured":"Chalkidis I, Androutsopoulos I, Michos A (2017) Extracting contract elements. In: Proceedings of the 16th edition of the international conference on artificial intelligence and law. Association for Computing Machinery, New York, NY, USA, ICAIL \u201917, pp 19\u201328. https:\/\/doi.org\/10.1145\/3086512.3086515","DOI":"10.1145\/3086512.3086515"},{"key":"9369_CR11","doi-asserted-by":"publisher","unstructured":"Chalkidis I, Jana A, Hartung D et\u00a0al. (2022) LexGLUE: a benchmark dataset for legal language understanding in English. In: Proceedings of the 60th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Dublin, Ireland, pp 4310\u20134330. https:\/\/doi.org\/10.18653\/v1\/2022.acl-long.297","DOI":"10.18653\/v1\/2022.acl-long.297"},{"key":"9369_CR12","doi-asserted-by":"publisher","unstructured":"Chan B, Schweter S, M\u00f6ller T (2020) German\u2019s next language model. In: Proceedings of the 28th international conference on computational linguistics. International Committee on Computational Linguistics, Barcelona, Spain (Online), pp 6788\u20136796. 
https:\/\/doi.org\/10.18653\/v1\/2020.coling-main.598","DOI":"10.18653\/v1\/2020.coling-main.598"},{"issue":"1","key":"9369_CR13","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1016\/j.csi.2011.06.002","volume":"34","author":"M Chinosi","year":"2012","unstructured":"Chinosi M, Trombetta A (2012) BPMN: an introduction to the standard. Comput Stand Interfaces 34(1):124\u2013134. https:\/\/doi.org\/10.1016\/j.csi.2011.06.002","journal-title":"Comput Stand Interfaces"},{"issue":"4","key":"9369_CR14","doi-asserted-by":"publisher","first-page":"213","DOI":"10.1037\/h0026256","volume":"70","author":"J Cohen","year":"1968","unstructured":"Cohen J (1968) Weighted kappa: nominal scale agreement provision for scaled disagreement or partial credit. Psychol Bull 70(4):213","journal-title":"Psychol Bull"},{"key":"9369_CR15","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1162\/tacl_a_00449","volume":"10","author":"AM Davani","year":"2022","unstructured":"Davani AM, D\u00edaz M, Prabhakaran V (2022) Dealing with disagreements: looking beyond the majority vote in subjective annotations. Trans Assoc Comput Linguist 10:92\u2013110. https:\/\/doi.org\/10.1162\/tacl_a_00449","journal-title":"Trans Assoc Comput Linguist"},{"key":"9369_CR16","doi-asserted-by":"publisher","unstructured":"Drawzeski K, Galassi A, Jablonowska A et\u00a0al. (2021) A corpus for multilingual analysis of online terms of service. In: Proceedings of the natural legal language processing workshop 2021. Association for Computational Linguistics, Punta Cana, Dominican Republic, pp 1\u20138. https:\/\/doi.org\/10.18653\/v1\/2021.nllp-1.1","DOI":"10.18653\/v1\/2021.nllp-1.1"},{"key":"9369_CR17","doi-asserted-by":"publisher","first-page":"439","DOI":"10.1007\/978-3-030-32381-3_36","volume-title":"Chinese computational linguistics","author":"X Duan","year":"2019","unstructured":"Duan X, Wang B, Wang Z et\u00a0al. 
(2019) CJRC: a reliable human-annotated benchmark dataset for Chinese judicial reading comprehension. In: Sun M, Huang X, Ji H et\u00a0al. (eds) Chinese computational linguistics. Springer, Cham, pp 439\u2013451"},{"issue":"5","key":"9369_CR18","doi-asserted-by":"publisher","first-page":"378","DOI":"10.1037\/h0031619","volume":"76","author":"JL Fleiss","year":"1971","unstructured":"Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76(5):378","journal-title":"Psychol Bull"},{"issue":"12","key":"9369_CR19","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1145\/3458723","volume":"64","author":"T Gebru","year":"2021","unstructured":"Gebru T, Morgenstern J, Vecchione B et\u00a0al. (2021) Datasheets for datasets. Commun ACM 64(12):86\u201392. https:\/\/doi.org\/10.1145\/3458723","journal-title":"Commun ACM"},{"key":"9369_CR20","unstructured":"Glaser I, Scepankova E, Matthes F (2018) Classifying semantic types of legal sentences: portability of machine learning models. In: Legal knowledge and information systems. IOS Press, pp 61\u201370"},{"key":"9369_CR21","doi-asserted-by":"publisher","unstructured":"Gonzalez D, Zimmermann T, Nagappan N (2020) The state of the ML-universe: 10 years of artificial intelligence & machine learning software development on GitHub. In: Proceedings of the 17th international conference on mining software repositories. Association for Computing Machinery, New York, NY, USA, MSR \u201920, pp 431\u2013442. https:\/\/doi.org\/10.1145\/3379597.3387473","DOI":"10.1145\/3379597.3387473"},{"key":"9369_CR22","unstructured":"Grover C, Hachey B, Hughson I (2004) The HOLJ corpus. Supporting summarisation of legal texts. In: Proceedings of the 5th international workshop on linguistically interpreted Corpora. COLING, Geneva, Switzerland, pp 47\u201354. https:\/\/aclanthology.org\/W04-1907"},{"key":"9369_CR23","unstructured":"Guha N (2021) Datasets for machine learning in law. Tech. 
rep., Stanford University, https:\/\/github.com\/neelguha\/legal-ml-datasets"},{"key":"9369_CR24","doi-asserted-by":"publisher","unstructured":"Habernal I, Faber D, Recchia N et\u00a0al. (2022) Mining legal arguments in court decisions. arXiv preprint https:\/\/doi.org\/10.48550\/arXiv.2208.06178","DOI":"10.48550\/arXiv.2208.06178"},{"key":"9369_CR25","unstructured":"Hendrycks D, Burns C, Chen A et\u00a0al. (2021) CUAD: an expert-annotated NLP dataset for legal contract review. CoRR arxiv:2103.06268"},{"key":"9369_CR26","doi-asserted-by":"crossref","unstructured":"Holland S, Hosny A, Newman S et\u00a0al. (2020) The dataset nutrition label. Data protection and privacy, volume 12: data protection and democracy 12:1","DOI":"10.5040\/9781509932771.ch-001"},{"key":"9369_CR27","doi-asserted-by":"publisher","unstructured":"Jamison E, Gurevych I (2015) Noise or additional information? leveraging crowdsource annotation item agreement for natural language tasks. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, Portugal, pp 291\u2013297. https:\/\/doi.org\/10.18653\/v1\/D15-1035","DOI":"10.18653\/v1\/D15-1035"},{"key":"9369_CR28","unstructured":"Kalamkar P, Tiwari A, Agarwal A et\u00a0al. (2022) Corpus for automatic structuring of legal documents. CoRR arxiv:2201.13125"},{"key":"9369_CR29","unstructured":"Keymanesh M, Elsner M, Parthasarathy S (2020) Toward domain-guided controllable summarization of privacy policies. In: NLLP@KDD, pp 18\u201324"},{"key":"9369_CR30","doi-asserted-by":"publisher","unstructured":"Klemen M, Robnik-\u0160ikonja M (2022) ULFRI at SemEval-2022 task 4: leveraging uncertainty and additional knowledge for patronizing and condescending language detection. In: Proceedings of the 16th international workshop on semantic evaluation (SemEval-2022). Association for Computational Linguistics, Seattle, United States, pp 525\u2013532. 
https:\/\/doi.org\/10.18653\/v1\/2022.semeval-1.73","DOI":"10.18653\/v1\/2022.semeval-1.73"},{"key":"9369_CR31","doi-asserted-by":"publisher","first-page":"681","DOI":"10.1007\/978-3-031-08974-9_54","volume-title":"Information processing and management of uncertainty in knowledge-based systems","author":"P Kralj Novak","year":"2022","unstructured":"Kralj Novak P, Scantamburlo T, Pelicon A et\u00a0al. (2022) Handling disagreement in hate speech modelling. In: Ciucci D, Couso I, Medina J et\u00a0al. (eds) Information processing and management of uncertainty in knowledge-based systems. Springer, Cham, pp 681\u2013695"},{"key":"9369_CR32","volume-title":"Content analysis: an introduction to its methodology","author":"K Krippendorff","year":"2018","unstructured":"Krippendorff K (2018) Content analysis: an introduction to its methodology, 4th edn. Sage Publications, Thousand Oaks","edition":"4"},{"key":"9369_CR33","doi-asserted-by":"publisher","first-page":"159","DOI":"10.2307\/2529310","volume":"33","author":"JR Landis","year":"1977","unstructured":"Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159\u2013174","journal-title":"Biometrics"},{"key":"9369_CR34","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1016\/j.esp.2016.10.001","volume":"45","author":"S Li","year":"2017","unstructured":"Li S (2017) A corpus-based study of vague language in legislative texts: strategic use of vague terms. Engl Specif Purp 45:98\u2013109. https:\/\/doi.org\/10.1016\/j.esp.2016.10.001","journal-title":"Engl Specif Purp"},{"issue":"2","key":"9369_CR35","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1007\/s10506-019-09243-2","volume":"27","author":"M Lippi","year":"2019","unstructured":"Lippi M, Pa\u0142ka P, Contissa G et\u00a0al. (2019) Claudette: an automated detector of potentially unfair clauses in online terms of service. 
Artif Intell Law 27(2):117\u2013139","journal-title":"Artif Intell Law"},{"key":"9369_CR36","doi-asserted-by":"publisher","unstructured":"Locke D, Zuccon G (2018) A test collection for evaluating legal case law search. In: The 41st international ACM SIGIR conference on research & development in information retrieval. Association for Computing Machinery, New York, NY, USA, SIGIR \u201918, pp 1261\u20131264. https:\/\/doi.org\/10.1145\/3209978.3210161","DOI":"10.1145\/3209978.3210161"},{"key":"9369_CR37","doi-asserted-by":"publisher","unstructured":"Louis A, Spanakis G (2022) A statutory article retrieval dataset in French. In: Proceedings of the 60th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Dublin, Ireland, pp 6789\u20136803. https:\/\/doi.org\/10.18653\/v1\/2022.acl-long.468","DOI":"10.18653\/v1\/2022.acl-long.468"},{"key":"9369_CR38","unstructured":"L\u00fcbbe-Wolff G (2022) Beratungskulturen: Wie verfassungsgerichte arbeiten, und wovon es abh\u00e4ngt, ob sie integrieren oder polarisieren. Tech. rep, Konrad-Adenauer-Stiftung"},{"key":"9369_CR39","doi-asserted-by":"publisher","unstructured":"Manor L, Li JJ (2019) Plain English summarization of contracts. In: Proceedings of the natural legal language processing workshop 2019. Association for Computational Linguistics, Minneapolis, Minnesota, pp 1\u201311. https:\/\/doi.org\/10.18653\/v1\/W19-2201, https:\/\/aclanthology.org\/W19-2201","DOI":"10.18653\/v1\/W19-2201"},{"key":"9369_CR40","doi-asserted-by":"publisher","unstructured":"Ostendorff M, Blume T, Ostendorff S (2020) Towards an open platform for legal information. In: Proceedings of the ACM\/IEEE joint conference on digital libraries in 2020. Association for Computing Machinery, New York, NY, USA, JCDL \u201920, pp 385\u2013388. 
https:\/\/doi.org\/10.1145\/3383583.3398616","DOI":"10.1145\/3383583.3398616"},{"key":"9369_CR41","unstructured":"Ovesdotter\u00a0Alm C (2011) Subjective natural language problems: motivations, applications, characterizations, and implications. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Portland, Oregon, USA, pp 107\u2013112. https:\/\/aclanthology.org\/P11-2019"},{"key":"9369_CR42","unstructured":"Poudyal P, Savelka J, Ieven A et\u00a0al. (2020) ECHR: legal corpus for argument mining. In: Proceedings of the 7th workshop on argument mining. Association for Computational Linguistics, Online, pp 67\u201375. https:\/\/aclanthology.org\/2020.argmining-1.8"},{"key":"9369_CR43","doi-asserted-by":"publisher","unstructured":"Prabhakaran V, Mostafazadeh\u00a0Davani A, Diaz M (2021) On releasing annotator-level labels and information in datasets. In: Proceedings of the Joint 15th linguistic annotation workshop (LAW) and 3rd designing meaning representations (DMR) workshop. Association for Computational Linguistics, Punta Cana, Dominican Republic, pp 133\u2013138. https:\/\/doi.org\/10.18653\/v1\/2021.law-1.14","DOI":"10.18653\/v1\/2021.law-1.14"},{"key":"9369_CR44","doi-asserted-by":"publisher","unstructured":"Ramponi A, Leonardelli E (2022) DH-FBK at SemEval-2022 task 4: Leveraging annotators\u2019 disagreement and multiple data views for patronizing language detection. In: Proceedings of the 16th international workshop on semantic evaluation (SemEval-2022). Association for Computational Linguistics, Seattle, United States, pp 324\u2013334. https:\/\/doi.org\/10.18653\/v1\/2022.semeval-1.42","DOI":"10.18653\/v1\/2022.semeval-1.42"},{"key":"9369_CR45","doi-asserted-by":"publisher","unstructured":"Roegiest A, Hudek AK, McNulty A (2018) A dataset and an examination of identifying passages for due diligence. 
In: The 41st international ACM SIGIR conference on research & development in information retrieval. Association for Computing Machinery, New York, NY, USA, SIGIR \u201918, pp 465\u2013474. https:\/\/doi.org\/10.1145\/3209978.3210015","DOI":"10.1145\/3209978.3210015"},{"key":"9369_CR46","doi-asserted-by":"publisher","unstructured":"Rottger P, Vidgen B, Hovy D et\u00a0al. (2022) Two contrasting data annotation paradigms for subjective NLP tasks. In: Proceedings of the 2022 conference of the North American chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Seattle, United States, pp 175\u2013190. https:\/\/doi.org\/10.18653\/v1\/2022.naacl-main.13","DOI":"10.18653\/v1\/2022.naacl-main.13"},{"key":"9369_CR47","unstructured":"Sachdeva P, Barreto R, Bacon G et\u00a0al. (2022) The measuring hate speech corpus: leveraging Rasch measurement theory for data perspectivism. In: Proceedings of the 1st workshop on perspectivist approaches to NLP @LREC2022. European Language Resources Association, Marseille, France, pp 83\u201394. https:\/\/aclanthology.org\/2022.nlperspectives-1.11"},{"issue":"111","key":"9369_CR48","doi-asserted-by":"publisher","first-page":"343","DOI":"10.1016\/j.jss.2022.111343","volume":"190","author":"C Sas","year":"2022","unstructured":"Sas C, Capiluppi A (2022) Antipatterns in software classification taxonomies. J Syst Softw 190(111):343. https:\/\/doi.org\/10.1016\/j.jss.2022.111343","journal-title":"J Syst Softw"},{"key":"9369_CR49","unstructured":"\u0160avelka J, Ashley KD (2018) Segmenting us court decisions into functional and issue specific parts. In: Legal knowledge and information systems. IOS Press, pp 111\u2013120"},{"key":"9369_CR50","doi-asserted-by":"publisher","unstructured":"Savelka J, Xu H, Ashley KD (2019) Improving sentence retrieval from case law for statutory interpretation. 
In: Proceedings of the seventeenth international conference on artificial intelligence and law. Association for Computing Machinery, New York, NY, USA, ICAIL \u201919, pp 113\u2013122. https:\/\/doi.org\/10.1145\/3322640.3326736","DOI":"10.1145\/3322640.3326736"},{"key":"9369_CR51","unstructured":"Schwarzer M (2022) awesome-legal-data. Tech. rep., Open Justice e.V., https:\/\/github.com\/openlegaldata\/awesome-legal-data"},{"key":"9369_CR52","unstructured":"Steinberger R, Pouliquen B, Widiger A et\u00a0al. (2006) The JRC-Acquis: a multilingual aligned parallel corpus with 20+ languages. In: Proceedings of the fifth international conference on language resources and evaluation (LREC\u201906). European Language Resources Association (ELRA), Genoa, Italy. http:\/\/www.lrec-conf.org\/proceedings\/lrec2006\/pdf\/340_pdf.pdf"},{"key":"9369_CR53","first-page":"665","volume-title":"Medical image computing and computer assisted intervention\u2014MICCAI 2019","author":"CH Sudre","year":"2019","unstructured":"Sudre CH, Anson BG, Ingala S et\u00a0al. (2019) Let\u2019s agree to disagree: learning highly debatable multirater labelling. In: Shen D, Liu T, Peters TM et\u00a0al. (eds) Medical image computing and computer assisted intervention\u2014MICCAI 2019. Springer, Cham, pp 665\u2013673"},{"key":"9369_CR54","unstructured":"Tiwari A, Kalamkar P, Agarwal A et\u00a0al. (2022) Must-read papers on legal intelligence. Tech. rep., OpenNyAI. https:\/\/github.com\/Legal-NLP-EkStep\/rhetorical-role-baseline"},{"key":"9369_CR55","unstructured":"Tuggener D, von D\u00e4niken P, Peetz T et\u00a0al. (2020) LEDGAR: a large-scale multi-label corpus for text classification of legal provisions in contracts. In: Proceedings of the twelfth language resources and evaluation conference. European Language Resources Association, Marseille, France, pp 1235\u20131241. 
https:\/\/aclanthology.org\/2020.lrec-1.155"},{"key":"9369_CR56","doi-asserted-by":"publisher","unstructured":"Urchs S, Mitrovi\u0107 J, Granitzer M (2021) Design and implementation of German legal decision corpora. In: Proceedings of the 13th international conference on agents and artificial intelligence\u2014volume 2: ICAART, INSTICC. SciTePress, pp 515\u2013521. https:\/\/doi.org\/10.5220\/0010187305150521","DOI":"10.5220\/0010187305150521"},{"key":"9369_CR57","unstructured":"Walker VR, Strong SR, Walker VE (2020) Automating the classification of finding sentences for linguistic polarity. In: Proceedings of the fourth workshop on automated semantic analysis of information in legal text"},{"key":"9369_CR58","unstructured":"Waltl B (2022) Legal text analytics. Tech. rep., Liquid Legal Institute e.V. https:\/\/github.com\/Liquid-Legal-Institute\/Legal-Text-Analytics"},{"key":"9369_CR59","doi-asserted-by":"publisher","unstructured":"Wilson S, Schaub F, Dara AA et\u00a0al. (2016) The creation and analysis of a website privacy policy corpus. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, Germany, pp 1330\u20131340. https:\/\/doi.org\/10.18653\/v1\/P16-1126, https:\/\/aclanthology.org\/P16-1126","DOI":"10.18653\/v1\/P16-1126"},{"key":"9369_CR60","doi-asserted-by":"publisher","DOI":"10.7717\/peerj-cs.134","volume":"3","author":"Y Wu","year":"2017","unstructured":"Wu Y, Wang N, Kropczynski J et\u00a0al. (2017) The appropriation of GitHub for curation. PeerJ Comput Sci 3:e134","journal-title":"PeerJ Comput Sci"},{"key":"9369_CR61","unstructured":"Wyner A, Peters W, Katz D (2013) A case study on legal case annotation. In: Legal knowledge and information systems. IOS Press, pp 165\u2013174"},{"key":"9369_CR62","unstructured":"Xiao C, Zhong H, Guo Z et\u00a0al. (2019) CAIL2019-SCM: a dataset of similar case matching in legal domain. 
CoRR arxiv:1911.08962"},{"key":"9369_CR63","unstructured":"Xiao C, Zhong H, Sun Y (2021) Must-read papers on legal intelligence. Tech. rep., Tsinghua University. https:\/\/github.com\/thunlp\/LegalPapers"},{"key":"9369_CR64","doi-asserted-by":"publisher","unstructured":"Zahidi Y, El\u00a0Younoussi Y, Azroumahli C (2019) Comparative study of the most useful Arabic-supporting natural language processing and deep learning libraries. In: 2019 5th international conference on optimization and applications (ICOA), pp 1\u201310. https:\/\/doi.org\/10.1109\/ICOA.2019.8727617","DOI":"10.1109\/ICOA.2019.8727617"},{"key":"9369_CR65","doi-asserted-by":"publisher","unstructured":"Zhong H, Xiao C, Tu C et\u00a0al. (2020) JEC-QA: a legal-domain question answering dataset. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, no. 05, pp 9701\u20139708. https:\/\/doi.org\/10.1609\/aaai.v34i05.6519","DOI":"10.1609\/aaai.v34i05.6519"},{"key":"9369_CR66","first-page":"66","volume":"2019","author":"S Zimmeck","year":"2019","unstructured":"Zimmeck S, Story P, Smullen D et\u00a0al. (2019) Maps: scaling privacy compliance analysis to a million apps. 
Proc Priv Enhanc Technol 2019:66","journal-title":"Proc Priv Enhanc Technol"}],"container-title":["Artificial Intelligence and Law"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10506-023-09369-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10506-023-09369-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10506-023-09369-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,31]],"date-time":"2024-07-31T18:49:00Z","timestamp":1722451740000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10506-023-09369-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,27]]},"references-count":66,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,9]]}},"alternative-id":["9369"],"URL":"https:\/\/doi.org\/10.1007\/s10506-023-09369-4","relation":{},"ISSN":["0924-8463","1572-8382"],"issn-type":[{"value":"0924-8463","type":"print"},{"value":"1572-8382","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,27]]},"assertion":[{"value":"12 June 2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 June 2023","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}