{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T07:07:28Z","timestamp":1780470448197,"version":"3.54.1"},"reference-count":88,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2022,3,23]],"date-time":"2022-03-23T00:00:00Z","timestamp":1647993600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,3,23]],"date-time":"2022-03-23T00:00:00Z","timestamp":1647993600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Lang Resources & Evaluation"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Identifying words which may cause difficulty for a reader is an essential step in most lexical text simplification systems prior to lexical substitution and can also be used for assessing the readability of a text. This task is commonly referred to as complex word identification (CWI) and is often modelled as a supervised classification problem. For training such systems, annotated datasets in which words and sometimes multi-word expressions are labelled regarding complexity are required. In this paper we analyze previous work carried out in this task and investigate the properties of CWI datasets for English. We develop a protocol for the annotation of lexical complexity and use this to annotate a new dataset, CompLex 2.0. We present experiments using both new and old datasets to investigate the nature of lexical complexity. We found that a Likert-scale annotation protocol provides an objective setting that is superior for identifying the complexity of words compared to a binary annotation protocol. 
We release a new dataset using our new protocol to promote the task of Lexical Complexity Prediction.<\/jats:p>","DOI":"10.1007\/s10579-022-09588-2","type":"journal-article","created":{"date-parts":[[2022,3,23]],"date-time":"2022-03-23T05:02:30Z","timestamp":1648011750000},"page":"1153-1194","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Predicting lexical complexity in English texts: the Complex 2.0 dataset"],"prefix":"10.1007","volume":"56","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1129-2750","authenticated-orcid":false,"given":"Matthew","family":"Shardlow","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1220-8605","authenticated-orcid":false,"given":"Richard","family":"Evans","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2346-3847","authenticated-orcid":false,"given":"Marcos","family":"Zampieri","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2022,3,23]]},"reference":[{"key":"9588_CR1","doi-asserted-by":"crossref","unstructured":"AbuRa\u2019ed, A., Saggion, H. (2018). LaSTUS\/TALN at Complex Word Identification (CWI) 2018 Shared Task. In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, New Orleans, United States.","DOI":"10.18653\/v1\/W18-0517"},{"key":"9588_CR2","doi-asserted-by":"crossref","unstructured":"Alfter, D., Pil\u00e1n, I. (2018). SB@GU at the Complex Word Identification 2018 Shared Task. In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications. 
Association for Computational Linguistics, New Orleans, United States.","DOI":"10.18653\/v1\/W18-0537"},{"key":"9588_CR3","doi-asserted-by":"crossref","unstructured":"Aroyehun, S. T., Angel, J., Alvarez, D. A. P., Gelbukh, A. (2018). Complex word identification: convolutional neural network vs. feature engineering . In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, New Orleans, United States.","DOI":"10.18653\/v1\/W18-0538"},{"issue":"1","key":"9588_CR4","doi-asserted-by":"publisher","first-page":"161","DOI":"10.1186\/1471-2105-13-161","volume":"13","author":"M Bada","year":"2012","unstructured":"Bada, M., Eckert, M., Evans, D., Garcia, K., Shipley, K., Sitnikov, D., Baumgartner, W. A., Cohen, K. B., Verspoor, K., Blake, J. A., et al. (2012). Concept annotation in the craft corpus. BMC Bioinformatics, 13(1), 161.","journal-title":"BMC Bioinformatics"},{"key":"9588_CR5","doi-asserted-by":"crossref","unstructured":"Bingel, J., Schluter, N., Mart\u00ednez\u00a0Alonso, H. (2016). CoastalCPH at SemEval-2016 Task 11: The importance of designing your Neural Networks right. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1028\u20131033. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1160"},{"key":"9588_CR6","unstructured":"Biran, O., Brody, S., Elhadad, N. (2011). Putting it simply: A context-aware approach to lexical simplification. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: shortpapers (ACL-2011), pp. 496\u2013501. Portland, Oregon."},{"key":"9588_CR7","unstructured":"Bott, S., Rello, L., Drndarevic, B., Saggion, H. (2012). Can Spanish be simpler? LexSiS: Lexical Simplification for Spanish. In Proceedings of the 12th International Conference on Intelligent Text Processing and Computational Linguistics, Lecture Notes in Computer Science (pp. 
8\u201315). Springer, Samos, Greece."},{"key":"9588_CR8","unstructured":"Brants, T., Franz, A. (2006). The google web 1t 5-gram corpus version 1.1. LDC2006T13."},{"key":"9588_CR9","doi-asserted-by":"crossref","unstructured":"Brooke, J., Uitdenbogerd, A., Baldwin, T. (2016). Melbourne at SemEval 2016 Task 11: Classifying Type-level Word Complexity using Random Forests with Corpus and Word List Features. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 975\u2013981. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1150"},{"key":"9588_CR10","doi-asserted-by":"crossref","unstructured":"Brysbaert, M., Buchmeier, M., Conrad, M., Jacobs, A.M., B\u00f6lte, J., B\u00f6hl, A. (2011). The word frequency effect. Experimental Psychology.","DOI":"10.1027\/1618-3169\/a000123"},{"key":"9588_CR11","doi-asserted-by":"crossref","unstructured":"Butnaru, A., Ionescu, R. T. (2018). UnibucKernel: A kernel-based learning method for complex word identification . In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, New Orleans, United States.","DOI":"10.18653\/v1\/W18-0519"},{"key":"9588_CR12","doi-asserted-by":"crossref","unstructured":"Choubey, P., Pateria, S. (2016). Garuda & Bhasha at SemEval-2016 Task 11: Complex Word Identification Using Aggregated Learning Models. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1006\u20131010. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1156"},{"issue":"2","key":"9588_CR13","doi-asserted-by":"publisher","first-page":"375","DOI":"10.1007\/s10579-014-9287-y","volume":"49","author":"C Christodouloupoulos","year":"2015","unstructured":"Christodouloupoulos, C., & Steedman, M. (2015). A massively parallel corpus: The bible in 100 languages. 
Language Resources and Evaluation, 49(2), 375\u2013395. https:\/\/doi.org\/10.1007\/s10579-014-9287-y","journal-title":"Language Resources and Evaluation"},{"issue":"6","key":"9588_CR14","first-page":"1084","volume":"16","author":"C Connine","year":"1990","unstructured":"Connine, C., Mullennix, J., Shernoff, E., & Yelen, J. (1990). Word familiarity and frequency in visual and auditory word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16(6), 1084\u20131096.","journal-title":"Journal of Experimental Psychology: Learning, Memory, and Cognition"},{"key":"9588_CR15","doi-asserted-by":"crossref","unstructured":"Davoodi, E., Kosseim, L. (2016). CLaC at SemEval-2016 Task 11: Exploring linguistic and psycho-linguistic Features for Complex Word Identification. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 982\u2013985. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1151"},{"key":"9588_CR16","doi-asserted-by":"crossref","unstructured":"De\u00a0Hertog, D., Tack, A. (2018). Deep Learning Architecture for Complex Word Identification . In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, New Orleans, United States.","DOI":"10.18653\/v1\/W18-0539"},{"key":"9588_CR17","doi-asserted-by":"publisher","unstructured":"Deutsch, T., Jasbi, M., Shieber, S. (2020). Linguistic features for readability assessment. In Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 1\u201317. Association for Computational Linguistics, Seattle, WA, USA Online. https:\/\/doi.org\/10.18653\/v1\/2020.bea-1.1","DOI":"10.18653\/v1\/2020.bea-1.1"},{"key":"9588_CR18","unstructured":"Devlin, S., Tait, J. (1998). The use of a psycholinguistic database in the simplification of text for aphasic readers. Linguistic Databases pp. 
161\u2013173."},{"key":"9588_CR19","doi-asserted-by":"crossref","unstructured":"Dolby, J. L., Resnikoff, H. L., MacMurray, F.L. (1963). A tape dictionary for linguistic experiments. In Proceedings of the American Federation of information processing societies: Fall Joint Computer Conference, pp. 419 \u2013 423. Spartan Books, Baltimore, MD.","DOI":"10.1145\/1463822.1463865"},{"key":"9588_CR20","doi-asserted-by":"publisher","first-page":"316","DOI":"10.1016\/0749-596X(90)90003-I","volume":"29","author":"E Dupoux","year":"1990","unstructured":"Dupoux, E., & Mehler, J. (1990). Monitoring the lexicon with normal and compressed speech: Frequency effects and the prelexical code. Journal of Memory & Language, 29, 316\u2013335.","journal-title":"Journal of Memory & Language"},{"key":"9588_CR21","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1007\/978-90-481-8847-5_10","volume-title":"Theory and applications of ontology: Computer applications","author":"C Fellbaum","year":"2010","unstructured":"Fellbaum, C. (2010). Wordnet. In R. Poli, M. Healy, & A. Kameas (Eds.), Theory and applications of ontology: Computer applications (pp. 231\u2013243). Amsterdam: Springer."},{"key":"9588_CR22","doi-asserted-by":"crossref","unstructured":"Finnimore, P., Fritzsch, E., King, D., Sneyd, A., Ur\u00a0Rehman, A., Alva-Manchego, F., Vlachos, A. (2019). Strong baselines for complex word identification across multiple languages. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 970\u2013977. Association for Computational Linguistics, Minneapolis, Minnesota.","DOI":"10.18653\/v1\/N19-1102"},{"issue":"4","key":"9588_CR23","first-page":"396","volume":"12","author":"K Gilhooly","year":"1980","unstructured":"Gilhooly, K., & Logie, R. (1980). Age-of-acquisition, imagery, concreteness, familiarity, and ambiguity measures for 1,944 words. 
Behavior Research Methods, 12(4), 396\u2013427.","journal-title":"Behavior Research Methods"},{"key":"9588_CR24","unstructured":"Gillin, N. (2016). Sensible at SemEval-2016 Task 11: Neural Nonsense mangled in ensemble mess. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 963\u2013968. Association for Computational Linguistics, San Diego, California."},{"key":"9588_CR25","doi-asserted-by":"crossref","unstructured":"Gooding, S., & Kochmar, E. (2018). CAMB at CWI Shared Task 2018: Complex Word Identification with Ensemble-Based Voting . In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, New Orleans, United States.","DOI":"10.18653\/v1\/W18-0520"},{"key":"9588_CR26","doi-asserted-by":"crossref","unstructured":"Gooding, S., Kochmar, E. (2019). Complex word identification as a sequence labelling task. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1148\u20131153. Association for Computational Linguistics, Florence, Italy.","DOI":"10.18653\/v1\/P19-1109"},{"key":"9588_CR27","doi-asserted-by":"crossref","unstructured":"Gooding, S., Kochmar, E., Sarkar, A., Blackwell, A. (2019). Comparative judgments are more consistent than binary classification for labelling word complexity. In Proceedings of the 13th Linguistic Annotation Workshop, pp. 208\u2013214.","DOI":"10.18653\/v1\/W19-4024"},{"key":"9588_CR28","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1145\/1656274.1656278","volume":"11","author":"M Hall","year":"2009","unstructured":"Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., & Witten, I. H. (2009). The weka data mining software: An update. SIGKDD Explorations Newsletter, 11, 10\u201318. 
https:\/\/doi.org\/10.1145\/1656274.1656278","journal-title":"SIGKDD Explorations Newsletter"},{"key":"9588_CR29","doi-asserted-by":"crossref","unstructured":"Hartmann, N., & dos Santos, L. B. (2018). NILC at CWI 2018: Exploring Feature Engineering and Feature Learning . In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, New Orleans, United States.","DOI":"10.18653\/v1\/W18-0540"},{"key":"9588_CR30","doi-asserted-by":"crossref","unstructured":"Horn, C., Manduca, C., & Kauchak, D. (2014). Learning a lexical simplifier using Wikipedia. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers, pp. 458\u2013463). Association for Computational Linguistics, Baltimore, Maryland.","DOI":"10.3115\/v1\/P14-2075"},{"key":"9588_CR31","doi-asserted-by":"crossref","unstructured":"Kajiwara, T., & Komachi, M. (2018). Complex word identification based on frequency in a learner corpus . In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, New Orleans, United States.","DOI":"10.18653\/v1\/W18-0521"},{"key":"9588_CR32","unstructured":"Kauchak, D. (2013). Improving text simplification language modeling using unsimplified text data. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers, pp. 1537\u20131546). Association for Computational Linguistics, Sofia, Bulgaria."},{"key":"9588_CR33","doi-asserted-by":"crossref","unstructured":"Kauchak, D. (2016). Pomona at SemEval-2016 Task 11: Predicting Word Complexity Based on Corpus Frequency. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1047\u20131051. 
Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1164"},{"key":"9588_CR34","doi-asserted-by":"publisher","first-page":"563","DOI":"10.3758\/BF03197079","volume":"17","author":"S Kinoshita","year":"1989","unstructured":"Kinoshita, S. (1989). Generation enhances semantic processing? The role of distinctiveness in the generation effect. Memory & Cognition, 17, 563\u2013571.","journal-title":"Memory & Cognition"},{"key":"9588_CR35","unstructured":"Koehn, P. (2005). Europarl: A parallel corpus for statistical machine translation. In Proceedings of MT Summit."},{"key":"9588_CR36","doi-asserted-by":"crossref","unstructured":"Konkol, M. (2016). UWB at SemEval-2016 Task 11: Exploring Features for Complex Word Identification. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1038\u20131041. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1162"},{"issue":"2","key":"9588_CR37","doi-asserted-by":"publisher","first-page":"299","DOI":"10.1016\/S0031-3203(99)00223-X","volume":"34","author":"LI Kuncheva","year":"2001","unstructured":"Kuncheva, L. I., Bezdek, J. C., & Duin, R. P. (2001). Decision templates for multiple classifier fusion: An experimental comparison. Pattern Recognition, 34(2), 299\u2013314.","journal-title":"Pattern Recognition"},{"key":"9588_CR38","doi-asserted-by":"crossref","unstructured":"Kuru, O. (2016). AI-KU at SemEval-2016 Task 11: Word Embeddings and Substring Features for Complex Word Identification. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016, pp. 1042\u20131046). Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1163"},{"key":"9588_CR39","volume-title":"Computational analysis of present-day American English","author":"H Ku\u010dera","year":"1967","unstructured":"Ku\u010dera, H., & Francis, W. (1967). 
Computational analysis of present-day American English. Providence, RI: Brown University Press."},{"key":"9588_CR40","unstructured":"Leeds, B.G. (1976). Kindergarten children and the influence of letter shapes and meaningfulness of vocabulary as factors influencing word recognition. Technical Report ED136250, Department of Health, Education & Welfare, National Institute of Education."},{"issue":"8","key":"9588_CR41","doi-asserted-by":"publisher","first-page":"717","DOI":"10.1016\/j.ijmedinf.2013.03.001","volume":"82","author":"G Leroy","year":"2013","unstructured":"Leroy, G., Kauchak, D., & Mouradi, O. (2013). A user-study measuring the effects of lexical simplification and coherence enhancement on perceived and actual text difficulty. International Journal of Medical Informatics, 82(8), 717\u2013730.","journal-title":"International Journal of Medical Informatics"},{"key":"9588_CR42","volume-title":"Contributions to probability and statistics: Essays in Honor of Harold Hotelling","author":"H Levene","year":"1960","unstructured":"Levene, H. (1960). Robust tests for equality of variances. In I. Olkin (Ed.), Contributions to probability and statistics: Essays in Honor of Harold Hotelling. Palo Alto, CA: Stanford University Press."},{"key":"9588_CR43","doi-asserted-by":"publisher","unstructured":"Maddela, M., & Xu, W. (2018). A word-complexity lexicon and a neural readability ranking model for lexical simplification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3749\u20133760. Association for Computational Linguistics, Brussels, Belgium. https:\/\/doi.org\/10.18653\/v1\/D18-1410","DOI":"10.18653\/v1\/D18-1410"},{"key":"9588_CR44","doi-asserted-by":"crossref","unstructured":"Malmasi, S., Dras, M., & Zampieri, M. (2016). LTG at SemEval-2016 Task 11: Complex Word Identification with Classifier Ensembles. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 996\u20131000. 
Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1154"},{"key":"9588_CR45","doi-asserted-by":"crossref","unstructured":"Malmasi, S., & Zampieri, M. (2016). MAZA at SemEval-2016 Task 11: Detecting Lexical Complexity Using a Decision Stump Meta-Classifier. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 991\u2013995. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1153"},{"key":"9588_CR46","doi-asserted-by":"crossref","unstructured":"Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J. R., Bethard, S., & McClosky, D. (2014). The stanford corenlp natural language processing toolkit. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55\u201360.","DOI":"10.3115\/v1\/P14-5010"},{"key":"9588_CR47","volume-title":"Activation, competition, and frequency in lexical acess","author":"W Marslen-Wilson","year":"1990","unstructured":"Marslen-Wilson, W. (1990). Activation, competition, and frequency in lexical acess. Cambridge, MA: MIT Press."},{"key":"9588_CR48","doi-asserted-by":"crossref","unstructured":"Mart\u00ednez\u00a0Mart\u00ednez, J. M., & Tan, L. (2016). USAAR at SemEval-2016 Task 11: Complex Word Identification with Sense Entropy and Sentence Perplexity. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 958\u2013962. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1147"},{"issue":"2","key":"9588_CR49","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1348\/000712600161763","volume":"91 Pt. 2","author":"C Morrison","year":"2000","unstructured":"Morrison, C., & Ellis, A. W. (2000). Real age of acquisition effects in word naming and lexical decision. British Journal of Psychology, 91 Pt. 
2(2), 167\u2013180.","journal-title":"British Journal of Psychology"},{"key":"9588_CR50","doi-asserted-by":"crossref","unstructured":"Mukherjee, N., Patra, B. G., Das, D., & Bandyopadhyay, S. (2016). JU_NLP at SemEval-2016 Task 11: Identifying Complex Words in a Sentence. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 986\u2013990. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1152"},{"key":"9588_CR51","doi-asserted-by":"crossref","unstructured":"Paetzold, G., & Specia, L. (2016). SemEval 2016 Task 11: Complex Word Identification. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 560\u2013569. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1085"},{"key":"9588_CR52","doi-asserted-by":"crossref","unstructured":"Paetzold, G., & Specia, L. (2016). SV000gg at SemEval-2016 Task 11: Heavy Gauge Complex Word Identification with System Voting. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 969\u2013974. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1149"},{"key":"9588_CR53","doi-asserted-by":"crossref","unstructured":"Paetzold, G. H. (2016). Lexical simplification for non-native english speakers. Phd thesis, University of Sheffield.","DOI":"10.1609\/aaai.v30i1.9885"},{"key":"9588_CR54","unstructured":"Paetzold, G. H., Alva-Manchego, F., Specia, L. (2017). MASSAlign: Alignment and Annotation of Comparable Documents. In The Companion Volume of the IJCNLP 2017 Proceedings: System Demonstrations, pp. 1\u20134."},{"key":"9588_CR55","doi-asserted-by":"publisher","first-page":"549","DOI":"10.1613\/jair.5526","volume":"60","author":"GH Paetzold","year":"2017","unstructured":"Paetzold, G. H., & Specia, L. (2017). A survey on lexical simplification. 
Journal of Artificial Intelligence Research, 60, 549\u2013593.","journal-title":"Journal of Artificial Intelligence Research"},{"issue":"3","key":"9588_CR56","doi-asserted-by":"publisher","first-page":"255","DOI":"10.1037\/h0084295","volume":"45","author":"A Paivio","year":"1991","unstructured":"Paivio, A. (1991). Dual coding theory: Retrospect and current status. Canadian Journal of Psychology\/Revue canadienne de psychologie, 45(3), 255\u2013287.","journal-title":"Canadian Journal of Psychology\/Revue canadienne de psychologie"},{"issue":"1, Pt.2","key":"9588_CR57","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1037\/h0025327","volume":"76","author":"A Paivio","year":"1968","unstructured":"Paivio, A., Yuille, J. C., & Madigan, S. A. (1968). Concreteness, imagery, and meaningfulness values for 925 nouns. Journal of Experimental Psychology, 76(1, Pt.2), 1\u201325.","journal-title":"Journal of Experimental Psychology"},{"key":"9588_CR58","doi-asserted-by":"crossref","unstructured":"Palakurthi, A., & Mamidi, R. (2016). IIIT at SemEval-2016 Task 11: Complex Word Identification using Nearest Centroid Classification. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1017\u20131021. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1158"},{"key":"9588_CR59","doi-asserted-by":"publisher","first-page":"492","DOI":"10.5840\/monist190616436","volume":"6","author":"CSS Peirce","year":"1906","unstructured":"Peirce, C. S. S. (1906). Prolegomena to an apology for pragmaticism. The Monist, 6, 492\u2013546.","journal-title":"The Monist"},{"key":"9588_CR60","doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., & Manning, C. D. (2014). Glove: Global vectors for word representation. In Proceedings of EMNLP.","DOI":"10.3115\/v1\/D14-1162"},{"key":"9588_CR61","doi-asserted-by":"crossref","unstructured":"Popovi\u0107, M. (2018). 
Complex Word Identification using Character n-grams. In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, New Orleans, United States.","DOI":"10.18653\/v1\/W18-0541"},{"key":"9588_CR62","doi-asserted-by":"crossref","unstructured":"Quijada, M., & Medero, J. (2016). HMC at SemEval-2016 Task 11: Identifying Complex Words Using Depth-limited Decision Trees. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1034\u20131037. Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1161"},{"key":"9588_CR63","doi-asserted-by":"crossref","unstructured":"Ronzano, F., Abura\u2019ed, A., Espinosa\u00a0Anke, L., & Saggion, H. (2016). TALN at SemEval-2016 Task 11: Modelling Complex Words by Contextual, Lexical and Semantic Features. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016, pp. 1011\u20131016). Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1157"},{"key":"9588_CR64","doi-asserted-by":"crossref","unstructured":"Sag, I. A., Baldwin, T., Bond, F., Copestake, A. A., & Flickinger, D. (2002). Multiword expressions: A pain in the neck for nlp. In Proceedings of the Third International Conference on Computational Linguistics and Intelligent Text Processing, CICLing \u201902, p. 1-15. Springer, Berlin, Heidelberg.","DOI":"10.1007\/3-540-45715-1_1"},{"key":"9588_CR65","unstructured":"Schneider, N., Onuffer, S., Kazour, N., Danchik, E., Mordowanec, M. T., Conrad, H., & Smith, N. A. (2014). Comprehensive annotation of multiword expressions in a social web corpus. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC\u201914), pp. 455\u2013461. 
European Language Resources Association (ELRA), Reykjavik, Iceland."},{"key":"9588_CR66","first-page":"223","volume-title":"The psychology of word meaning","author":"PJ Schwanenflugel","year":"1991","unstructured":"Schwanenflugel, P. J. (1991). Why are abstract concepts hard to understand? In P. J. Schwanenflugel (Ed.), The psychology of word meaning (pp. 223\u2013250). Mahwah, NJ: Erlbaum."},{"issue":"5","key":"9588_CR67","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1016\/0749-596X(88)90022-8","volume":"27","author":"PJ Schwanenflugel","year":"1988","unstructured":"Schwanenflugel, P. J., Harnishfeger, K. K., & Stowe, R. W. (1988). Context availability and lexical decisions for abstract and concrete words. Journal of Memory and Language, 27(5), 499\u2013520.","journal-title":"Journal of Memory and Language"},{"issue":"6","key":"9588_CR68","doi-asserted-by":"publisher","first-page":"615","DOI":"10.1016\/0028-3932(82)90061-6","volume":"20","author":"J Segui","year":"1982","unstructured":"Segui, J., Mehler, J., Frauenfelder, U., & Morton, J. (1982). The word frequency effect and lexical access. Neuropsychologia, 20(6), 615\u2013627.","journal-title":"Neuropsychologia"},{"issue":"3\u20134","key":"9588_CR69","doi-asserted-by":"publisher","first-page":"591","DOI":"10.1093\/biomet\/52.3-4.591","volume":"52","author":"SS Shapiro","year":"1965","unstructured":"Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples)$$^{\\dagger }$$. Biometrika, 52(3\u20134), 591\u2013611. https:\/\/doi.org\/10.1093\/biomet\/52.3-4.591","journal-title":"Biometrika"},{"key":"9588_CR70","unstructured":"Shardlow, M. (2013). A comparison of techniques to automatically identify complex words. In 51st Annual Meeting of the Association for Computational Linguistics Proceedings of the Student Research Workshop, pp. 103\u2013109. Association for Computational Linguistics, Sofia, Bulgaria."},{"key":"9588_CR71","unstructured":"Shardlow, M. 
(2013). The CW corpus: A new resource for evaluating the identification of complex words. In The Second Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR 2013). Association for Computational Linguistics, Sofia, Bulgaria."},{"key":"9588_CR72","unstructured":"Shardlow, M., Cooper, M., & Zampieri, M. (2020). CompLex\u2014A new corpus for lexical complexity prediction from Likert Scale data. In Proceedings of the 1st Workshop on Tools and Resources to Empower People with REAding DIfficulties (READI), pp. 57\u201362. European Language Resources Association, Marseille, France."},{"key":"9588_CR73","doi-asserted-by":"crossref","unstructured":"Shardlow, M., Evans, R., Paetzold, G., & Zampieri, M. (2021). Semeval-2021 task 1: Lexical complexity prediction. In Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval-2021).","DOI":"10.18653\/v1\/2021.semeval-1.1"},{"key":"9588_CR74","unstructured":"Sanjay, S. P., & Soman, K. P. (2016). AmritaCEN at SemEval-2016 Task 11: Complex Word Identification using Word Embedding. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 1022\u20131027. Association for Computational Linguistics, San Diego, California."},{"key":"9588_CR75","doi-asserted-by":"publisher","first-page":"226","DOI":"10.1016\/j.jecp.2018.09.007","volume":"178","author":"LM Steacy","year":"2019","unstructured":"Steacy, L. M., & Compton, D. L. (2019). Examining the role of imageability and regularity in word reading accuracy and learning efficiency among first and second graders at risk for reading disabilities. Journal of Experimental Child Psychology, 178, 226\u2013250.","journal-title":"Journal of Experimental Child Psychology"},{"key":"9588_CR76","unstructured":"Strohmaier, D., Gooding, S., Taslimipoor, S., & Kochmar, E. (2020). SeCoDa: Sense complexity dataset. In Proceedings of the 12th Language Resources and Evaluation Conference, pp. 5962\u20135967. 
European Language Resources Association, Marseille, France. https:\/\/aclanthology.org\/2020.lrec-1.730"},{"key":"9588_CR77","unstructured":"Svartvik, J., & Quirk, R. (Eds.). (1980). Handbook of semantic word norms. Lund: Liver\/Gleerups."},{"key":"9588_CR78","unstructured":"Thorndike, E. L., & Lorge, I. (1944). The teacher\u2019s word book of 30,000 words. Teacher\u2019s College: Columbia University, New York, NY, USA."},{"key":"9588_CR79","volume-title":"Handbook of semantic word norms","author":"MP Toglia","year":"1978","unstructured":"Toglia, M. P., & Battig, W. F. (1978). Handbook of semantic word norms. Hillsdale, NJ, USA: Erlbaum."},{"key":"9588_CR80","doi-asserted-by":"crossref","unstructured":"Wani, N., Mathias, S., Gajjam, J. A., & Bhattacharyya, P. (2018). The Whole is Greater than the Sum of its Parts: Towards the Effectiveness of Voting Ensemble Classifiers for Complex Word Identification. In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, New Orleans, United States.","DOI":"10.18653\/v1\/W18-0522"},{"key":"9588_CR81","doi-asserted-by":"publisher","first-page":"6","DOI":"10.3758\/BF03202594","volume":"20","author":"M Wilson","year":"1988","unstructured":"Wilson, M. (1988). MRC psycholinguistic database: Machine-usable dictionary, version 2.00. Behavior Research Methods, Instruments, & Computers, 20, 6\u201310.","journal-title":"Behavior Research Methods, Instruments, & Computers"},{"key":"9588_CR82","doi-asserted-by":"crossref","unstructured":"Wr\u00f3bel, K. (2016). PLUJAGH at SemEval-2016 Task 11: Simple System for Complex Word Identification. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 953\u2013957. 
Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1146"},{"key":"9588_CR83","doi-asserted-by":"publisher","unstructured":"Yaneva, V., Or\u0103san, C., Evans, R., & Rohanian, O. (2017). Combining multiple corpora for readability assessment for people with cognitive disabilities. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, pp. 121\u2013132. Association for Computational Linguistics, Copenhagen, Denmark. https:\/\/doi.org\/10.18653\/v1\/W17-5013","DOI":"10.18653\/v1\/W17-5013"},{"key":"9588_CR84","doi-asserted-by":"crossref","unstructured":"Yimam, S.M., Biemann, C., Malmasi, S., Paetzold, G., Specia, L., \u0160tajner, S., Tack, A., & Zampieri, M. (2018). A report on the complex word identification shared task 2018. In Proceedings of BEA.","DOI":"10.18653\/v1\/W18-0507"},{"key":"9588_CR85","doi-asserted-by":"crossref","unstructured":"Yimam, S.M., \u0160tajner, S., Riedl, M., & Biemann, C. (2017). Multilingual and Cross-Lingual Complex Word Identification. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pp. 813\u2013822. INCOMA Ltd., Varna, Bulgaria.","DOI":"10.26615\/978-954-452-049-6_104"},{"issue":"11","key":"9588_CR86","doi-asserted-by":"publisher","first-page":"3002","DOI":"10.1523\/JNEUROSCI.5295-04.2005","volume":"25","author":"AP Yonelinas","year":"2005","unstructured":"Yonelinas, A. P., Otten, L. J., Shaw, K. N., & Rugg, M. D. (2005). Separating the brain regions involved in recollection and familiarity in recognition memory. Journal of Neuroscience, 25(11), 3002\u20133008.","journal-title":"Journal of Neuroscience"},{"key":"9588_CR87","unstructured":"Zampieri, M., Malmasi, S., Paetzold, G., & Specia, L. (2017). Complex Word Identification: Challenges in Data Annotation and System Performance. 
In Proceedings of the 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017, pp. 59\u201363). Asian Federation of Natural Language Processing, Taipei, Taiwan."},{"key":"9588_CR88","doi-asserted-by":"crossref","unstructured":"Zampieri, M., Tan, L., & van Genabith, J. (2016). MacSaar at SemEval-2016 Task 11: Zipfian and Character Features for ComplexWord Identification. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016, pp. 1001\u20131005). Association for Computational Linguistics, San Diego, California.","DOI":"10.18653\/v1\/S16-1155"}],"container-title":["Language Resources and Evaluation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10579-022-09588-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10579-022-09588-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10579-022-09588-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,31]],"date-time":"2022-10-31T13:22:08Z","timestamp":1667222528000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10579-022-09588-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,23]]},"references-count":88,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["9588"],"URL":"https:\/\/doi.org\/10.1007\/s10579-022-09588-2","relation":{},"ISSN":["1574-020X","1574-0218"],"issn-type":[{"value":"1574-020X","type":"print"},{"value":"1574-0218","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,23]]},"assertion":[{"value":"7 March 
2022","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 March 2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}