{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T09:21:15Z","timestamp":1770542475162,"version":"3.49.0"},"reference-count":64,"publisher":"Springer Science and Business Media LLC","issue":"8","license":[{"start":{"date-parts":[[2023,3,8]],"date-time":"2023-03-08T00:00:00Z","timestamp":1678233600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,3,8]],"date-time":"2023-03-08T00:00:00Z","timestamp":1678233600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int. J. Mach. Learn. &amp; Cyber."],"published-print":{"date-parts":[[2023,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Today, we have access to a vast data amount, especially on the internet. Online news agencies play a vital role in this data generation, but most of their data is unstructured, requiring an enormous effort to extract important information. Thus, automated intelligent event detection mechanisms are invaluable to the community. In this research, we focus on identifying event details at the sentence and token levels from news articles, considering their fine granularity. Previous research has proposed various approaches ranging from traditional machine learning to deep learning, targeting event detection at these levels. Among these approaches, transformer-based approaches performed best, utilising transformers\u2019 transferability and context awareness, and achieved state-of-the-art results. However, they considered sentence and token level tasks as separate tasks even though their interconnections can be utilised for mutual task improvements. 
To fill this gap, we propose a novel learning strategy named <jats:italic>Two-phase Transfer Learning (TTL)<\/jats:italic> based on transformers, which allows the model to utilise the knowledge from a task at a particular data granularity for another task at different data granularity, and evaluate its performance in sentence and token level event detection. Also, we empirically evaluate how the event detection performance can be improved for different languages (high- and low-resource), involving monolingual and multilingual pre-trained transformers and language-based learning strategies along with the proposed learning strategy. Our findings mainly indicate the effectiveness of multilingual models in low-resource language event detection. Also, TTL can further improve model performance, depending on the involved tasks\u2019 learning order and their relatedness concerning final predictions.<\/jats:p>","DOI":"10.1007\/s13042-023-01795-9","type":"journal-article","created":{"date-parts":[[2023,3,8]],"date-time":"2023-03-08T05:03:24Z","timestamp":1678251804000},"page":"2739-2760","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["TTL: transformer-based two-phase transfer learning for cross-lingual news event detection"],"prefix":"10.1007","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4609-5001","authenticated-orcid":false,"given":"Hansi","family":"Hettiarachchi","sequence":"first","affiliation":[]},{"given":"Mariam","family":"Adedoyin-Olowe","sequence":"additional","affiliation":[]},{"given":"Jagdev","family":"Bhogal","sequence":"additional","affiliation":[]},{"given":"Mohamed Medhat","family":"Gaber","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,3,8]]},"reference":[{"key":"1795_CR1","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1007\/s10994-021-05988-7","volume":"111","author":"H 
Hettiarachchi","year":"2022","unstructured":"Hettiarachchi H, Adedoyin-Olowe M, Bhogal J, Gaber MM (2022) Embed2Detect: temporally clustered embedded words for event detection in social media. Mach Learn 111:49\u201387. https:\/\/doi.org\/10.1007\/s10994-021-05988-7","journal-title":"Mach Learn"},{"key":"1795_CR2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2020.106492","volume":"210","author":"A Balali","year":"2020","unstructured":"Balali A, Asadpour M, Campos R, Jatowt A (2020) Joint event extraction along shortest dependency paths using graph convolutional networks. Knowl-Based Syst 210:106492. https:\/\/doi.org\/10.1016\/j.knosys.2020.106492","journal-title":"Knowl-Based Syst"},{"key":"1795_CR3","doi-asserted-by":"crossref","unstructured":"Sha L, Qian F, Chang B, Sui Z (2018) Jointly extracting event triggers and arguments by dependency-bridge RNN and tensor-based argument interaction. In: Proceedings of the AAAI conference on artificial intelligence, vol 32(1)","DOI":"10.1609\/aaai.v32i1.12034"},{"issue":"2","key":"1795_CR4","doi-asserted-by":"publisher","first-page":"308","DOI":"10.1162\/dint_a_00092","volume":"3","author":"A H\u00fcrriyeto\u011flu","year":"2021","unstructured":"H\u00fcrriyeto\u011flu A, Y\u00f6r\u00fck E, Mutlu O, Duru\u015fan F, Yoltar \u00c7, Y\u00fcret D, G\u00fcrel B (2021) Cross-context news corpus for protest event-related knowledge base construction. Data Intell 3(2):308\u2013335. https:\/\/doi.org\/10.1162\/dint_a_00092","journal-title":"Data Intell"},{"key":"1795_CR5","doi-asserted-by":"publisher","unstructured":"Hettiarachchi H, Adedoyin-Olowe M, Bhogal J, Gaber MM (2021) DAAI at CASE 2021 task 1: Transformer-based multilingual socio-political and crisis event detection. In: Proceedings of the 4th workshop on challenges and applications of automated extraction of socio-political events from text (CASE 2021), pp 120\u2013130. Association for Computational Linguistics. https:\/\/doi.org\/10.18653\/v1\/2021.case-1.16. 
https:\/\/aclanthology.org\/2021.case-1.16","DOI":"10.18653\/v1\/2021.case-1.16"},{"issue":"2","key":"1795_CR6","doi-asserted-by":"publisher","first-page":"132","DOI":"10.1007\/s10791-009-9113-0","volume":"13","author":"M Naughton","year":"2010","unstructured":"Naughton M, Stokes N, Carthy J (2010) Sentence-level event classification in unstructured texts. Inf Retr 13(2):132\u2013156. https:\/\/doi.org\/10.1007\/s10791-009-9113-0","journal-title":"Inf Retr"},{"key":"1795_CR7","unstructured":"Hong Y, Zhang J, Ma B, Yao J, Zhou G, Zhu Q (2011) Using cross-entity inference to improve event extraction. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies. Association for Computational Linguistics, Portland, Oregon, USA, pp 1127\u20131136. https:\/\/aclanthology.org\/P11-1113"},{"key":"1795_CR8","unstructured":"Chen C, Ng V (2012) Joint modeling for Chinese event extraction with rich linguistic features. In: Proceedings of COLING 2012. The COLING 2012 Organizing Committee, Mumbai, India, pp 529\u2013544. https:\/\/aclanthology.org\/C12-1033"},{"key":"1795_CR9","doi-asserted-by":"publisher","unstructured":"Chen Y, Xu L, Liu K, Zeng D, Zhao J (2015) Event extraction via dynamic multi-pooling convolutional neural networks. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (vol 1: long papers), pp 167\u2013176. Association for Computational Linguistics, Beijing, China. https:\/\/doi.org\/10.3115\/v1\/P15-1017. https:\/\/aclanthology.org\/P15-1017","DOI":"10.3115\/v1\/P15-1017"},{"key":"1795_CR10","doi-asserted-by":"publisher","unstructured":"Hassan A, Mahmood A (2017) Deep learning for sentence classification. In: 2017 IEEE long island systems, applications and technology conference (LISAT), pp 1\u20135. 
https:\/\/doi.org\/10.1109\/LISAT.2017.8001979","DOI":"10.1109\/LISAT.2017.8001979"},{"key":"1795_CR11","doi-asserted-by":"publisher","unstructured":"Pandey C, Ibrahim Z, Wu H, Iqbal E, Dobson R (2017) Improving RNN with Attention and embedding for adverse drug reactions. In: Proceedings of the 2017 international conference on digital health. DH \u201917. Association for Computing Machinery, New York, NY, USA, pp 67\u201371. https:\/\/doi.org\/10.1145\/3079452.3079501","DOI":"10.1145\/3079452.3079501"},{"key":"1795_CR12","doi-asserted-by":"publisher","unstructured":"Liu S, Li Y, Zhang F, Yang T, Zhou X (2019) Event detection without triggers. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, vol 1 (long and short papers). Association for Computational Linguistics, Minneapolis, Minnesota, pp 735\u2013744. https:\/\/doi.org\/10.18653\/v1\/N19-1080. https:\/\/aclanthology.org\/N19-1080","DOI":"10.18653\/v1\/N19-1080"},{"key":"1795_CR13","unstructured":"Alyafeai Z, AlShaibani MS, Ahmad I (2020) A survey on transfer learning in natural language processing. arXiv preprint arXiv:2007.04239"},{"key":"1795_CR14","unstructured":"Dumoulin V, Houlsby N, Evci U, Zhai X, Goroshin R, Gelly S, Larochelle H (2021) Comparing transfer and meta learning approaches on a unified few-shot classification benchmark. arXiv preprint arXiv:2104.02638"},{"key":"1795_CR15","doi-asserted-by":"publisher","unstructured":"Chowdhury A, Chaudhari D, Chaudhuri S, Jermaine C (2022) Meta-meta classification for one-shot learning. In: 2022 IEEE\/CVF winter conference on applications of computer vision (WACV), pp 1628\u20131637. https:\/\/doi.org\/10.1109\/WACV51458.2022.00169","DOI":"10.1109\/WACV51458.2022.00169"},{"key":"1795_CR16","doi-asserted-by":"publisher","unstructured":"Ruder S, Peters M.E, Swayamdipta S, Wolf T (2019) Transfer learning in natural language processing. 
In: Proceedings of the 2019 conference of the north american chapter of the association for computational linguistics: tutorials. Association for Computational Linguistics, Minneapolis, Minnesota, pp 15\u201318. https:\/\/doi.org\/10.18653\/v1\/N19-5004. https:\/\/aclanthology.org\/N19-5004","DOI":"10.18653\/v1\/N19-5004"},{"key":"1795_CR17","doi-asserted-by":"publisher","unstructured":"Chowdhury A, Jiang M, Chaudhuri S, Jermaine C (2021) Few-shot image classification: just use a library of pre-trained feature extractors and a simple classifier. In: 2021 IEEE\/CVF international conference on computer vision (ICCV), pp 9425\u20139434. https:\/\/doi.org\/10.1109\/ICCV48922.2021.00931","DOI":"10.1109\/ICCV48922.2021.00931"},{"key":"1795_CR18","doi-asserted-by":"publisher","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, vol 1 (long and short papers). Association for Computational Linguistics, Minneapolis, Minnesota, pp 4171\u20134186. https:\/\/doi.org\/10.18653\/v1\/N19-1423. https:\/\/aclanthology.org\/N19-1423","DOI":"10.18653\/v1\/N19-1423"},{"key":"1795_CR19","doi-asserted-by":"publisher","unstructured":"Awasthy P, Ni J, Barker K, Florian R (2021) IBM MNLP IE at CASE 2021 task 1: multigranular and multilingual event detection on protest news. In: Proceedings of the 4th workshop on challenges and applications of automated extraction of socio-political events from text (CASE 2021). Association for Computational Linguistics, pp 138\u2013146. https:\/\/doi.org\/10.18653\/v1\/2021.case-1.18. https:\/\/aclanthology.org\/2021.case-1.18","DOI":"10.18653\/v1\/2021.case-1.18"},{"key":"1795_CR20","unstructured":"Lefever E, Hoste V (2016) A classification-based approach to economic event detection in Dutch news text. 
In: Proceedings of the tenth international conference on language resources and evaluation (LREC\u201916). European Language Resources Association (ELRA), Portoro\u017e, Slovenia, pp 330\u2013335. https:\/\/aclanthology.org\/L16-1051"},{"key":"1795_CR21","doi-asserted-by":"publisher","unstructured":"Basile A, Caselli T (2020) Protest event detection: when task-specific models outperform an event-driven method. In: Lecture notes in computer science. Springer, pp 97\u2013111. https:\/\/doi.org\/10.1007\/978-3-030-58219-7_9","DOI":"10.1007\/978-3-030-58219-7_9"},{"key":"1795_CR22","doi-asserted-by":"publisher","first-page":"13949","DOI":"10.1109\/ACCESS.2018.2814818","volume":"6","author":"A Hassan","year":"2018","unstructured":"Hassan A, Mahmood A (2018) Convolutional recurrent deep learning model for sentence classification. IEEE Access 6:13949\u201313957. https:\/\/doi.org\/10.1109\/ACCESS.2018.2814818","journal-title":"IEEE Access"},{"issue":"8","key":"1795_CR23","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735\u20131780. https:\/\/doi.org\/10.1162\/neco.1997.9.8.1735","journal-title":"Neural Comput"},{"issue":"1","key":"1795_CR24","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1109\/72.554195","volume":"8","author":"S Lawrence","year":"1997","unstructured":"Lawrence S, Giles CL, Tsoi AC, Back AD (1997) Face recognition: a convolutional neural-network approach. IEEE Trans Neural Netw 8(1):98\u2013113. https:\/\/doi.org\/10.1109\/72.554195","journal-title":"IEEE Trans Neural Netw"},{"key":"1795_CR25","unstructured":"Huynh T, He Y, Willis A, Rueger S (2016) Adverse drug reaction classification with deep neural networks. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers. 
The COLING 2016 Organizing Committee, Osaka, Japan, pp 877\u2013887. https:\/\/aclanthology.org\/C16-1084"},{"key":"1795_CR26","doi-asserted-by":"publisher","unstructured":"G\u00fcrel A, Emin E (2021) ALEM at CASE 2021 task 1: multilingual text classification on news articles. In: Proceedings of the 4th workshop on challenges and applications of automated extraction of socio-political events from text (CASE 2021). Association for Computational Linguistics, pp 147\u2013151. https:\/\/doi.org\/10.18653\/v1\/2021.case-1.19. https:\/\/aclanthology.org\/2021.case-1.19","DOI":"10.18653\/v1\/2021.case-1.19"},{"key":"1795_CR27","doi-asserted-by":"publisher","unstructured":"Hu T, Team SN (2021) \u201cNo Conflict\u201d at CASE 2021 task 1: pretraining for sentence-level protest event detection. In: Proceedings of the 4th workshop on challenges and applications of automated extraction of socio-political events from text (CASE 2021). Association for Computational Linguistics, pp 152\u2013160. https:\/\/doi.org\/10.18653\/v1\/2021.case-1.20. https:\/\/aclanthology.org\/2021.case-1.20","DOI":"10.18653\/v1\/2021.case-1.20"},{"key":"1795_CR28","unstructured":"Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) Roberta: a robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692"},{"key":"1795_CR29","doi-asserted-by":"publisher","unstructured":"Conneau A, Khandelwal K, Goyal N, Chaudhary V, Wenzek G, Guzm\u00e1n F, Grave E, Ott M, Zettlemoyer L, Stoyanov V (2020) Unsupervised cross-lingual representation learning at scale. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 8440\u20138451. Association for Computational Linguistics. https:\/\/doi.org\/10.18653\/v1\/2020.acl-main.747. 
https:\/\/aclanthology.org\/2020.acl-main.747","DOI":"10.18653\/v1\/2020.acl-main.747"},{"key":"1795_CR30","doi-asserted-by":"publisher","unstructured":"Re F, Vegh D, Atzenhofer D, Team SN (2021) \u201cDaDeFrNi\u201d at CASE 2021 task 1: document and sentence classification for protest event detection. In: Proceedings of the 4th workshop on challenges and applications of automated extraction of socio-political events from text (CASE 2021). Association for Computational Linguistics, pp 171\u2013178. https:\/\/doi.org\/10.18653\/v1\/2021.case-1.22. https:\/\/aclanthology.org\/2021.case-1.22","DOI":"10.18653\/v1\/2021.case-1.22"},{"key":"1795_CR31","doi-asserted-by":"publisher","unstructured":"Kalyan P, Reddy D, Hande A, Priyadharshini R, Sakuntharaj R, Chakravarthi BR (2021) IIITT at CASE 2021 task 1: leveraging pretrained language models for multilingual protest detection. In: Proceedings of the 4th workshop on challenges and applications of automated extraction of socio-political events from text (CASE 2021). Association for Computational Linguistics, pp 98\u2013104. https:\/\/doi.org\/10.18653\/v1\/2021.case-1.13. https:\/\/aclanthology.org\/2021.case-1.13","DOI":"10.18653\/v1\/2021.case-1.13"},{"key":"1795_CR32","doi-asserted-by":"publisher","unstructured":"\u00c7elik F, Dalk\u0131l\u0131\u00e7 T, Beyhan F, Yeniterzi R (2021) SU-NLP at CASE 2021 task 1: protest news detection for English. In: Proceedings of the 4th workshop on challenges and applications of automated extraction of socio-political events from text (CASE 2021). Association for Computational Linguistics, pp 131\u2013137. https:\/\/doi.org\/10.18653\/v1\/2021.case-1.17. https:\/\/aclanthology.org\/2021.case-1.17","DOI":"10.18653\/v1\/2021.case-1.17"},{"key":"1795_CR33","doi-asserted-by":"publisher","unstructured":"H\u00fcrriyeto\u011flu A, Mutlu O, Y\u00f6r\u00fck E, Liza FF, Kumar R, Ratan S (2021) Multilingual protest news detection\u2014shared task 1, CASE 2021. 
In: Proceedings of the 4th workshop on challenges and applications of automated extraction of socio-political events from text (CASE 2021), pp 79\u201391. Association for Computational Linguistics. https:\/\/doi.org\/10.18653\/v1\/2021.case-1.11. https:\/\/aclanthology.org\/2021.case-1.11","DOI":"10.18653\/v1\/2021.case-1.11"},{"key":"1795_CR34","unstructured":"Li Q, Ji H, Huang L (2013) Joint event extraction via structured prediction with global features. In: Proceedings of the 51st annual meeting of the association for computational linguistics (volume 1: long papers). Association for Computational Linguistics, Sofia, Bulgaria, pp 73\u201382. https:\/\/aclanthology.org\/P13-1008"},{"key":"1795_CR35","doi-asserted-by":"publisher","unstructured":"M\u2019hamdi M, Freedman M, May J (2019) Contextualized cross-lingual event trigger extraction with minimal resources. In: Proceedings of the 23rd conference on computational natural language learning (CoNLL). Association for Computational Linguistics, Hong Kong, China, pp 656\u2013665. https:\/\/doi.org\/10.18653\/v1\/K19-1061. https:\/\/aclanthology.org\/K19-1061","DOI":"10.18653\/v1\/K19-1061"},{"issue":"5","key":"1795_CR36","doi-asserted-by":"publisher","first-page":"4987","DOI":"10.1007\/s10489-021-02695-7","volume":"52","author":"S Lu","year":"2022","unstructured":"Lu S, Li S, Xu Y, Wang K, Lan H, Guo J (2022) Event detection from text using path-aware graph convolutional network. Appl Intell 52(5):4987\u20134998. https:\/\/doi.org\/10.1007\/s10489-021-02695-7","journal-title":"Appl Intell"},{"key":"1795_CR37","doi-asserted-by":"publisher","unstructured":"Nguyen TH, Cho K, Grishman R (2016) Joint event extraction via recurrent neural networks. In: Proceedings of the 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies. Association for Computational Linguistics, San Diego, California, pp 300\u2013309. https:\/\/doi.org\/10.18653\/v1\/N16-1034. 
https:\/\/aclanthology.org\/N16-1034","DOI":"10.18653\/v1\/N16-1034"},{"key":"1795_CR38","doi-asserted-by":"publisher","unstructured":"Yang S, Feng D, Qiao L, Kan Z, Li D (2019) Exploring pre-trained language models for event extraction and generation. In: Proceedings of the 57th annual meeting of the association for computational linguistics. Association for Computational Linguistics, Florence, Italy, pp 5284\u20135294. https:\/\/doi.org\/10.18653\/v1\/P19-1522. https:\/\/aclanthology.org\/P19-1522","DOI":"10.18653\/v1\/P19-1522"},{"key":"1795_CR39","doi-asserted-by":"publisher","unstructured":"Vivek\u00a0Kalyan S, Paul T, Shaun T, Andrews M (2021) Handshakes AI research at CASE 2021 task 1: exploring different approaches for multilingual tasks. In: Proceedings of the 4th workshop on challenges and applications of automated extraction of socio-political events from text (CASE 2021). Association for Computational Linguistics, pp 92\u201397. https:\/\/doi.org\/10.18653\/v1\/2021.case-1.12. https:\/\/aclanthology.org\/2021.case-1.12","DOI":"10.18653\/v1\/2021.case-1.12"},{"key":"1795_CR40","doi-asserted-by":"publisher","unstructured":"Nugent T, Petroni F, Raman N, Carstens L, Leidner JL (2017) A comparison of classification models for natural disaster and critical event detection from news. In: 2017 IEEE international conference on big data (big data), pp 3750\u20133759. https:\/\/doi.org\/10.1109\/BigData.2017.8258374","DOI":"10.1109\/BigData.2017.8258374"},{"key":"1795_CR41","doi-asserted-by":"publisher","unstructured":"Allan J, Papka R, Lavrenko V (1998) On-line new event detection and tracking. In: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval. SIGIR \u201998. Association for Computing Machinery, New York, NY, USA, pp 37\u201345. 
https:\/\/doi.org\/10.1145\/290941.290954","DOI":"10.1145\/290941.290954"},{"key":"1795_CR42","doi-asserted-by":"publisher","unstructured":"Lin Y, Ji H, Huang F, Wu L (2020) A joint neural model for information extraction with global features. In: Proceedings of the 58th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics, pp 7999\u20138009. https:\/\/doi.org\/10.18653\/v1\/2020.acl-main.713. https:\/\/aclanthology.org\/2020.acl-main.713","DOI":"10.18653\/v1\/2020.acl-main.713"},{"key":"1795_CR43","doi-asserted-by":"publisher","unstructured":"Ranasinghe T, Orasan C, Mitkov R (2020) TransQuest: translation quality estimation with cross-lingual transformers. In: Proceedings of the 28th international conference on computational linguistics. International Committee on Computational Linguistics, Barcelona, Spain, pp 5070\u20135081. https:\/\/doi.org\/10.18653\/v1\/2020.coling-main.445. https:\/\/aclanthology.org\/2020.coling-main.445","DOI":"10.18653\/v1\/2020.coling-main.445"},{"key":"1795_CR44","doi-asserted-by":"publisher","first-page":"1833","DOI":"10.1007\/s13042-021-01491-6","volume":"13","author":"C Gao","year":"2022","unstructured":"Gao C, Zhang X, Liu H, Yun W (2022) A joint extraction model of entities and relations based on relation decomposition. Int J Mach Learn Cybernet 13:1833\u20131845. https:\/\/doi.org\/10.1007\/s13042-021-01491-6","journal-title":"Int J Mach Learn Cybernet"},{"key":"1795_CR45","volume-title":"Advances in neural information processing systems","author":"A Vaswani","year":"2017","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, vol 30. 
Curran Associates Inc., New York"},{"issue":"1","key":"1795_CR46","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40537-016-0043-6","volume":"3","author":"K Weiss","year":"2016","unstructured":"Weiss K, Khoshgoftaar TM, Wang D (2016) A survey of transfer learning. J Big data 3(1):1\u201340. https:\/\/doi.org\/10.1186\/s40537-016-0043-6","journal-title":"J Big data"},{"issue":"5","key":"1795_CR47","doi-asserted-by":"publisher","first-page":"3246","DOI":"10.1109\/TGRS.2019.2951445","volume":"58","author":"X He","year":"2020","unstructured":"He X, Chen Y, Ghamisi P (2020) Heterogeneous transfer learning for hyperspectral image classification based on convolutional neural network. IEEE Trans Geosci Remote Sens 58(5):3246\u20133263. https:\/\/doi.org\/10.1109\/TGRS.2019.2951445","journal-title":"IEEE Trans Geosci Remote Sens"},{"issue":"1","key":"1795_CR48","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1109\/JPROC.2020.3004555","volume":"109","author":"F Zhuang","year":"2021","unstructured":"Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, Zhu H, Xiong H, He Q (2021) A comprehensive survey on transfer learning. Proc IEEE 109(1):43\u201376. https:\/\/doi.org\/10.1109\/JPROC.2020.3004555","journal-title":"Proc IEEE"},{"issue":"1","key":"1795_CR49","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40537-017-0089-0","volume":"4","author":"O Day","year":"2017","unstructured":"Day O, Khoshgoftaar TM (2017) A survey on heterogeneous transfer learning. J Big Data 4(1):1\u201342. https:\/\/doi.org\/10.1186\/s40537-017-0089-0","journal-title":"J Big Data"},{"key":"1795_CR50","doi-asserted-by":"publisher","unstructured":"Shi X, Liu Q, Fan W, Yu PS, Zhu R (2010) Transfer learning on heterogenous feature spaces via spectral transformation. In: 2010 IEEE international conference on data mining, pp 1049\u20131054. 
https:\/\/doi.org\/10.1109\/ICDM.2010.65","DOI":"10.1109\/ICDM.2010.65"},{"key":"1795_CR51","doi-asserted-by":"crossref","unstructured":"Moon S, Carbonell J (2016) Proactive transfer learning for heterogeneous feature and label spaces. In: Joint European conference on machine learning and knowledge discovery in databases. Springer, pp 706\u2013721","DOI":"10.1007\/978-3-319-46227-1_44"},{"key":"1795_CR52","unstructured":"Cruz JCB, Tan JA, Cheng C (2020) Localization of fake news detection via multitask transfer learning. In: Proceedings of the 12th language resources and evaluation conference. European Language Resources Association, Marseille, France, pp 2596\u20132604. https:\/\/aclanthology.org\/2020.lrec-1.316"},{"key":"1795_CR53","doi-asserted-by":"crossref","unstructured":"Mathew B, Saha P, Yimam S.M, Biemann C, Goyal P, Mukherjee A (2021) Hatexplain: a benchmark dataset for explainable hate speech detection. In: Proceedings of the AAAI conference on artificial intelligence, vol 35, pp 14867\u201314875","DOI":"10.1609\/aaai.v35i17.17745"},{"issue":"12","key":"1795_CR54","doi-asserted-by":"publisher","first-page":"5586","DOI":"10.1109\/TKDE.2021.3070203","volume":"34","author":"Y Zhang","year":"2021","unstructured":"Zhang Y, Yang Q (2021) A survey on multi-task learning. IEEE Trans Knowl Data Eng 34(12):5586\u20135609. https:\/\/doi.org\/10.1109\/TKDE.2021.3070203","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"1795_CR55","doi-asserted-by":"publisher","unstructured":"Hettiarachchi H, Ranasinghe T (2021) TransWiC at SemEval-2021 task 2: transformer-based multilingual and cross-lingual word-in-context disambiguation. In: Proceedings of the 15th international workshop on semantic evaluation (SemEval-2021). Association for Computational Linguistics, pp 771\u2013779. https:\/\/doi.org\/10.18653\/v1\/2021.semeval-1.102. 
https:\/\/aclanthology.org\/2021.semeval-1.102","DOI":"10.18653\/v1\/2021.semeval-1.102"},{"key":"1795_CR56","doi-asserted-by":"publisher","unstructured":"Ranasinghe T, Zampieri M (2020) Multilingual offensive language identification with cross-lingual embeddings. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, pp 5838\u20135844. https:\/\/doi.org\/10.18653\/v1\/2020.emnlp-main.470. https:\/\/aclanthology.org\/2020.emnlp-main.470","DOI":"10.18653\/v1\/2020.emnlp-main.470"},{"issue":"4","key":"1795_CR57","doi-asserted-by":"publisher","first-page":"399","DOI":"10.1007\/s10009-020-00554-3","volume":"22","author":"DB Abeywickrama","year":"2020","unstructured":"Abeywickrama DB, Bicocchi N, Mamei M, Zambonelli F (2020) The sota approach to engineering collective adaptive systems. Int J Softw Tools Technol Transf 22(4):399\u2013415. https:\/\/doi.org\/10.1007\/s10009-020-00554-3","journal-title":"Int J Softw Tools Technol Transf"},{"key":"1795_CR58","doi-asserted-by":"crossref","unstructured":"Tjong Kim\u00a0Sang EF, De\u00a0Meulder F (2003) Introduction to the CoNLL-2003 shared task: language-independent named entity recognition. In: Proceedings of the seventh conference on natural language learning at HLT-NAACL 2003, pp 142\u2013147. https:\/\/aclanthology.org\/W03-0419","DOI":"10.3115\/1119176.1119195"},{"key":"1795_CR59","doi-asserted-by":"publisher","unstructured":"Souza F, Nogueira R, Lotufo R (2020) BERTimbau: pretrained BERT models for Brazilian Portuguese. In: 9th Brazilian conference on intelligent systems, BRACIS, Rio Grande do Sul, Brazil, October 20\u201323 (to appear). Springer, Berlin, Heidelberg, pp 403\u2013417. https:\/\/doi.org\/10.1007\/978-3-030-61377-8_28","DOI":"10.1007\/978-3-030-61377-8_28"},{"key":"1795_CR60","unstructured":"Canete J, Chaperon G, Fuentes R, Ho J.-H, Kang H, P\u00e9rez J (2020) Spanish pre-trained bert model and evaluation data. 
In: PML4DC at ICLR 2020"},{"key":"1795_CR61","doi-asserted-by":"publisher","unstructured":"Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M, Davison J, Shleifer S, von Platen P, Ma C, Jernite Y, Plu J, Xu C, Le\u00a0Scao T, Gugger S, Drame M, Lhoest Q, Rush A (2020) Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations. Association for Computational Linguistics, pp 38\u201345. https:\/\/doi.org\/10.18653\/v1\/2020.emnlp-demos.6. https:\/\/aclanthology.org\/2020.emnlp-demos.6","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"1795_CR62","unstructured":"Wu Y, Schuster M, Chen Z, Le Q.V, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J, Shah A, Johnson M, Liu X, Kaiser L, Gouws S, Kato Y, Kudo T, Kazawa H, Stevens K, Kurian G, Patil N, Wang W, Young C, Smith J, Riesa J, Rudnick A, Vinyals O, Corrado G, Hughes M, Dean J (2016) Google\u2019s neural machine translation system: bridging the gap between human and machine translation. CoRR arXiv:1609.08144"},{"key":"1795_CR63","doi-asserted-by":"publisher","unstructured":"Kudo T, Richardson J (2018) SentencePiece: a simple and language independent subword tokenizer and detokenizer for neural text processing. In: Proceedings of the 2018 conference on empirical methods in natural language processing: system demonstrations. Association for Computational Linguistics, Brussels, Belgium, pp 66\u201371. https:\/\/doi.org\/10.18653\/v1\/D18-2012. https:\/\/aclanthology.org\/D18-2012","DOI":"10.18653\/v1\/D18-2012"},{"key":"1795_CR64","doi-asserted-by":"publisher","unstructured":"Hettiarachchi H, Ranasinghe T (2020) InfoMiner at WNUT-2020 task 2: transformer-based covid-19 informative tweet extraction. In: Proceedings of the sixth workshop on noisy user-generated text (W-NUT 2020). Association for Computational Linguistics, pp 359\u2013365. 
https:\/\/doi.org\/10.18653\/v1\/2020.wnut-1.49. https:\/\/aclanthology.org\/2020.wnut-1.49","DOI":"10.18653\/v1\/2020.wnut-1.49"}],"container-title":["International Journal of Machine Learning and Cybernetics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13042-023-01795-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s13042-023-01795-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13042-023-01795-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,17]],"date-time":"2023-06-17T08:27:55Z","timestamp":1686990475000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s13042-023-01795-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,8]]},"references-count":64,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2023,8]]}},"alternative-id":["1795"],"URL":"https:\/\/doi.org\/10.1007\/s13042-023-01795-9","relation":{},"ISSN":["1868-8071","1868-808X"],"issn-type":[{"value":"1868-8071","type":"print"},{"value":"1868-808X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,8]]},"assertion":[{"value":"7 June 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 January 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 March 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no competing interests 
to declare that are relevant to the content of this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"The codebase is publicly available on","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}}]}}