{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T10:29:31Z","timestamp":1774434571445,"version":"3.50.1"},"reference-count":76,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T00:00:00Z","timestamp":1771632000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T00:00:00Z","timestamp":1771891200000},"content-version":"vor","delay-in-days":3,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62071481"],"award-info":[{"award-number":["62071481"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2026,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Model merging offers a training-free solution to the prohibitive storage costs of the one-model-per-task paradigm. However, current merging techniques present a difficult trade-off. Creating a single model is storage-efficient but hurts performance, while adapter-based methods perform well but introduce a new storage burden. This work introduces CF-STAR, a framework that resolves this dilemma by creating highly compressible adapters. Our approach redefines the adapter itself. Instead of representing the deviation from a pre-trained model, our adapter is defined by the deviation from the multi-task average. We call this new representation the\n                    <jats:italic>centralized<\/jats:italic>\n                    task vector (CTV). 
This CTV represents a purer form of task-specific knowledge within the merging context, making it fundamentally more compressible. CF-STAR exploits this with a novel\n                    <jats:italic>low-rank plus sparse<\/jats:italic>\n                    decomposition tailored to this representation, capturing both global structure and critical details. Furthermore, the entire pipeline is designed to be synergistic with low-bit quantization, further enabling extreme compression. On diverse benchmarks spanning image classification, NLP, and dense prediction, CF-STAR sets a new state of the art on the accuracy-storage Pareto frontier, achieving up to\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$40\\times $$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mrow>\n                            <mml:mn>40<\/mml:mn>\n                            <mml:mo>\u00d7<\/mml:mo>\n                          <\/mml:mrow>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    adapter compression over strong baselines while maintaining competitive performance.\n                  <\/jats:p>","DOI":"10.1007\/s40747-026-02238-y","type":"journal-article","created":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T02:34:36Z","timestamp":1771641276000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["CF-STAR: Highly compressible adapters for model merging via centralized task 
vectors"],"prefix":"10.1007","volume":"12","author":[{"given":"Jialin","family":"Wu","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jian","family":"Yang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiajun","family":"Wen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Junjie","family":"Cao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhiyong","family":"Yu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2026,2,21]]},"reference":[{"key":"2238_CR1","doi-asserted-by":"crossref","unstructured":"Yuan L, Chen Y, Wang T, Yu W, Shi Y, Jiang Z-H, Tay FE, Feng J, Yan S (2021) Tokens-to-token vit: Training vision transformers from scratch on imagenet. In: Proceedings of the IEEE\/CVF International Conference on computer vision, pp. 558\u2013567","DOI":"10.1109\/ICCV48922.2021.00060"},{"key":"2238_CR2","unstructured":"Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) Roberta: a robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692"},{"issue":"2","key":"2238_CR3","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1145\/3605943","volume":"56","author":"B Min","year":"2024","unstructured":"Min B, Ross H, Sulem E, Veyseh APB, Nguyen TH, Sainz O, Agirre E, Heintz I, Roth D (2024) Recent advances in natural language processing via large pre-trained language models: a survey. ACM Comput Surv 56(2):30\u201313040. https:\/\/doi.org\/10.1145\/3605943","journal-title":"ACM Comput Surv"},{"key":"2238_CR4","doi-asserted-by":"publisher","unstructured":"Xin Y, Luo S, Zhou H, Du J, Liu X, Fan Y, Li Q, Du Y (2024) Parameter-efficient fine-tuning for pre-trained vision models: a survey. 
CoRR arXiv:abs\/2402.02242, https:\/\/doi.org\/10.48550\/ARXIV.2402.02242","DOI":"10.48550\/ARXIV.2402.02242"},{"key":"2238_CR5","doi-asserted-by":"publisher","unstructured":"Li W, Peng Y, Zhang M, Ding L, Hu H, Shen L (2023) Deep model fusion: a survey. CoRR arXiv:abs\/2309.15698, https:\/\/doi.org\/10.48550\/ARXIV.2309.15698","DOI":"10.48550\/ARXIV.2309.15698"},{"key":"2238_CR6","unstructured":"Yang E, Shen L, Guo G, Wang X, Cao X, Zhang J, Tao D (2024) Model merging in llms, mllms, and beyond: methods, theories, applications and opportunities. arXiv preprint arXiv:2408.07666"},{"key":"2238_CR7","unstructured":"Ilharco G, Ribeiro MT, Wortsman M, Schmidt L, Hajishirzi H, Farhadi A (2023) Editing models with task arithmetic. In: The Eleventh International Conference on learning representations, ICLR 2023. OpenReview.net, Kigali, Rwanda. https:\/\/openreview.net\/forum?id=6t0Kwf8-jrj. Accessed 13th Feb 2026"},{"key":"2238_CR8","unstructured":"Yadav P, Tam D, Choshen L, Raffel CA, Bansal M (2023) Ties-merging: Resolving interference when merging models. In: Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S (eds) Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10-16, 2023. http:\/\/papers.nips.cc\/paper_files\/paper\/2023\/hash\/1644c9af28ab7916874f6fd6228a9bcf-Abstract-Conference.html. Accessed 13th Feb 2026"},{"key":"2238_CR9","unstructured":"Wang K, Dimitriadis N, Ortiz-Jim\u00e9nez G, Fleuret F, Frossard P (2024) Localizing task information for improved model merging and compression. In: Forty-first International Conference on machine learning, ICML 2024. OpenReview.net, Vienna, Austria. https:\/\/openreview.net\/forum?id=DWT9uiGjxT. Accessed 13th Feb 2026"},{"key":"2238_CR10","unstructured":"Huang C, Ye P, Chen T, He T, Yue X, Ouyang W (2024) Emr-merging: Tuning-free high-performance model merging. 
In: Globersons A, Mackey L, Belgrave D, Fan A, Paquet U, Tomczak JM, Zhang C (eds) Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10-15, 2024. http:\/\/papers.nips.cc\/paper_files\/paper\/2024\/hash\/dda5cac5272a9bcd4bc73d90bc725ef1-Abstract-Conference.html. Accessed 13th Feb 2026"},{"key":"2238_CR11","unstructured":"Lu Z, Fan C, Wei W, Qu X, Chen D, Cheng Y (2024) Twin-merging: Dynamic integration of modular expertise in model merging. In: Globersons A, Mackey L, Belgrave D, Fan A, Paquet U, Tomczak JM, Zhang C (eds) Advances in neural information processing systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10-15, 2024. http:\/\/papers.nips.cc\/paper_files\/paper\/2024\/hash\/8fcd17eb91bae20d9826786d7d6be799-Abstract-Conference.html. Accessed 13th Feb 2026"},{"key":"2238_CR12","doi-asserted-by":"crossref","unstructured":"Qi B, Li F, Wang Z, Gao J, Li D, Ye P, Zhou B (2025) Less is more: efficient model merging with binary task switch. In: Proceedings of the Computer Vision and Pattern Recognition Conference, pp. 15265\u201315274","DOI":"10.1109\/CVPR52734.2025.01422"},{"key":"2238_CR13","unstructured":"Draxler F, Veschgini K, Salmhofer M, Hamprecht F (2018) Essentially no barriers in neural network energy landscape. In: International Conference on machine learning, pp. 1309\u20131318. PMLR"},{"key":"2238_CR14","unstructured":"Garipov T, Izmailov P, Podoprikhin D, Vetrov DP, Wilson AG (2018) Loss surfaces, mode connectivity, and fast ensembling of dnns. In: Advances in neural information processing systems 31"},{"key":"2238_CR15","first-page":"15300","volume":"33","author":"N Tatro","year":"2020","unstructured":"Tatro N, Chen P-Y, Das P, Melnyk I, Sattigeri P, Lai R (2020) Optimizing mode connectivity via neuron alignment. 
Adv Neural Inf Process Syst 33:15300\u201315311","journal-title":"Adv Neural Inf Process Syst"},{"key":"2238_CR16","first-page":"66727","volume":"36","author":"G Ortiz-Jimenez","year":"2023","unstructured":"Ortiz-Jimenez G, Favero A, Frossard P (2023) Task arithmetic in the tangent space: improved editing of pre-trained models. Adv Neural Inf Process Syst 36:66727\u201366754","journal-title":"Adv Neural Inf Process Syst"},{"key":"2238_CR17","unstructured":"Entezari R, Sedghi H, Saukh O, Neyshabur B (2022) The role of permutation invariance in linear mode connectivity of neural networks. In: The Tenth International Conference on learning representations, ICLR 2022. OpenReview.net, Virtual Event. https:\/\/openreview.net\/forum?id=dNigytemkL. Accessed 13th Feb 2026"},{"key":"2238_CR18","unstructured":"Ainsworth SK, Hayase J, Srinivasa SS (2023) Git re-basin: merging models modulo permutation symmetries. In: The Eleventh International Conference on learning representations, ICLR 2023. OpenReview.net, Kigali, Rwanda. https:\/\/openreview.net\/forum?id=CQsmMYmlP5T. Accessed 13th Feb 2026"},{"key":"2238_CR19","unstructured":"Stoica G, Bolya D, Bjorner J, Ramesh P, Hearn T, Hoffman J (2024) Zipit! merging models from different tasks without training. In: The Twelfth International Conference on learning representations, ICLR 2024. OpenReview.net, Vienna, Austria. https:\/\/openreview.net\/forum?id=LEYUkvdUhq. Accessed 13th Feb 2026"},{"key":"2238_CR20","unstructured":"Tang A, Shen L, Luo Y, Zhan Y, Hu H, Du B, Chen Y, Tao D (2024) Parameter-efficient multi-task model fusion with partial linearization. In: The Twelfth International Conference on learning representations, ICLR 2024. OpenReview.net, Vienna, Austria. https:\/\/openreview.net\/forum?id=iynRvVVAmH. Accessed 13th Feb 2026"},{"key":"2238_CR21","doi-asserted-by":"crossref","unstructured":"Wan F, Zhong L, Yang Z, Chen R, Quan X (2024) Fusechat: knowledge fusion of chat models. 
arXiv preprint arXiv:2408.07990","DOI":"10.18653\/v1\/2025.emnlp-main.1096"},{"key":"2238_CR22","unstructured":"Singh SP, Jaggi M (2020) Model fusion via optimal transport. In: Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (eds) Advances in neural information processing systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, Virtual. https:\/\/proceedings.neurips.cc\/paper\/2020\/hash\/fb2697869f56484404c8ceee2985b01d-Abstract.html. Accessed 13th Feb 2026"},{"key":"2238_CR23","doi-asserted-by":"crossref","unstructured":"Matena M, Raffel C (2022) Merging models with fisher-weighted averaging. In: Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A (eds) Advances in neural information processing systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28-December 9, 2022. http:\/\/papers.nips.cc\/paper_files\/paper\/2022\/hash\/70c26937fbf3d4600b69a129031b66ec-Abstract-Conference.html. Accessed 13th Feb 2026","DOI":"10.52202\/068431-1287"},{"key":"2238_CR24","unstructured":"Jin X, Ren X, Preotiuc-Pietro D, Cheng P (2023) Dataless knowledge fusion by merging weights of language models. In: The Eleventh International Conference on learning representations, ICLR 2023. OpenReview.net, Kigali, Rwanda. https:\/\/openreview.net\/forum?id=FCnohuR6AnM. Accessed 13th Feb 2026"},{"key":"2238_CR25","unstructured":"Yang E, Wang Z, Shen L, Liu S, Guo G, Wang X, Tao D (2024) Adamerging: adaptive model merging for multi-task learning. In: The Twelfth International Conference on learning representations, ICLR 2024. OpenReview.net, Vienna, Austria. https:\/\/openreview.net\/forum?id=nZP6NgD3QY. Accessed 13th Feb 2026"},{"key":"2238_CR26","unstructured":"Yu L, Yu B, Yu H, Huang F, Li Y (2024) Language models are Super Mario: absorbing abilities from homologous models as a free lunch. In: Forty-first International Conference on machine learning, ICML 2024. 
OpenReview.net, Vienna, Austria. https:\/\/openreview.net\/forum?id=fq0NaiU8Ex. Accessed 13th Feb 2026"},{"key":"2238_CR27","doi-asserted-by":"crossref","unstructured":"Du G, Lee J, Li J, Jiang R, Guo Y, Yu S, Liu H, Goh SK, Tang H, He D, Zhang M (2024) Parameter competition balancing for model merging. In: Globersons A, Mackey L, Belgrave D, Fan A, Paquet U, Tomczak JM, Zhang C (eds) Advances in neural information processing systems 38: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024, Vancouver, BC, Canada, December 10-15, 2024. http:\/\/papers.nips.cc\/paper_files\/paper\/2024\/hash\/99fc8bc48b917c301a80cb74d91c0c06-Abstract-Conference.html. Accessed 13th Feb 2026","DOI":"10.52202\/079017-2691"},{"key":"2238_CR28","unstructured":"Han Z, Gao C, Liu J, Zhang J, Zhang SQ (2024) Parameter-efficient fine-tuning for large models: a comprehensive survey. Trans Mach Learn Res. https:\/\/doi.org\/10.48550\/arXiv.2403.14608"},{"key":"2238_CR29","unstructured":"Hu EJ, Shen Y, Wallis P, Allen-Zhu Z, Li Y, Wang S, Wang L, Chen W (2022) Lora: Low-rank adaptation of large language models. In: The Tenth International Conference on learning representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net, Virtual Event. https:\/\/openreview.net\/forum?id=nZeVKeeFYf9. Accessed 13th Feb 2026"},{"key":"2238_CR30","doi-asserted-by":"crossref","unstructured":"Dettmers T, Pagnoni A, Holtzman A, Zettlemoyer L (2023) Qlora: efficient finetuning of quantized llms. In: Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S (eds) Advances in neural information processing systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10-16, 2023. http:\/\/papers.nips.cc\/paper_files\/paper\/2023\/hash\/1feb87871436031bdc0f2beaa62a049b-Abstract-Conference.html. 
Accessed 13th Feb 2026","DOI":"10.52202\/075280-0441"},{"key":"2238_CR31","doi-asserted-by":"publisher","unstructured":"Zaken EB, Goldberg Y, Ravfogel S (2022) Bitfit: simple parameter-efficient fine-tuning for transformer-based masked language-models. In: Muresan S, Nakov P, Villavicencio A (eds) Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, pp. 1\u20139. Association for Computational Linguistics, Dublin, Ireland. https:\/\/doi.org\/10.18653\/V1\/2022.ACL-SHORT.1","DOI":"10.18653\/V1\/2022.ACL-SHORT.1"},{"key":"2238_CR32","doi-asserted-by":"publisher","unstructured":"Hohman F, Kery MB, Ren D, Moritz D (2024) Model compression in practice: lessons learned from practitioners creating on-device machine learning experiences. In: Mueller FF, Kyburz P, Williamson JR, Sas C, Wilson ML, Dugas POT, Shklovski I (eds) Proceedings of the CHI Conference on Human Factors in Computing Systems, CHI 2024, pp. 645\u2013164518. ACM, Honolulu, HI, USA. https:\/\/doi.org\/10.1145\/3613904.3642109","DOI":"10.1145\/3613904.3642109"},{"key":"2238_CR33","doi-asserted-by":"crossref","unstructured":"Gholami A, Kim S, Dong Z, Yao Z, Mahoney MW, Keutzer K (2021) A survey of quantization methods for efficient neural network inference. CoRR arxiv:2103.13630","DOI":"10.1201\/9781003162810-13"},{"key":"2238_CR34","first-page":"241","volume":"22","author":"T Hoefler","year":"2021","unstructured":"Hoefler T, Alistarh D, Ben-Nun T, Dryden N, Peste A (2021) Sparsity in deep learning: pruning and growth for efficient inference and training in neural networks. J Mach Learn Res 22:241\u20131241124","journal-title":"J Mach Learn Res"},{"issue":"6","key":"2238_CR35","doi-asserted-by":"publisher","first-page":"1789","DOI":"10.1007\/s11263-021-01453-z","volume":"129","author":"J Gou","year":"2021","unstructured":"Gou J, Yu B, Maybank SJ, Tao D (2021) Knowledge distillation: a survey. 
Int J Comput Vis 129(6):1789\u20131819","journal-title":"Int J Comput Vis"},{"issue":"3","key":"2238_CR36","doi-asserted-by":"publisher","first-page":"60","DOI":"10.3390\/computers12030060","volume":"12","author":"Z Li","year":"2023","unstructured":"Li Z, Li H, Meng L (2023) Model compression for deep neural networks: a survey. Computers 12(3):60","journal-title":"Computers"},{"key":"2238_CR37","doi-asserted-by":"publisher","unstructured":"Xiong F, Cheng R, Chen W, Zhang Z, Guo Y, Yuan C, Xu R (2024) Multi-task model merging via adaptive weight disentanglement. CoRR arXiv:abs\/2411.18729, https:\/\/doi.org\/10.48550\/ARXIV.2411.18729","DOI":"10.48550\/ARXIV.2411.18729"},{"key":"2238_CR38","unstructured":"Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J, Krueger G, Sutskever I (2021) Learning transferable visual models from natural language supervision. In: Meila M, Zhang T (eds) Proceedings of the 38th International Conference on Machine Learning, ICML 2021. Proceedings of Machine Learning Research, vol. 139, pp. 8748\u20138763. PMLR, Virtual Event. http:\/\/proceedings.mlr.press\/v139\/radford21a.html. Accessed 13th Feb 2026"},{"key":"2238_CR39","doi-asserted-by":"crossref","unstructured":"Xiao J, Hays J, Ehinger KA, Oliva A, Torralba A (2010) Sun database: large-scale scene recognition from abbey to zoo. In: 2010 IEEE Computer Society Conference on computer vision and pattern recognition, pp. 3485\u20133492. IEEE","DOI":"10.1109\/CVPR.2010.5539970"},{"key":"2238_CR40","doi-asserted-by":"crossref","unstructured":"Krause J, Stark M, Deng J, Fei-Fei L (2013) 3d object representations for fine-grained categorization. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
554\u2013561","DOI":"10.1109\/ICCVW.2013.77"},{"issue":"10","key":"2238_CR41","doi-asserted-by":"publisher","first-page":"1865","DOI":"10.1109\/JPROC.2017.2675998","volume":"105","author":"G Cheng","year":"2017","unstructured":"Cheng G, Han J, Lu X (2017) Remote sensing image scene classification: benchmark and state of the art. Proc IEEE 105(10):1865\u20131883","journal-title":"Proc IEEE"},{"issue":"7","key":"2238_CR42","doi-asserted-by":"publisher","first-page":"2217","DOI":"10.1109\/JSTARS.2019.2918242","volume":"12","author":"P Helber","year":"2019","unstructured":"Helber P, Bischke B, Dengel A, Borth D (2019) Eurosat: a novel dataset and deep learning benchmark for land use and land cover classification. IEEE J Sel Top Appl Earth Observ Remote Sens 12(7):2217\u20132226","journal-title":"IEEE J Sel Top Appl Earth Observ Remote Sens"},{"key":"2238_CR43","unstructured":"Netzer Y, Wang T, Coates A, Bissacco A, Wu B, Ng AY et al (2011) Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning, vol. 2011, p. 4. Granada"},{"key":"2238_CR44","doi-asserted-by":"publisher","first-page":"323","DOI":"10.1016\/j.neunet.2012.02.016","volume":"32","author":"J Stallkamp","year":"2012","unstructured":"Stallkamp J, Schlipsing M, Salmen J, Igel C (2012) Man vs. computer: benchmarking machine learning algorithms for traffic sign recognition. Neural Netw 32:323\u2013332","journal-title":"Neural Netw"},{"issue":"11","key":"2238_CR45","doi-asserted-by":"publisher","first-page":"2278","DOI":"10.1109\/5.726791","volume":"86","author":"Y LeCun","year":"1998","unstructured":"LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278\u20132324","journal-title":"Proc IEEE"},{"key":"2238_CR46","doi-asserted-by":"crossref","unstructured":"Cimpoi M, Maji S, Kokkinos I, Mohamed S, Vedaldi A (2014) Describing textures in the wild. 
In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp. 3606\u20133613","DOI":"10.1109\/CVPR.2014.461"},{"key":"2238_CR47","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.5143773","author":"G Ilharco","year":"2021","unstructured":"Ilharco G, Wortsman M, Wightman R, Gordon C, Carlini N, Taori R, Dave A, Shankar V, Namkoong H, Miller J, Hajishirzi H, Farhadi A, Schmidt L (2021). OpenCLIP Zenodo. https:\/\/doi.org\/10.5281\/zenodo.5143773","journal-title":"OpenCLIP Zenodo"},{"key":"2238_CR48","doi-asserted-by":"crossref","unstructured":"Nilsback M-E, Zisserman A (2008) Automated flower classification over a large number of classes. In: 2008 Sixth Indian Conference on computer vision, graphics & image processing, pp. 722\u2013729. IEEE","DOI":"10.1109\/ICVGIP.2008.47"},{"key":"2238_CR49","doi-asserted-by":"crossref","unstructured":"Veeling BS, Linmans J, Winkens J, Cohen T, Welling M (2018) Rotation equivariant cnns for digital pathology. In: Medical Image Computing and Computer Assisted Intervention\u2013MICCAI 2018: 21st International Conference, Granada, Spain, September 16-20, 2018, Proceedings, Part II 11, pp. 210\u2013218. Springer","DOI":"10.1007\/978-3-030-00934-2_24"},{"key":"2238_CR50","doi-asserted-by":"crossref","unstructured":"Goodfellow IJ, Erhan D, Carrier PL, Courville A, Mirza M, Hamner B, Cukierski W, Tang Y, Thaler D, Lee D-H, et al (2013) Challenges in representation learning: A report on three machine learning contests. In: Neural Information Processing: 20th International Conference, ICONIP 2013, Daegu, Korea, November 3-7, 2013. Proceedings, Part III 20, pp. 117\u2013124. Springer","DOI":"10.1007\/978-3-642-42051-1_16"},{"key":"2238_CR51","doi-asserted-by":"crossref","unstructured":"Parkhi OM, Vedaldi A, Zisserman A, Jawahar C (2012) Cats and dogs. In: 2012 IEEE Conference on computer vision and pattern recognition, pp. 3498\u20133505. 
IEEE","DOI":"10.1109\/CVPR.2012.6248092"},{"key":"2238_CR52","unstructured":"Coates A, Ng A, Lee H (2011) An analysis of single-layer networks in unsupervised feature learning. In: Proceedings of the Fourteenth International Conference on artificial intelligence and statistics, pp. 215\u2013223. JMLR Workshop and Conference Proceedings"},{"key":"2238_CR53","unstructured":"Krizhevsky A, Hinton G, et al (2009) Learning multiple layers of features from tiny images"},{"key":"2238_CR54","doi-asserted-by":"crossref","unstructured":"Bossard L, Guillaumin M, Van\u00a0Gool L (2014) Food-101\u2013mining discriminative components with random forests. In: Computer vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part VI 13, pp. 446\u2013461. Springer","DOI":"10.1007\/978-3-319-10599-4_29"},{"key":"2238_CR55","unstructured":"Xiao H, Rasul K, Vollgraf R (2017) Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747"},{"key":"2238_CR56","doi-asserted-by":"crossref","unstructured":"Cohen G, Afshar S, Tapson J, Van\u00a0Schaik A (2017) Emnist: extending mnist to handwritten letters. In: 2017 International Joint Conference on Neural Networks (IJCNN), pp. 2921\u20132926. IEEE","DOI":"10.1109\/IJCNN.2017.7966217"},{"key":"2238_CR57","unstructured":"Clanuwat T, Bober-Irizar M, Kitamoto A, Lamb A, Yamamoto K, Ha D (2018) Deep learning for classical Japanese literature. arXiv preprint arXiv:1812.01718"},{"key":"2238_CR58","doi-asserted-by":"crossref","unstructured":"Socher R, Perelygin A, Wu J, Chuang J, Manning CD, Ng AY, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the 2013 Conference on empirical methods in natural language processing, pp. 
1631\u20131642","DOI":"10.18653\/v1\/D13-1170"},{"key":"2238_CR59","unstructured":"Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J, et al (2021) Learning transferable visual models from natural language supervision. In: International Conference on machine learning, pp. 8748\u20138763. PMLR"},{"key":"2238_CR60","unstructured":"Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) Roberta: a robustly optimized BERT pretraining approach. CoRR arxiv:1907.11692"},{"key":"2238_CR61","doi-asserted-by":"publisher","first-page":"625","DOI":"10.1162\/TACL_A_00290","volume":"7","author":"A Warstadt","year":"2019","unstructured":"Warstadt A, Singh A, Bowman SR (2019) Neural network acceptability judgments. Trans Assoc Comput Linguist 7:625\u2013641. https:\/\/doi.org\/10.1162\/TACL_A_00290","journal-title":"Trans Assoc Comput Linguist"},{"key":"2238_CR62","doi-asserted-by":"crossref","unstructured":"Socher R, Perelygin A, Wu J, Chuang J, Manning CD, Ng AY, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013, 18-21 October 2013, A Meeting of SIGDAT, a Special Interest Group of The ACL, pp. 1631\u20131642. ACL, Seattle, Washington, USA. https:\/\/aclanthology.org\/D13-1170\/. Accessed 13th Feb 2026","DOI":"10.18653\/v1\/D13-1170"},{"key":"2238_CR63","unstructured":"Dolan WB, Brockett C (2005) Automatically constructing a corpus of sentential paraphrases. In: Proceedings of the Third International Workshop on Paraphrasing, IWP@IJCNLP 2005. Asian Federation of Natural Language Processing, Jeju Island, Korea. https:\/\/aclanthology.org\/I05-5002\/. 
Accessed 13th Feb 2026"},{"key":"2238_CR64","doi-asserted-by":"crossref","unstructured":"Cer DM, Diab MT, Agirre E, Lopez-Gazpio I, Specia L (2017) Semeval-2017 task 1: semantic textual similarity-multilingual and cross-lingual focused evaluation. arxiv:1708.00055","DOI":"10.18653\/v1\/S17-2001"},{"key":"2238_CR65","unstructured":"Iyer S, Dandekar N, Csernai K, et al (2017) First quora dataset release: Question pairs. data.quora.com"},{"key":"2238_CR66","doi-asserted-by":"publisher","unstructured":"Williams A, Nangia N, Bowman SR (2018) A broad-coverage challenge corpus for sentence understanding through inference. In: Walker MA, Ji H, Stent A (eds) Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018, Volume 1 (Long Papers), pp. 1112\u20131122. Association for Computational Linguistics, New Orleans, Louisiana, USA. https:\/\/doi.org\/10.18653\/V1\/N18-1101","DOI":"10.18653\/V1\/N18-1101"},{"key":"2238_CR67","doi-asserted-by":"publisher","unstructured":"Rajpurkar P, Zhang J, Lopyrev K, Liang P (2016) Squad: 100, 000+ questions for machine comprehension of text. In: Su J, Carreras X, Duh K (eds) Proceedings of the 2016 Conference on empirical methods in natural language processing, EMNLP 2016, pp. 2383\u20132392. The Association for Computational Linguistics, Austin, Texas, USA. https:\/\/doi.org\/10.18653\/V1\/D16-1264","DOI":"10.18653\/V1\/D16-1264"},{"key":"2238_CR68","doi-asserted-by":"crossref","unstructured":"Giampiccolo D, Magnini B, Dagan I, Dolan B (2007) The third PASCAL recognizing textual entailment challenge. In: Sekine S, Inui K, Dagan I, Dolan B, Giampiccolo D, Magnini B (eds) Proceedings of the ACL-PASCAL@ACL 2007 Workshop on Textual Entailment and Paraphrasing, pp. 1\u20139. Association for Computational Linguistics, Prague, Czech Republic. https:\/\/aclanthology.org\/W07-1401\/. 
Accessed 13th Feb 2026","DOI":"10.3115\/1654536.1654538"},{"key":"2238_CR69","doi-asserted-by":"publisher","unstructured":"Silberman N, Hoiem D, Kohli P, Fergus R (2012) Indoor segmentation and support inference from RGBD images. In: Fitzgibbon AW, Lazebnik S, Perona P, Sato Y, Schmid C (eds) Computer Vision-ECCV 2012-12th European Conference on Computer Vision, Proceedings, Part V. Lecture Notes in Computer Science, vol. 7576, pp. 746\u2013760. Springer, Florence, Italy. https:\/\/doi.org\/10.1007\/978-3-642-33715-4_54","DOI":"10.1007\/978-3-642-33715-4_54"},{"key":"2238_CR70","doi-asserted-by":"publisher","unstructured":"Tang A, Shen L, Luo Y, Hu H, Du B, Tao D (2024) Fusionbench: a comprehensive benchmark of deep model fusion. CoRR arXiv:abs\/2406.03280, https:\/\/doi.org\/10.48550\/ARXIV.2406.03280","DOI":"10.48550\/ARXIV.2406.03280"},{"key":"2238_CR71","doi-asserted-by":"crossref","unstructured":"Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Conference on computer vision and pattern recognition, pp. 248\u2013255. IEEE","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"2238_CR72","doi-asserted-by":"publisher","DOI":"10.1016\/J.NEUNET.2024.106796","volume":"181","author":"X Liu","year":"2025","unstructured":"Liu X, Bai Y, Lu Y, Soltoggio A, Kolouri S (2025) Wasserstein task embedding for measuring task similarities. Neural Netw 181:106796. https:\/\/doi.org\/10.1016\/J.NEUNET.2024.106796","journal-title":"Neural Netw"},{"issue":"12","key":"2238_CR73","doi-asserted-by":"publisher","first-page":"5586","DOI":"10.1109\/TKDE.2021.3070203","volume":"34","author":"Y Zhang","year":"2021","unstructured":"Zhang Y, Yang Q (2021) A survey on multi-task learning. 
IEEE Trans Knowl Data Eng 34(12):5586\u20135609","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"3","key":"2238_CR74","doi-asserted-by":"publisher","first-page":"50","DOI":"10.1109\/MSP.2020.2975749","volume":"37","author":"T Li","year":"2020","unstructured":"Li T, Sahu AK, Talwalkar A, Smith V (2020) Federated learning: challenges, methods, and future directions. IEEE Signal Process Mag 37(3):50\u201360","journal-title":"IEEE Signal Process Mag"},{"key":"2238_CR75","unstructured":"McMahan B, Moore E, Ramage D, Hampson S, Arcas BA (2017) Communication-efficient learning of deep networks from decentralized data. In: Singh A, Zhu XJ (eds) Proceedings of the 20th International Conference on artificial intelligence and statistics, AISTATS 2017, 20-22 April 2017, Fort Lauderdale, FL, USA. Proceedings of Machine Learning Research, vol. 54, pp. 1273\u20131282. PMLR, Fort Lauderdale, FL. http:\/\/proceedings.mlr.press\/v54\/mcmahan17a.html. Accessed 13th Feb 2026"},{"key":"2238_CR76","doi-asserted-by":"publisher","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE Conference on computer vision and pattern recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pp. 770\u2013778. IEEE Computer Society, Las Vegas. 
https:\/\/doi.org\/10.1109\/CVPR.2016.90","DOI":"10.1109\/CVPR.2016.90"}],"container-title":["Complex & Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-026-02238-y","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-026-02238-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-026-02238-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T07:56:54Z","timestamp":1774425414000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-026-02238-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,21]]},"references-count":76,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,3]]}},"alternative-id":["2238"],"URL":"https:\/\/doi.org\/10.1007\/s40747-026-02238-y","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,21]]},"assertion":[{"value":"30 June 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 January 2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 February 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"114"}}