{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T18:01:40Z","timestamp":1776448900149,"version":"3.51.2"},"reference-count":46,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2024,12,2]],"date-time":"2024-12-02T00:00:00Z","timestamp":1733097600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Ministry of Economy, Innovation, Digitization, and  Energy of the State of North Rhine-Westphalia within the projects Digital.Zirkul\u00e4r.Ruhr and Circular Performer Emscher Lippe"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>Research and applications in artificial intelligence have recently shifted with the rise of large pretrained models, which deliver state-of-the-art results across numerous tasks. However, the substantial increase in parameters introduces a need for parameter-efficient training strategies. Despite significant advancements, limited research has explored parameter-efficient fine-tuning (PEFT) methods in the context of transformer-based models for instance segmentation. Addressing this gap, this study investigates the effectiveness of PEFT methods, specifically adapters and Low-Rank Adaptation (LoRA), applied to two models across four benchmark datasets. Integrating sequentially arranged adapter modules and applying LoRA to deformable attention\u2014explored here for the first time\u2014achieves competitive performance while fine-tuning only about 1\u20136% of model parameters, a marked improvement over the 40\u201355% required in traditional fine-tuning. Key findings indicate that using 2\u20133 adapters per transformer block offers an optimal balance of performance and efficiency. 
Furthermore, LoRA exhibits strong parameter efficiency when applied to deformable attention, and in certain cases surpasses adapter configurations. These results show that the impact of PEFT techniques varies based on dataset complexity and model architecture, underscoring the importance of context-specific tuning. Overall, this work demonstrates the potential of PEFT to enable scalable, customizable, and computationally efficient transfer learning for instance segmentation tasks.<\/jats:p>","DOI":"10.3390\/make6040133","type":"journal-article","created":{"date-parts":[[2024,12,5]],"date-time":"2024-12-05T09:30:18Z","timestamp":1733391018000},"page":"2783-2807","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Parameter-Efficient Fine-Tuning of Large Pretrained Models for Instance Segmentation Tasks"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9683-5920","authenticated-orcid":false,"given":"Nermeen","family":"Abou Baker","sequence":"first","affiliation":[{"name":"Computer Science Department, Ruhr West University of Applied Sciences, L\u00fctzowstra\u00dfe 5, 46236 Bottrop, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-4693-3797","authenticated-orcid":false,"given":"David","family":"Rohrschneider","sequence":"additional","affiliation":[{"name":"Computer Science Department, Ruhr West University of Applied Sciences, L\u00fctzowstra\u00dfe 5, 46236 Bottrop, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1230-9446","authenticated-orcid":false,"given":"Uwe","family":"Handmann","sequence":"additional","affiliation":[{"name":"Computer Science Department, Ruhr West University of Applied Sciences, L\u00fctzowstra\u00dfe 5, 46236 Bottrop, Germany"}]}],"member":"1968","published-online":{"date-parts":[[2024,12,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., 
Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., and Lo, W.Y. (2023). Segment Anything. arXiv.","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Rohrschneider, D., Abou Baker, N., and Handmann, U. (2023, January 19\u201321). Double Transfer Learning to Detect Lithium-Ion Batteries on X-Ray Images. Proceedings of the 17th International Work-Conference on Artificial Neural Networks (IWANN), Ponta Delgada, Portugal.","DOI":"10.1007\/978-3-031-43085-5_14"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Qiu, Y., and Jin, Y. (2024). ChatGPT and Finetuned BERT: A Comparative Study for Developing Intelligent Design Support Systems. Intell. Syst. Appl., 21.","DOI":"10.1016\/j.iswa.2023.200308"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Ebrahim, F., and Joy, M. (2024, January 20). Few-Shot Issue Report Classification with Adapters. Proceedings of the International Workshop on NL-Based Software Engineering, Lisbon, Portugal.","DOI":"10.1145\/3643787.3648039"},{"key":"ref_5","unstructured":"Dettmers, T., Pagnoni, A., Holtzman, A., and Zettlemoyer, L. (2024, January 10\u201315). QLoRA: Efficient Finetuning of Quantized LLMs. Proceedings of the 37th International Conference on Neural Information Processing Systems (NeurIPS), New Orleans, LA, USA."},{"key":"ref_6","unstructured":"Zhang, R., Han, J., Liu, C., Zhou, A., Lu, P., Qiao, Y., Li, H., and Gao, P. (2024, January 7\u201311). LLaMA-Adapter: Efficient Fine-tuning of Large Language Models with Zero-initialized Attention. Proceedings of the 12th International Conference on Learning Representations (ICLR), Vienna, Austria."},{"key":"ref_7","unstructured":"Stickland, A.C., Berard, A., and Nikoulina, V. (2021, January 10\u201311). Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters. 
Proceedings of the 6th Conference on Machine Translation, Punta Cana, Dominican Republic."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Bapna, A., and Firat, O. (2019, January 3\u20137). Simple, Scalable Adaptation for Neural Machine Translation. Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China.","DOI":"10.18653\/v1\/D19-1165"},{"key":"ref_9","unstructured":"Hu, E.J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., and Chen, W. (2022, January 25\u201329). LoRA: Low-Rank Adaptation of Large Language Models. Proceedings of the 10th International Conference on Learning Representations (ICLR), Online."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Li, X.L., and Liang, P. (2021, January 1\u20136). Prefix-Tuning: Optimizing Continuous Prompts for Generation. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Bangkok, Thailand.","DOI":"10.18653\/v1\/2021.acl-long.353"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Chen, G., Liu, F., Meng, Z., and Liang, S. (2022, January 7\u201311). Revisiting Parameter-Efficient Tuning: Are We Really There Yet?. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Abu Dhabi, United Arab Emirates.","DOI":"10.18653\/v1\/2022.emnlp-main.168"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Lester, B., Al-Rfou, R., and Constant, N. (2021, January 7\u201311). The Power of Scale for Parameter-Efficient Prompt Tuning. 
Proceedings of the Conference on Empirical Methods in Natural Language Processing, Punta Cana, Dominican Republic.","DOI":"10.18653\/v1\/2021.emnlp-main.243"},{"key":"ref_13","unstructured":"Houlsby, N., Giurgiu, A., Jastrzebski, S., Morrone, B., De Laroussilhe, Q., Gesmundo, A., Attariyan, M., and Gelly, S. (2019, January 9\u201315). Parameter-Efficient Transfer Learning for NLP. Proceedings of the 36th International Conference on Machine Learning (ICML), Long Beach, CA, USA."},{"key":"ref_14","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2021, January 3\u20137). An Image is Worth 16\u00d716 Words: Transformers for Image Recognition at Scale. Proceedings of the 9th International Conference on Learning Representations (ICLR), Online."},{"key":"ref_15","unstructured":"Chen, Z., Duan, Y., Wang, W., He, J., Lu, T., Dai, J., and Qiao, Y. (2023, January 1\u20135). Vision Transformer Adapter for Dense Predictions. Proceedings of the 11th International Conference on Learning Representations (ICLR), Online."},{"key":"ref_16","unstructured":"Abou Baker, N., and Handmann, U. (2023, January 4\u20136). Don\u2019t Waste SAM. Proceedings of the 31st European Symposium on Artificial Neural Networks (ESANN), Bruges, Belgium."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Chen, T., Zhu, L., Ding, C., Cao, R., Wang, Y., Zhang, S., Li, Z., Sun, L., Zang, Y., and Mao, P. (2023, January 2\u20133). SAM-Adapter: Adapting Segment Anything in Underperformed Scenes. Proceedings of the International Conference on Computer Vision Workshops (ICCVW), Paris, France.","DOI":"10.1109\/ICCVW60793.2023.00361"},{"key":"ref_18","unstructured":"Wu, J., Ji, W., Liu, Y., Fu, H., Xu, M., Xu, Y., and Jin, Y. (2023). Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation. 
arXiv."},{"key":"ref_19","unstructured":"Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., and Clark, J. (2021, January 18\u201324). Learning Transferable Visual Models From Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning (ICML), Online."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"581","DOI":"10.1007\/s11263-023-01891-x","article-title":"CLIP-Adapter: Better Vision-Language Models with Feature Adapters","volume":"132","author":"Gao","year":"2024","journal-title":"Int. J. Comput. Vis."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Caron, M., Touvron, H., Misra, I., J\u00e9gou, H., Mairal, J., Bojanowski, P., and Joulin, A. (2021, January 11\u201317). Emerging Properties in Self-Supervised Vision Transformers. Proceedings of the International Conference on Computer Vision (ICCV), Online.","DOI":"10.1109\/ICCV48922.2021.00951"},{"key":"ref_22","unstructured":"Oquab, M., Darcet, T., Moutakanni, T., Vo, H.V., Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F., and El-Nouby, A. (2024). DINOv2: Learning Robust Visual Features without Supervision. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhang, B., Chen, Y., Bai, L., Zhao, Y., Sun, Y., Yuan, Y., Zhang, J., and Ren, H. (2024). Learning to Adapt Foundation Model DINOv2 for Capsule Endoscopy Diagnosis. arXiv.","DOI":"10.1016\/j.procs.2024.11.024"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1013","DOI":"10.1007\/s11548-024-03083-5","article-title":"Surgical-DINO: Adapter Learning of Foundation Models for Depth Estimation in Endoscopic Surgery","volume":"19","author":"Cui","year":"2024","journal-title":"Int. J. Comput. Assist. Radiol. 
Surg."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"277","DOI":"10.3103\/S1060992X2306005X","article-title":"Low Rank Adaptation for Stable Domain Adaptation of Vision Transformers","volume":"32","author":"Filatov","year":"2023","journal-title":"Opt. Mem. Neural Netw."},{"key":"ref_26","unstructured":"Chavan, A., Liu, Z., Gupta, D., Xing, E., and Shen, Z. (2023). One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Chen, X., Liu, J., Wang, Y., Wang, P.P., Brand, M., Wang, G., and Koike-Akino, T. (2024). SuperLoRA: Parameter-Efficient Unified Adaptation of Multi-Layer Attention Modules. arXiv.","DOI":"10.1109\/CVPRW63382.2024.00804"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Lei\u00f1ena, J., Saiz, F.A., and Barandiaran, I. (2024). Latent Diffusion Models to Enhance the Performance of Visual Defect Segmentation Networks in Steel Surface Inspection. Sensors, 24.","DOI":"10.3390\/s24186016"},{"key":"ref_29","unstructured":"Zou, X., Yang, J., Zhang, H., Li, F., Li, L., Wang, J., Wang, L., Gao, J., and Lee, Y.J. (2024, January 10\u201315). Segment Everything Everywhere All at Once. Proceedings of the 37th International Conference on Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Li, F., Zhang, H., Xu, H., Liu, S., Zhang, L., Ni, L.M., and Shum, H.Y. (2023, January 18\u201322). Mask DINO: Towards a Unified Transformer-Based Framework for Object Detection and Segmentation. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00297"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1109\/JAS.2021.1004210","article-title":"Scribble-Supervised Video Object Segmentation","volume":"9","author":"Huang","year":"2022","journal-title":"IEEE\/CAA J. Autom. 
Sin."},{"key":"ref_32","unstructured":"Ravi, N., Gabeur, V., Hu, Y.T., Hu, R., Ryali, C., Ma, T., Khedr, H., R\u00e4dle, R., Rolland, C., and Gustafson, L. (2024). SAM 2: Segment Anything in Images and Videos. arXiv."},{"key":"ref_33","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, June 26\u2013July 1). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. Proceedings of the International Conference on Computer Vision (ICCV), Online.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_35","unstructured":"Cheng, B., Schwing, A., and Kirillov, A. (2021, January 6\u201314). Per-Pixel Classification is Not All You Need for Semantic Segmentation. Proceedings of the 35th International Conference on Neural Information Processing Systems (NeurIPS), Online."},{"key":"ref_36","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2021, January 3\u20137). Deformable DETR: Deformable Transformers for End-to-End Object Detection. Proceedings of the 9th International Conference on Learning Representations (ICLR), Online."},{"key":"ref_37","unstructured":"Yang, J., Li, C., Dai, X., and Gao, J. (2022, November 28\u2013December 9). Focal Modulation Networks. Proceedings of the 36th International Conference on Neural Information Processing Systems (NeurIPS), New Orleans, LA, USA."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. 
Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Cheng, B., Misra, I., Schwing, A.G., Kirillov, A., and Girdhar, R. (2022, January 19\u201324). Masked-attention Mask Transformer for Universal Image Segmentation. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00135"},{"key":"ref_40","unstructured":"Lin, B. (2024, November 14). LoRA-Torch: PyTorch Reimplementation of LoRA. Available online: https:\/\/github.com\/Baijiong-Lin\/LoRA-Torch."},{"key":"ref_41","unstructured":"Trotter, C., Atkinson, G., Sharpe, M., Richardson, K., McGough, A.S., Wright, N., Burville, B., and Berggren, P. (2020). NDD20: A large-Scale Few-shot Dolphin Dataset for Coarse and Fine-grained Categorisation. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Bashkirova, D., Abdelfattah, M., Zhu, Z., Akl, J., Alladkani, F., Hu, P., Ablavsky, V., Calli, B., Bargal, S.A., and Saenko, K. (2022, January 22\u201324). ZeroWaste Dataset: Towards Deformable Object Segmentation in Cluttered Scenes. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.02047"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Qiu, L., Xiong, Z., Wang, X., Liu, K., Li, Y., Chen, G., Han, X., and Cui, S. (2022, January 22\u201324). ETHSeg: An Amodel Instance Segmentation Network and a Real-world Dataset for X-Ray Waste Inspection. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00232"},{"key":"ref_44","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (2016, June 26\u2013July 1). 
The Cityscapes Dataset for Semantic Urban Scene Understanding. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Zhang, H., Li, F., Zou, X., Liu, S., Li, C., Yang, J., and Zhang, L. (2023, January 2\u20136). A Simple Framework for Open-Vocabulary Segmentation and Detection. Proceedings of the International Conference on Computer Vision (ICCV), Paris, France.","DOI":"10.1109\/ICCV51070.2023.00100"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Jain, J., Li, J., Chiu, M.T., Hassani, A., Orlov, N., and Shi, H. (2023, January 18\u201322). OneFormer: One Transformer to Rule Universal Image Segmentation. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00292"}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/4\/133\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:45:08Z","timestamp":1760114708000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/4\/133"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,2]]},"references-count":46,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["make6040133"],"URL":"https:\/\/doi.org\/10.3390\/make6040133","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,2]]}}}