{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,12]],"date-time":"2026-05-12T14:17:15Z","timestamp":1778595435655,"version":"3.51.4"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T00:00:00Z","timestamp":1707436800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T00:00:00Z","timestamp":1707436800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002386","name":"Cairo University","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100002386","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Pre-trained BERT models have demonstrated exceptional performance in the context of text classification tasks. Certain problem domains necessitate data distribution without data sharing. Federated Learning (FL) allows multiple clients to collectively train a global model by sharing learned models rather than raw data. However, the adoption of BERT, a large model, within a Federated Learning framework incurs substantial communication costs. To address this challenge, we propose a novel framework, FedFreezeBERT, for BERT-based text classification. FedFreezeBERT works by adding an aggregation architecture on top of BERT to obtain better sentence embedding for classification while freezing BERT parameters. Keeping the model parameters frozen, FedFreezeBERT reduces the communication costs by a large factor compared to other state-of-the-art methods. FedFreezeBERT is implemented in a distributed version where the aggregation architecture only is being transferred and aggregated by FL algorithms such as FedAvg or FedProx. FedFreezeBERT is also implemented in a centralized version where the data embeddings extracted by BERT are sent to the central server to train the aggregation architecture. The experiments show that FedFreezeBERT achieves new state-of-the-art performance on Arabic sentiment analysis on the ArSarcasm-v2 dataset with a 12.9% and 1.2% improvement over FedAvg\/FedProx and the previous SOTA respectively. FedFreezeBERT also reduces the communication cost by 5<jats:inline-formula><jats:alternatives><jats:tex-math>$$\\times$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mo>\u00d7<\/mml:mo>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> compared to the previous SOTA.<\/jats:p>","DOI":"10.1186\/s40537-024-00885-x","type":"journal-article","created":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T16:02:13Z","timestamp":1707494533000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["Federated Freeze BERT for text classification"],"prefix":"10.1186","volume":"11","author":[{"given":"Omar","family":"Galal","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ahmed H.","family":"Abdel-Gawad","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mona","family":"Farouk","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,2,9]]},"reference":[{"key":"885_CR1","doi-asserted-by":"crossref","unstructured":"Abdul-Mageed M, Elmadany A, Nagoudi EMB. Arbert & marbert: deep bidirectional transformers for arabic. arXiv preprint. 2020. arXiv:2101.01785.","DOI":"10.18653\/v1\/2021.acl-long.551"},{"key":"885_CR2","unstructured":"Abu\u00a0Farha I, Zaghouani W, Magdy W (2021) Overview of the WANLP 2021 shared task on sarcasm and sentiment detection in Arabic. In: Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Kyiv, Ukraine (Virtual). 2021. pp. 296\u2013305, https:\/\/aclanthology.org\/2021.wanlp-1.36."},{"key":"885_CR3","unstructured":"Acar DAE, Zhao Y, Navarro RM, et\u00a0al. Federated learning based on dynamic regularization. arXiv preprint. 2021. arXiv:2111.04263."},{"key":"885_CR4","doi-asserted-by":"crossref","unstructured":"Bisong E, Bisong E. Google Collaboratory. Building machine learning and deep learning models on google cloud platform: a comprehensive guide for beginners. 2019. pp. 59\u201364.","DOI":"10.1007\/978-1-4842-4470-8_7"},{"key":"885_CR5","first-page":"1877","volume":"33","author":"T Brown","year":"2020","unstructured":"Brown T, Mann B, Ryder N, et al. Language models are few-shot learners. Adv Neural Inform Process Syst. 2020;33:1877\u2013901.","journal-title":"Adv Neural Inform Process Syst"},{"key":"885_CR6","unstructured":"Devlin J, Chang MW, Lee K, et\u00a0al. Bert: Pre-training of deep bidirectional transformers for language understanding. 2018. arXiv preprint arXiv:1810.04805."},{"key":"885_CR7","doi-asserted-by":"publisher","first-page":"154290","DOI":"10.1109\/ACCESS.2019.2946594","volume":"7","author":"Z Gao","year":"2019","unstructured":"Gao Z, Feng A, Song X, et al. Target-dependent sentiment classification with bert. IEEE Access. 2019;7:154290\u20139.","journal-title":"IEEE Access"},{"key":"885_CR8","doi-asserted-by":"crossref","unstructured":"He A, Wang J, Huang Z, et\u00a0al. Fedsmart: An auto updating federated learning optimization mechanism. In: Web and Big Data: 4th International Joint Conference, APWeb-WAIM 2020, Tianjin, China, September 18-20, 2020, Proceedings, Part I 4. Cham: Springer; 2020. pp. 716\u2013724.","DOI":"10.1007\/978-3-030-60259-8_52"},{"key":"885_CR9","first-page":"14068","volume":"33","author":"C He","year":"2020","unstructured":"He C, Annavaram M, Avestimehr S. Group knowledge transfer: federated learning of large cnns at the edge. Adv Neural Inform Process Syst. 2020;33:14068\u201380.","journal-title":"Adv Neural Inform Process Syst"},{"key":"885_CR10","unstructured":"He C, Li S, So J, et\u00a0al. Fedml: A research library and benchmark for federated machine learning. arXiv preprint. 2020. arXiv:2007.13518."},{"key":"885_CR11","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, et\u00a0al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. pp. 770\u2013778.","DOI":"10.1109\/CVPR.2016.90"},{"key":"885_CR12","doi-asserted-by":"crossref","unstructured":"Hilmkil A, Callh S, Barbieri M, et\u00a0al (2021) Scaling federated learning for fine-tuning of large language models. In: Natural Language Processing and Information Systems: 26th International Conference on Applications of Natural Language to Information Systems, NLDB 2021, Saarbr\u00fccken, Germany, June 23\u201325, 2021, Proceedings, Springer; 2021. pp. 15\u201323.","DOI":"10.1007\/978-3-030-80599-9_2"},{"key":"885_CR13","unstructured":"Karimi A, Rossi L, Prati A. Improving bert performance for aspect-based sentiment analysis. arXiv preprint. 2020. arXiv:2010.11731."},{"key":"885_CR14","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1007\/978-1-4842-2766-4_8","volume-title":"Deep learning with python: a hands-on introduction","author":"N Ketkar","year":"2017","unstructured":"Ketkar N, Ketkar N. Stochastic gradient descent. In: Ketkar N, editor. Deep learning with python: a hands-on introduction. Berkeley: Apress; 2017. p. 113\u201332."},{"key":"885_CR15","unstructured":"Kingma DP, Ba J. Adam: a method for stochastic optimization. arXiv preprint. 2014. arXiv:1412.6980."},{"key":"885_CR16","doi-asserted-by":"crossref","unstructured":"Lai G, Xie Q, Liu H, et\u00a0al. Race: Large-scale reading comprehension dataset from examinations. arXiv preprint. 2017. arXiv:1704.04683.","DOI":"10.18653\/v1\/D17-1082"},{"key":"885_CR17","unstructured":"Lan Z, Chen M, Goodman S, et\u00a0al. Albert: A lite bert for self-supervised learning of language representations. arXiv preprint. 2019. arXiv:1909.11942."},{"key":"885_CR18","first-page":"429","volume":"2","author":"T Li","year":"2020","unstructured":"Li T, Sahu AK, Zaheer M, et al. Federated optimization in heterogeneous networks. Proc Mach Learn Syst. 2020;2:429\u201350.","journal-title":"Proc Mach Learn Syst"},{"key":"885_CR19","doi-asserted-by":"crossref","unstructured":"Lin BY, He C, Zeng Z, et\u00a0al. Fednlp: a research platform for federated learning in natural language processing. arXiv preprint. 2021. arXiv:2104.08815.","DOI":"10.18653\/v1\/2022.findings-naacl.13"},{"key":"885_CR20","doi-asserted-by":"crossref","unstructured":"Lit Z, Sit S, Wang J, et\u00a0al. Federated split bert for heterogeneous text classification. In: 2022 International joint conference on neural networks (IJCNN), IEEE; 2022. pp 1\u20138.","DOI":"10.1109\/IJCNN55064.2022.9892845"},{"key":"885_CR21","volume-title":"Twitter API: up and running: learn how to build applications with the Twitter API","author":"K Makice","year":"2009","unstructured":"Makice K. Twitter API: up and running: learn how to build applications with the Twitter API. Sebastopol: O\u2019 Reilly Media Inc; 2009."},{"key":"885_CR22","unstructured":"McMahan B, Moore E, Ramage D, et\u00a0al. Communication-efficient learning of deep networks from decentralized data. In: Artificial intelligence and statistics, PMLR; 2017. pp. 1273\u20131282."},{"key":"885_CR23","doi-asserted-by":"crossref","unstructured":"Melekhov I, Kannala J, Rahtu E. Siamese network features for image matching. In: 2016 23rd international conference on pattern recognition (ICPR), IEEE; 2016. pp. 378\u2013383","DOI":"10.1109\/ICPR.2016.7899663"},{"key":"885_CR24","doi-asserted-by":"publisher","unstructured":"Galal O, Abdel-Gawad AH, Farouk M. Rethinking of bert sentence embedding for text classification. Research Square preprint. 2024. https:\/\/doi.org\/10.21203\/rs.3.rs-3920665\/v1.","DOI":"10.21203\/rs.3.rs-3920665\/v1"},{"key":"885_CR25","unstructured":"Paszke A, Gross S, Chintala S, et al. Pytorch Computer software Version. 2016;03:1."},{"key":"885_CR26","unstructured":"Reddi S, Charles Z, Zaheer M, et\u00a0al. Adaptive federated optimization. arXiv preprint. 2020.arXiv:2003.00295."},{"key":"885_CR27","doi-asserted-by":"crossref","unstructured":"Reimers N, Gurevych I. Sentence-bert: Sentence embeddings using siamese bert-networks. 2019. arXiv preprint arXiv:1908.10084.","DOI":"10.18653\/v1\/D19-1410"},{"key":"885_CR28","unstructured":"Sanh V, Debut L, Chaumond J, et\u00a0al. Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint. arXiv:1910.01108."},{"key":"885_CR29","unstructured":"Socher R, Perelygin A, Wu J, et\u00a0al. Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the 2013 conference on empirical methods in natural language processing. 2013. pp. 1631\u20131642."},{"key":"885_CR30","unstructured":"Vaswani A, Shazeer N, Parmar N, et\u00a0al (2017) Attention is all you need. Adv Neural Inform Process Syst. 2017; 30."},{"key":"885_CR31","first-page":"7611","volume-title":"Advances in neural information processing systems","author":"J Wang","year":"2020","unstructured":"Wang J, Liu Q, Liang H, et al. Tackling the objective inconsistency problem in heterogeneous federated optimization. In: Larochelle H, Ranzato M, Hadsell R, et al., editors. Advances in neural information processing systems, vol. 33. Red Hook: Curran Associates Inc; 2020. p. 7611\u201323."},{"key":"885_CR32","doi-asserted-by":"crossref","unstructured":"Wang Y, Huang M, Zhu X, et\u00a0al. Attention-based lstm for aspect-level sentiment classification. In: Proceedings of the 2016 conference on empirical methods in natural language processing. 2016. pp. 606\u2013615.","DOI":"10.18653\/v1\/D16-1058"},{"key":"885_CR33","doi-asserted-by":"crossref","unstructured":"Williams A, Nangia N, Bowman SR. A broad-coverage challenge corpus for sentence understanding through inference. arXiv preprint. 2017. arXiv:1704.05426.","DOI":"10.18653\/v1\/N18-1101"},{"key":"885_CR34","doi-asserted-by":"publisher","first-page":"38","DOI":"10.18653\/v1\/2020.emnlp-demos.6","volume-title":"Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations","author":"T Wolf","year":"2020","unstructured":"Wolf T, Debut L, Sanh V, et al. Transformers: state-of-the-art natural language processing. In: Liu Q, Schlangen D, editors., et al., Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations. Stroudsburg: Association for Computational Linguistics; 2020. p. 38\u201345."},{"key":"885_CR35","unstructured":"Zhuang Z, Liu M, Cutkosky A, et\u00a0al. Understanding adamw through proximal methods and scale-freeness. arXiv preprint. 2022. arXiv:2202.00089."}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-024-00885-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s40537-024-00885-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-024-00885-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T16:07:09Z","timestamp":1707494829000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-024-00885-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,9]]},"references-count":35,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["885"],"URL":"https:\/\/doi.org\/10.1186\/s40537-024-00885-x","relation":{},"ISSN":["2196-1115"],"issn-type":[{"value":"2196-1115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,9]]},"assertion":[{"value":"31 August 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 January 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 February 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"28"}}