{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T18:43:16Z","timestamp":1772908996508,"version":"3.50.1"},"reference-count":71,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2022,8,24]],"date-time":"2022-08-24T00:00:00Z","timestamp":1661299200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62072204, and 62102157"],"award-info":[{"award-number":["62072204, and 62102157"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"crossref","award":["HUST:2020kfyXJJS019"],"award-info":[{"award-number":["HUST:2020kfyXJJS019"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2022,8,31]]},"abstract":"<jats:p>The fast growth of<jats:italic>pre-trained models<\/jats:italic>(PTMs) has brought natural language processing to a new era, which has become a dominant technique for various<jats:italic>natural language processing<\/jats:italic>(NLP) applications. Every user can download the weights of PTMs, then fine-tune the weights for a task on the local side. However, the pre-training of a model relies heavily on accessing a large-scale of training data and requires a vast amount of computing resources. These strict requirements make it impossible for any single client to pre-train such a model. 
To enable clients with limited computing capability to participate in pre-training a large model, we propose a new learning approach, <jats:sc>FedBERT<\/jats:sc>, which combines federated learning and split learning to pre-train BERT in a federated way. <jats:sc>FedBERT<\/jats:sc> avoids sharing raw data while achieving excellent performance. Extensive experiments on seven GLUE tasks demonstrate that <jats:sc>FedBERT<\/jats:sc> maintains its effectiveness without communicating clients' sensitive local data.<\/jats:p>","DOI":"10.1145\/3510033","type":"journal-article","created":{"date-parts":[[2022,2,4]],"date-time":"2022-02-04T22:33:18Z","timestamp":1644013998000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":114,"title":["<scp>FedBERT<\/scp>: When Federated Learning Meets Pre-training"],"prefix":"10.1145","volume":"13","author":[{"given":"Yuanyishu","family":"Tian","sequence":"first","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}]},{"given":"Yao","family":"Wan","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}]},{"given":"Lingjuan","family":"Lyu","sequence":"additional","affiliation":[{"name":"Sony AI, Tokyo, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0336-0522","authenticated-orcid":false,"given":"Dezhong","family":"Yao","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}]},{"given":"Hai","family":"Jin","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}]},{"given":"Lichao","family":"Sun","sequence":"additional","affiliation":[{"name":"Lehigh University, Bethlehem, PA, USA"}]}],"member":"320","published-online":{"date-parts":[[2022,8,24]]},"reference":[{"key":"e_1_3_2_2_2","article-title":"FedSL: Federated 
split learning on distributed sequential data in recurrent neural networks","author":"Abedi Ali","year":"2020","unstructured":"Ali Abedi and Shehroz S. Khan. 2020. FedSL: Federated split learning on distributed sequential data in recurrent neural networks. arXiv preprint arXiv:2011.03180 (2020).","journal-title":"arXiv preprint arXiv:2011.03180"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/3320269.3384740"},{"key":"e_1_3_2_4_2","article-title":"SciBERT: A pretrained language model for scientific text","author":"Beltagy Iz","year":"2019","unstructured":"Iz Beltagy, Kyle Lo, and Arman Cohan. 2019. SciBERT: A pretrained language model for scientific text. arXiv preprint arXiv:1903.10676 (2019).","journal-title":"arXiv preprint arXiv:1903.10676"},{"key":"e_1_3_2_5_2","volume-title":"Proceedings of the 2nd Text Analysis Conference","author":"Bentivogli Luisa","year":"2009","unstructured":"Luisa Bentivogli, Bernardo Magnini, Ido Dagan, Hoa Trang Dang, and Danilo Giampiccolo. 2009. The fifth PASCAL recognizing textual entailment challenge. In Proceedings of the 2nd Text Analysis Conference. NIST."},{"key":"e_1_3_2_6_2","volume-title":"Proceedings of the Conference on Machine Learning and Systems (MLSys\u201919)","author":"Bonawitz Keith","year":"2019","unstructured":"Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chlo\u00e9 Kiddon, Jakub Konecn\u00fd, Stefano Mazzocchi, Brendan McMahan, Timon Van Overveldt, David Petrou, Daniel Ramage, and Jason Roselander. 2019. Towards federated learning at scale: System design. In Proceedings of the Conference on Machine Learning and Systems (MLSys\u201919). 
mlsys.org."},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3133956.3133982"},{"key":"e_1_3_2_8_2","article-title":"SplitNN-driven vertical partitioning","author":"Ceballos Iker","year":"2020","unstructured":"Iker Ceballos, Vivek Sharma, Eduardo Mugica, Abhishek Singh, Alberto Roman, Praneeth Vepakomma, and Ramesh Raskar. 2020. SplitNN-driven vertical partitioning. arXiv preprint arXiv:2008.04137 (2020).","journal-title":"arXiv preprint arXiv:2008.04137"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S17-2001"},{"key":"e_1_3_2_10_2","volume-title":"Proceedings of International Conference on Learning Representations","author":"Clark Kevin","year":"2020","unstructured":"Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. 2020. ELECTRA: Pre-training text encoders as discriminators rather than generators. In Proceedings of International Conference on Learning Representations. OpenReview.net."},{"key":"e_1_3_2_11_2","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2019).","journal-title":"arXiv preprint arXiv:1810.04805"},{"key":"e_1_3_2_12_2","volume-title":"Proceedings of the 3rd International Workshop on Paraphrasing","author":"Dolan William B.","year":"2005","unstructured":"William B. Dolan and Chris Brockett. 2005. Automatically constructing a corpus of sentential paraphrases. In Proceedings of the 3rd International Workshop on Paraphrasing. 
Asian Federation of Natural Language Processing."},{"key":"e_1_3_2_13_2","article-title":"Unified language model pre-training for natural language understanding and generation","author":"Dong Li","year":"2019","unstructured":"Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, and Hsiao-Wuen Hon. 2019. Unified language model pre-training for natural language understanding and generation. arXiv preprint arXiv:1905.03197 (2019).","journal-title":"arXiv preprint arXiv:1905.03197"},{"key":"e_1_3_2_14_2","article-title":"End-to-end evaluation of federated learning and split learning for internet of things","author":"Gao Yansong","year":"2020","unstructured":"Yansong Gao, Minki Kim, Sharif Abuadbba, Yeonjae Kim, Chandra Thapa, Kyuyeon Kim, Seyit A. Camtepe, Hyoungshick Kim, and Surya Nepal. 2020. End-to-end evaluation of federated learning and split learning for internet of things. arXiv preprint arXiv:2003.13376 (2020).","journal-title":"arXiv preprint arXiv:2003.13376"},{"key":"e_1_3_2_15_2","article-title":"FedNER: Medical named entity recognition with federated learning","author":"Ge Suyu","year":"2020","unstructured":"Suyu Ge, Fangzhao Wu, Chuhan Wu, Tao Qi, Yongfeng Huang, and Xing Xie. 2020. FedNER: Medical named entity recognition with federated learning. arXiv preprint arXiv:2003.09288 (2020).","journal-title":"arXiv preprint arXiv:2003.09288"},{"key":"e_1_3_2_16_2","article-title":"Differentially private federated learning: A client level perspective","author":"Geyer Robin C.","year":"2017","unstructured":"Robin C. Geyer, Tassilo Klein, and Moin Nabi. 2017. Differentially private federated learning: A client level perspective. 
arXiv preprint arXiv:1712.07557 (2017).","journal-title":"arXiv preprint arXiv:1712.07557"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jnca.2018.05.003"},{"key":"e_1_3_2_18_2","article-title":"Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption","author":"Hardy Stephen","year":"2017","unstructured":"Stephen Hardy, Wilko Henecka, Hamish Ivey-Law, Richard Nock, Giorgio Patrini, Guillaume Smith, and Brian Thorne. 2017. Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. arXiv preprint arXiv:1711.10677 (2017).","journal-title":"arXiv preprint arXiv:1711.10677"},{"key":"e_1_3_2_19_2","article-title":"FedML: A research library and benchmark for federated machine learning","author":"He Chaoyang","year":"2020","unstructured":"Chaoyang He, Songze Li, Jinhyun So, Mi Zhang, Hongyi Wang, Xiaoyang Wang, Praneeth Vepakomma, Abhishek Singh, Hang Qiu, Li Shen, Peilin Zhao, Yan Kang, Yang Liu, Ramesh Raskar, Qiang Yang, Murali Annavaram, and Salman Avestimehr. 2020. FedML: A research library and benchmark for federated machine learning. arXiv preprint arXiv:2007.13518 (2020).","journal-title":"arXiv preprint arXiv:2007.13518"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58607-2_5"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3357909"},{"key":"e_1_3_2_22_2","article-title":"Advances and open problems in federated learning","author":"Kairouz Peter","year":"2019","unstructured":"Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aur\u00e9lien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Keith Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D\u2019Oliveira, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adri\u00e0 Gasc\u00f3n, Badih Ghazi, Phillip B. 
Gibbons, Marco Gruteser, Za\u00efd Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, Justin Hsu, Martin Jaggi, Tara Javidi, Gauri Joshi, Mikhail Khodak, Jakub Konecn\u00fd, Aleksandra Korolova, Farinaz Koushanfar, Sanmi Koyejo, Tancr\u00e8de Lepoint, Yang Liu, Prateek Mittal, Mehryar Mohri, Richard Nock, Ayfer \u00d6zg\u00fcr, Rasmus Pagh, Mariana Raykova, Hang Qi, Daniel Ramage, Ramesh Raskar, Dawn Song, Weikang Song, Sebastian U. Stich, Ziteng Sun, Ananda Theertha Suresh, Florian Tram\u00e8r, Praneeth Vepakomma, Jianyu Wang, Li Xiong, Zheng Xu, Qiang Yang, Felix X. Yu, Han Yu, and Sen Zhao. 2019. Advances and open problems in federated learning. arXiv preprint arXiv:1912.04977 (2019).","journal-title":"arXiv preprint arXiv:1912.04977"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-020-0186-1"},{"key":"e_1_3_2_24_2","first-page":"1188","volume-title":"Proceedings of the 31st International Conference on Machine Learning","author":"Le Quoc","year":"2014","unstructured":"Quoc Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In Proceedings of the 31st International Conference on Machine Learning. PMLR, 1188\u20131196."},{"issue":"4","key":"e_1_3_2_25_2","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","article-title":"BioBERT: A pre-trained biomedical language representation model for biomedical text mining","volume":"36","author":"Lee Jinhyuk","year":"2020","unstructured":"Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. 2020. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. 
Bioinformatics 36, 4 (2020), 1234\u20131240.","journal-title":"Bioinformatics"},{"key":"e_1_3_2_26_2","article-title":"VisualBERT: A simple and performant baseline for vision and language","author":"Li Liunian Harold","year":"2019","unstructured":"Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, and Kai-Wei Chang. 2019. VisualBERT: A simple and performant baseline for vision and language. arXiv preprint arXiv:1908.03557 (2019).","journal-title":"arXiv preprint arXiv:1908.03557"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2020.2975749"},{"key":"e_1_3_2_28_2","article-title":"On the convergence of FedAvg on non-IID data","author":"Li Xiang","year":"2019","unstructured":"Xiang Li, Kaixuan Huang, Wenhao Yang, Shusen Wang, and Zhihua Zhang. 2019. On the convergence of FedAvg on non-IID data. arXiv preprint arXiv:1907.02189 (2019).","journal-title":"arXiv preprint arXiv:1907.02189"},{"key":"e_1_3_2_29_2","first-page":"1013","volume-title":"Proceedings of IEEE Conference on Computer Vision and Pattern Recognition","author":"Liu Quande","year":"2021","unstructured":"Quande Liu, Cheng Chen, Jing Qin, Qi Dou, and Pheng-Ann Heng. 2021. FedDG: Federated domain generalization on medical image segmentation via episodic learning in continuous frequency space. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 1013\u20131023."},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i08.7021"},{"key":"e_1_3_2_31_2","article-title":"RoBERTa: A robustly optimized BERT pretraining approach","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019).","journal-title":"arXiv preprint arXiv:1907.11692"},{"key":"e_1_3_2_32_2","article-title":"VilBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks","author":"Lu Jiasen","year":"2019","unstructured":"Jiasen Lu, Dhruv Batra, Devi Parikh, and Stefan Lee. 2019. VilBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. arXiv preprint arXiv:1908.02265 (2019).","journal-title":"arXiv preprint arXiv:1908.02265"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN48605.2020.9207618"},{"key":"e_1_3_2_34_2","article-title":"Privacy and robustness in federated learning: Attacks and defenses","author":"Lyu Lingjuan","year":"2020","unstructured":"Lingjuan Lyu, Han Yu, Xingjun Ma, Lichao Sun, Jun Zhao, Qiang Yang, and Philip S. Yu. 2020. Privacy and robustness in federated learning: Attacks and defenses. arXiv preprint arXiv:2012.06337 (2020).","journal-title":"arXiv preprint arXiv:2012.06337"},{"key":"e_1_3_2_35_2","article-title":"Threats to federated learning: A survey","volume":"2003","author":"Lyu Lingjuan","year":"2020","unstructured":"Lingjuan Lyu, Han Yu, and Qiang Yang. 2020. Threats to federated learning: A survey. CoRR abs\/2003.02133 (2020).","journal-title":"CoRR"},{"key":"e_1_3_2_36_2","article-title":"Neural compression and filtering for edge-assisted real-time object detection in challenged networks","author":"Matsubara Yoshitomo","year":"2020","unstructured":"Yoshitomo Matsubara and Marco Levorato. 2020. Neural compression and filtering for edge-assisted real-time object detection in challenged networks. 
arXiv preprint arXiv:2007.15818 (2020).","journal-title":"arXiv preprint arXiv:2007.15818"},{"key":"e_1_3_2_37_2","first-page":"1273","volume-title":"Proceedings of the 20th International Conference on Artificial Intelligence and Statistics","author":"McMahan Brendan","year":"2017","unstructured":"Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Ag\u00fcera y Arcas. 2017. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. PMLR, 1273\u20131282."},{"key":"e_1_3_2_38_2","article-title":"Federated learning of deep networks using model averaging","author":"McMahan H. Brendan","year":"2016","unstructured":"H. Brendan McMahan, Eider Moore, Daniel Ramage, and Blaise Ag\u00fcera y Arcas. 2016. Federated learning of deep networks using model averaging. arXiv preprint arXiv:1602.05629 (2016).","journal-title":"arXiv preprint arXiv:1602.05629"},{"key":"e_1_3_2_39_2","volume-title":"Proceedings of the 6th International Conference on Learning Representations","author":"McMahan H. Brendan","year":"2018","unstructured":"H. Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang. 2018. Learning differentially private recurrent language models. In Proceedings of the 6th International Conference on Learning Representations. OpenReview.net."},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K16-1006"},{"key":"e_1_3_2_41_2","first-page":"3111","volume-title":"Proceedings of 27th Annual Conference on Neural Information Processing Systems","author":"Mikolov Tom\u00e1s","year":"2013","unstructured":"Tom\u00e1s Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013. Distributed representations of words and phrases and their compositionality. In Proceedings of 27th Annual Conference on Neural Information Processing Systems. 
3111\u20133119."},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482252"},{"key":"e_1_3_2_43_2","article-title":"LIME: Low-Cost incremental learning for dynamic heterogeneous information networks","author":"Peng Hao","year":"2021","unstructured":"Hao Peng, Renyu Yang, Zheng Wang, Jianxin Li, Lifang He, Philip Yu, Albert Zomaya, and Raj Ranjan. 2021. LIME: Low-Cost incremental learning for dynamic heterogeneous information networks. IEEE Trans. Comput. 71, 3 (2021), 628\u2013642.","journal-title":"IEEE Trans. Comput."},{"key":"e_1_3_2_44_2","article-title":"Reinforced neighborhood selection guided multi-relational graph neural networks","author":"Peng Hao","year":"2021","unstructured":"Hao Peng, Ruitong Zhang, Yingtong Dou, Renyu Yang, Jingyi Zhang, and Philip S. Yu. 2021. Reinforced neighborhood selection guided multi-relational graph neural networks. arXiv preprint arXiv:2104.07886 (2021).","journal-title":"arXiv preprint arXiv:2104.07886"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_2_46_2","article-title":"Improving Language Understanding with Unsupervised Learning","author":"Radford Alec","year":"2018","unstructured":"Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving Language Understanding with Unsupervised Learning. Technical report. OpenAI.","journal-title":"Technical report. OpenAI"},{"issue":"8","key":"e_1_3_2_47_2","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. 
OpenAI Blog 1, 8 (2019), 9.","journal-title":"OpenAI Blog"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1264"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-020-00323-1"},{"key":"e_1_3_2_50_2","article-title":"Wireless federated learning with local differential privacy","author":"Seif Mohamed","year":"2020","unstructured":"Mohamed Seif, Ravi Tandon, and Ming Li. 2020. Wireless federated learning with local differential privacy. arXiv preprint arXiv:2002.05151 (2020).","journal-title":"arXiv preprint arXiv:2002.05151"},{"key":"e_1_3_2_51_2","article-title":"Neural machine translation of rare words with subword units","author":"Sennrich Rico","year":"2015","unstructured":"Rico Sennrich, Barry Haddow, and Alexandra Birch. 2015. Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909 (2015).","journal-title":"arXiv preprint arXiv:1508.07909"},{"key":"e_1_3_2_52_2","article-title":"Detailed comparison of communication efficiency of split learning and federated learning","author":"Singh Abhishek","year":"2019","unstructured":"Abhishek Singh, Praneeth Vepakomma, Otkrist Gupta, and Ramesh Raskar. 2019. Detailed comparison of communication efficiency of split learning and federated learning. arXiv preprint arXiv:1909.09145 (2019).","journal-title":"arXiv preprint arXiv:1909.09145"},{"key":"e_1_3_2_53_2","first-page":"1631","volume-title":"Proceedings of the Conference on Empirical Methods in Natural Language Processing","author":"Socher Richard","year":"2013","unstructured":"Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 
ACL, 1631\u20131642."},{"key":"e_1_3_2_54_2","article-title":"MASS: Masked sequence to sequence pre-training for language generation","author":"Song Kaitao","year":"2019","unstructured":"Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, and Tie-Yan Liu. 2019. MASS: Masked sequence to sequence pre-training for language generation. arXiv preprint arXiv:1905.02450 (2019).","journal-title":"arXiv preprint arXiv:1905.02450"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00756"},{"key":"e_1_3_2_56_2","article-title":"Federated model distillation with noise-free differential privacy","author":"Sun Lichao","year":"2020","unstructured":"Lichao Sun and Lingjuan Lyu. 2020. Federated model distillation with noise-free differential privacy. arXiv preprint arXiv:2009.05537 (2020).","journal-title":"arXiv preprint arXiv:2009.05537"},{"key":"e_1_3_2_57_2","article-title":"LDP-FL: Practical private aggregation in federated learning with local differential privacy","author":"Sun Lichao","year":"2020","unstructured":"Lichao Sun, Jianwei Qian, Xun Chen, and Philip S. Yu. 2020. LDP-FL: Practical private aggregation in federated learning with local differential privacy. arXiv preprint arXiv:2007.15789 (2020).","journal-title":"arXiv preprint arXiv:2007.15789"},{"key":"e_1_3_2_58_2","article-title":"ERNIE: Enhanced representation through knowledge integration","author":"Sun Yu","year":"2019","unstructured":"Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, and Hua Wu. 2019. ERNIE: Enhanced representation through knowledge integration. arXiv preprint arXiv:1904.09223 (2019).","journal-title":"arXiv preprint arXiv:1904.09223"},{"key":"e_1_3_2_59_2","article-title":"SplitFed: When federated learning meets split learning","author":"Thapa Chandra","year":"2020","unstructured":"Chandra Thapa, Mahawaga Arachchige Pathum Chamikara, Seyit Camtepe, and Lichao Sun. 2020. 
SplitFed: When federated learning meets split learning. arXiv preprint arXiv:2004.12088 (2020).","journal-title":"arXiv preprint arXiv:2004.12088"},{"key":"e_1_3_2_60_2","first-page":"5998","volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the Conference on Advances in Neural Information Processing Systems. 5998\u20136008."},{"key":"e_1_3_2_61_2","article-title":"Split learning for health: Distributed deep learning without sharing raw patient data","author":"Vepakomma Praneeth","year":"2018","unstructured":"Praneeth Vepakomma, Otkrist Gupta, Tristan Swedish, and Ramesh Raskar. 2018. Split learning for health: Distributed deep learning without sharing raw patient data. arXiv preprint arXiv:1812.00564 (2018).","journal-title":"arXiv preprint arXiv:1812.00564"},{"key":"e_1_3_2_62_2","volume-title":"Proceedings of the 7th International Conference on Learning Representations","author":"Wang Alex","year":"2019","unstructured":"Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. 2019. GLUE: A multi-task benchmark and analysis platform for natural language understanding. In Proceedings of the 7th International Conference on Learning Representations. OpenReview.net."},{"key":"e_1_3_2_63_2","volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems","author":"Wang Hongyi","year":"2020","unstructured":"Hongyi Wang, Kartik Sreenivasan, Shashank Rajput, Harit Vishwakarma, Saurabh Agarwal, Jy Yong Sohn, Kangwook Lee, and Dimitris S. Papailiopoulos. 2020. Attack of the tails: Yes, you really can backdoor federated learning. 
In Proceedings of the Conference on Advances in Neural Information Processing Systems."},{"key":"e_1_3_2_64_2","volume-title":"Proceedings of the 8th International Conference on Learning Representations","author":"Wang Hongyi","year":"2020","unstructured":"Hongyi Wang, Mikhail Yurochkin, Yuekai Sun, Dimitris S. Papailiopoulos, and Yasaman Khazaeni. 2020. Federated learning with matched averaging. In Proceedings of the 8th International Conference on Learning Representations. OpenReview.net."},{"key":"e_1_3_2_65_2","article-title":"A field guide to federated optimization","author":"Wang Jianyu","year":"2021","unstructured":"Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Ag\u00fcera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas N. Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz, Satyen Kale, Sai Praneeth Karimireddy, Jakub Kone\u010dn\u00fd, Sanmi Koyejo, Tian Li, Luyang Liu, Mehryar Mohri, Hang Qi, Sashank J. Reddi, Peter Richt\u00e1rik, Karan Singhal, Virginia Smith, Mahdi Soltanolkotabi, Weikang Song, Ananda Theertha Suresh, Sebastian U. Stich, Ameet Talwalkar, Hongyi Wang, Blake E. Woodworth, Shanshan Wu, Felix X. Yu, Honglin Yuan, Manzil Zaheer, Mi Zhang, Tong Zhang, Chunxiang Zheng, Chen Zhu, and Wennan Zhu. 2021. A field guide to federated optimization. arXiv preprint arXiv:2107.06917 (2021).","journal-title":"arXiv preprint arXiv:2107.06917"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1101"},{"key":"e_1_3_2_67_2","article-title":"Privacy-preserving federated depression detection from multi-source mobile health data","author":"Xu Xiaohang","year":"2021","unstructured":"Xiaohang Xu, Hao Peng, Md Zakirul Alam Bhuiyan, Zhifeng Hao, Lianzhong Liu, Lichao Sun, and Lifang He. 2021. 
Privacy-preserving federated depression detection from multi-source mobile health data. IEEE Trans. Industr. Inform. 18, 7 (2021), 4788\u20134797.","journal-title":"IEEE Trans. Industr. Inform."},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2021.101992"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1145\/3298981"},{"key":"e_1_3_2_70_2","article-title":"Local-global knowledge distillation in heterogeneous federated learning with non-IID data","author":"Yao Dezhong","year":"2021","unstructured":"Dezhong Yao, Wanning Pan, Yutong Dai, Yao Wan, Xiaofeng Ding, Hai Jin, Zheng Xu, and Lichao Sun. 2021. Local-global knowledge distillation in heterogeneous federated learning with non-IID data. arXiv preprint arXiv:2107.00051 (2021).","journal-title":"arXiv preprint arXiv:2107.00051"},{"key":"e_1_3_2_71_2","article-title":"FedHM: Efficient federated learning for heterogeneous models via low-rank factorization","author":"Yao Dezhong","year":"2021","unstructured":"Dezhong Yao, Wanning Pan, Yao Wan, Hai Jin, and Lichao Sun. 2021. FedHM: Efficient federated learning for heterogeneous models via low-rank factorization. arXiv preprint arXiv:2111.14655 (2021).","journal-title":"arXiv preprint arXiv:2111.14655"},{"key":"e_1_3_2_72_2","volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems","author":"Zhang Ke","year":"2021","unstructured":"Ke Zhang, Carl Yang, Xiaoxiao Li, Lichao Sun, and Siu Ming Yiu. 2021. Subgraph federated learning with missing neighbor generation. 
In Proceedings of the Conference on Advances in Neural Information Processing Systems."}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3510033","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3510033","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:12:24Z","timestamp":1750191144000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3510033"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,24]]},"references-count":71,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,8,31]]}},"alternative-id":["10.1145\/3510033"],"URL":"https:\/\/doi.org\/10.1145\/3510033","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,24]]},"assertion":[{"value":"2021-05-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-12-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-08-24","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}