{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T13:20:47Z","timestamp":1758892847392,"version":"3.41.0"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2021,2,17]],"date-time":"2021-02-17T00:00:00Z","timestamp":1613520000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"State Key Laboratory of Software Development Environment (Beihang University) Open Program","award":["SKLSDE-2020ZX-07"],"award-info":[{"award-number":["SKLSDE-2020ZX-07"]}]},{"DOI":"10.13039\/501100001809","name":"National Science Foundation of China","doi-asserted-by":"crossref","award":["61822201 and U1811463"],"award-info":[{"award-number":["61822201 and U1811463"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Key Research and Development Program of China","award":["2018AAA0101100"],"award-info":[{"award-number":["2018AAA0101100"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2021,2,28]]},"abstract":"<jats:p>\n            Probabilistic topic modeling has been applied in a variety of industrial applications. Training a high-quality model usually requires a massive amount of data to provide comprehensive co-occurrence information for the model to learn. However, industrial data such as medical or financial records are often proprietary or sensitive, which precludes uploading to data centers. Hence, training topic models in industrial scenarios using conventional approaches faces a dilemma: A party (i.e., a company or institute) has to either tolerate data scarcity or sacrifice data privacy. 
In this article, we propose a framework named\n            <jats:italic>Industrial Federated Topic Modeling<\/jats:italic>\n            (iFTM), in which multiple parties collaboratively train a high-quality topic model by simultaneously alleviating data scarcity and maintaining immunity to privacy adversaries. iFTM is inspired by federated learning, supports two representative topic models (i.e., Latent Dirichlet Allocation and SentenceLDA) in industrial applications, and consists of novel techniques such as private Metropolis-Hastings, topic-wise normalization, and heterogeneous model integration. We conduct quantitative evaluations to verify the effectiveness of iFTM and deploy iFTM in two real-life applications to demonstrate its utility. Experimental results verify iFTM\u2019s superiority over conventional topic modeling.\n          <\/jats:p>","DOI":"10.1145\/3418283","type":"journal-article","created":{"date-parts":[[2021,2,20]],"date-time":"2021-02-20T22:34:11Z","timestamp":1613860451000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Industrial Federated Topic Modeling"],"prefix":"10.1145","volume":"12","author":[{"given":"Di","family":"Jiang","sequence":"first","affiliation":[{"name":"AI Group, WeBank Co., Ltd., China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yongxin","family":"Tong","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Software Development Environment, Beihang University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuanfeng","family":"Song","sequence":"additional","affiliation":[{"name":"AI Group, WeBank Co., Ltd., China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xueyang","family":"Wu","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong 
Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weiwei","family":"Zhao","sequence":"additional","affiliation":[{"name":"AI Group, WeBank Co., Ltd., China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jinhua","family":"Peng","sequence":"additional","affiliation":[{"name":"AI Group, WeBank Co., Ltd., China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rongzhong","family":"Lian","sequence":"additional","affiliation":[{"name":"AI Group, WeBank Co., Ltd., China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qian","family":"Xu","sequence":"additional","affiliation":[{"name":"AI Group, WeBank Co., Ltd., China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qiang","family":"Yang","sequence":"additional","affiliation":[{"name":"AI Group, WeBank Co., Ltd., China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,2,17]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2348283.2348454"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2914714"},{"key":"e_1_2_1_3_1","volume-title":"Shrinkwrap: Differentially-private query processing in private data federations. arXiv preprint arXiv:1810.01816","author":"Bater Johes","year":"2018","unstructured":"Johes Bater, Xi He, William Ehrich, Ashwin Machanavajjhala, and Jennie Rogers. 2018. Shrinkwrap: Differentially-private query processing in private data federations. 
arXiv preprint arXiv:1810.01816 (2018)."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/944919.944937"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.5555\/3265270"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1871437.1871745"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2010.5494942"},{"key":"e_1_2_1_8_1","volume-title":"SecureBoost: A lossless federated learning framework. CoRR abs\/1901.08755","author":"Cheng Kewei","year":"2019","unstructured":"Kewei Cheng, Tao Fan, Yilun Jin, Yang Liu, Tianjian Chen, and Qiang Yang. 2019. SecureBoost: A lossless federated learning framework. CoRR abs\/1901.08755 (2019)."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/1791834.1791836"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1561\/0400000042"},{"key":"e_1_2_1_11_1","volume-title":"On the theory and practice of privacy-preserving Bayesian data analysis. arXiv preprint arXiv:1603.07294","author":"Foulds James","year":"2016","unstructured":"James Foulds, Joseph Geumlek, Max Welling, and Kamalika Chaudhuri. 2016. On the theory and practice of privacy-preserving Bayesian data analysis. arXiv preprint arXiv:1603.07294 (2016)."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/116873.116878"},{"key":"e_1_2_1_13_1","volume-title":"Differentially private federated learning: A client level perspective. arXiv preprint arXiv:1712.07557","author":"Geyer Robin C.","year":"2017","unstructured":"Robin C. Geyer, Tassilo Klein, and Moin Nabi. 2017. 
Differentially private federated learning: A client level perspective. arXiv preprint arXiv:1712.07557 (2017)."},{"volume-title":"Markov Chain Monte Carlo in Practice","author":"Gilks Walter R.","key":"e_1_2_1_14_1","unstructured":"Walter R. Gilks, Sylvia Richardson, and David Spiegelhalter. 1995. Markov Chain Monte Carlo in Practice. Chapman and Hall\/CRC."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0307752101"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/3045390.3045450"},{"key":"e_1_2_1_17_1","volume-title":"Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604","author":"Hard Andrew","year":"2018","unstructured":"Andrew Hard, Kanishka Rao, Rajiv Mathews, Fran\u00e7oise Beaufays, Sean Augenstein, Hubert Eichner, Chlo\u00e9 Kiddon, and Daniel Ramage. 2018. Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604 (2018)."},{"key":"e_1_2_1_18_1","volume-title":"Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. arXiv preprint arXiv:1711.10677","author":"Hardy Stephen","year":"2017","unstructured":"Stephen Hardy, Wilko Henecka, Hamish Ivey-Law, Richard Nock, Giorgio Patrini, Guillaume Smith, and Brian Thorne. 2017. Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. 
arXiv preprint arXiv:1711.10677 (2017)."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505642"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-37487-6_18"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3357909"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1935826.1935932"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/NAFIPS-WConSC.2015.7284190"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-6393(01)00041-3"},{"key":"e_1_2_1_25_1","volume-title":"Federated learning for keyword spotting. arXiv preprint arXiv:1810.05512","author":"Leroy David","year":"2018","unstructured":"David Leroy, Alice Coucke, Thibaut Lavril, Thibault Gisselbrecht, and Joseph Dureau. 2018. Federated learning for keyword spotting. arXiv preprint arXiv:1810.05512 (2018)."},{"key":"e_1_2_1_26_1","volume-title":"Meta-SGD: Learning to learn quickly for few shot learning. arXiv preprint arXiv:1707.09835","author":"Li Zhenguo","year":"2017","unstructured":"Zhenguo Li, Fengwei Zhou, Fei Chen, and Hang Li. 2017. Meta-SGD: Learning to learn quickly for few shot learning. arXiv preprint arXiv:1707.09835 (2017)."},{"key":"e_1_2_1_27_1","volume-title":"Secure federated transfer learning. 
CoRR abs\/1812.03337","author":"Liu Yang","year":"2018","unstructured":"Yang Liu, Tianjian Chen, and Qiang Yang. 2018. Secure federated transfer learning. CoRR abs\/1812.03337 (2018)."},{"volume-title":"Proceedings of the Conference on Advances in Neural Information Processing Systems. 121--128","author":"Jon","key":"e_1_2_1_28_1","unstructured":"Jon D. Mcauliffe and David M. Blei. 2008. Supervised topic models. In Proceedings of the Conference on Advances in Neural Information Processing Systems. 121--128."},{"key":"e_1_2_1_29_1","volume-title":"et\u00a0al","author":"McMahan H. Brendan","year":"2016","unstructured":"H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, et\u00a0al. 2016. Communication-efficient learning of deep networks from decentralized data. arXiv preprint arXiv:1602.05629 (2016)."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/1577069.1755845"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-10439-8_28"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2009.191"},{"key":"e_1_2_1_33_1","volume-title":"Semi-supervised knowledge transfer for deep learning from private training data. arXiv preprint arXiv:1610.05755","author":"Papernot Nicolas","year":"2016","unstructured":"
Nicolas Papernot, Mart\u00edn Abadi, Ulfar Erlingsson, Ian Goodfellow, and Kunal Talwar. 2016. Semi-supervised knowledge transfer for deep learning from private training data. arXiv preprint arXiv:1610.05755 (2016)."},{"key":"e_1_2_1_34_1","volume-title":"Private topic modeling. arXiv preprint arXiv:1609.04120","author":"Park Mijung","year":"2016","unstructured":"Mijung Park, James Foulds, Kamalika Chaudhuri, and Max Welling. 2016. Private topic modeling. arXiv preprint arXiv:1609.04120 (2016)."},{"key":"e_1_2_1_35_1","first-page":"169","article-title":"On data banks and privacy homomorphisms","volume":"4","author":"Rivest Ronald L.","year":"1978","unstructured":"Ronald L. Rivest, Len Adleman, Michael L. Dertouzos, et\u00a0al. 1978. On data banks and privacy homomorphisms. Found. Sec. Comput. 4, 11 (1978), 169--180.","journal-title":"Found. Sec. Comput."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1214\/12-AOAS618"},{"key":"e_1_2_1_37_1","first-page":"513","article-title":"The EU general data protection regulation: Toward a property regime for protecting data privacy","volume":"123","author":"Victor Jacob M.","year":"2013","unstructured":"Jacob M. Victor. 2013. The EU general data protection regulation: Toward a property regime for protecting data privacy. 
Yale LJ 123 (2013), 513.","journal-title":"Yale LJ"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.5555\/3152676"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2651403"},{"key":"e_1_2_1_40_1","first-page":"221","article-title":"European Union data privacy law reform: General data protection regulation, privacy shield, and the right to delisting","volume":"72","author":"Voss W. Gregory","year":"2016","unstructured":"W. Gregory Voss. 2016. European Union data privacy law reform: General data protection regulation, privacy shield, and the right to delisting. Bus. Law. 72, 1 (2016), 221--233.","journal-title":"Bus. Law."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150450"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 811--826","author":"Wang Yang","year":"2018","unstructured":"Yang Wang, Quanquan Gu, and Donald Brown. 2018. Differentially private hypothesis transfer learning. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. 
Springer, 811--826."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.5555\/3045118.3045383"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/SLT.2014.7078615"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3298981"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939821"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.5555\/1382436.1382751"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741115"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2187836.2187955"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3418283","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3418283","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:27Z","timestamp":1750197747000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3418283"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,17]]},"references-count":49,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,2,28]]}},"alternative-id":["10.1145\/3418283"],"URL":"https:\/\/doi.org\/10.1145\/3418283","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"type":"print","value":"2157-6904"},{"type":"electronic","value":"2157-6912"}],"subject":[],"published":{"date-parts":[[2021,2,17]]},"assertion":[{"value":"2019-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2021-02-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}