{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T18:08:11Z","timestamp":1776881291647,"version":"3.51.2"},"publisher-location":"New York, NY, USA","reference-count":90,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,11,9]],"date-time":"2020-11-09T00:00:00Z","timestamp":1604880000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"EPSRC DADA: Defence Against Dark Artefacts","award":["EP\/R03351X\/1"],"award-info":[{"award-number":["EP\/R03351X\/1"]}]},{"name":"The Saudi Arabian Cultural Bureau in the UK"},{"name":"EPSRC Databox: Privacy-aware Infrastructure for Managing Personal Data","award":["EP\/N028260\/1"],"award-info":[{"award-number":["EP\/N028260\/1"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,11,9]]},"DOI":"10.1145\/3411495.3421355","type":"proceedings-article","created":{"date-parts":[[2020,11,5]],"date-time":"2020-11-05T23:35:56Z","timestamp":1604619356000},"page":"1-14","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":31,"title":["Privacy-preserving Voice Analysis via Disentangled Representations"],"prefix":"10.1145","author":[{"given":"Ranya","family":"Aloufi","sequence":"first","affiliation":[{"name":"Imperial College London, London, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hamed","family":"Haddadi","sequence":"additional","affiliation":[{"name":"Imperial College London, London, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Boyle","sequence":"additional","affiliation":[{"name":"Imperial College London, London, United 
Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,11,9]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"[n. d.]. PyTorch Core. https:\/\/pytorch.org  [n. d.]. PyTorch Core. https:\/\/pytorch.org"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3362743.3362960"},{"key":"e_1_3_2_1_3_1","volume-title":"Proceedings of the 33rd International Conference on International Conference on Machine Learning -","volume":"48","author":"Amodei Dario","year":"2016","unstructured":"Dario Amodei , Sundaram Ananthanarayanan , Rishita Anubhai , Jingliang Bai , Eric Battenberg , Carl Case , Jared Casper , Bryan Catanzaro , Qiang Cheng , Guoliang Chen , Jie Chen , Jingdong Chen , Zhijie Chen , Mike Chrzanowski , Adam Coates , Greg Diamos , Ke Ding , Niandong Du , Erich Elsen , Jesse Engel , Weiwei Fang , Linxi Fan , Christopher Fougner , Liang Gao , Caixia Gong , Awni Hannun , Tony Han , Lappi Vaino Johannes , Bing Jiang , Cai Ju , Billy Jun , Patrick LeGresley , Libby Lin , Junjie Liu , Yang Liu , Weigao Li , Xiangang Li , Dongpeng Ma , Sharan Narang , Andrew Ng , Sherjil Ozair , Yiping Peng , Ryan Prenger , Sheng Qian , Zongfeng Quan , Jonathan Raiman , Vinay Rao , Sanjeev Satheesh , David Seetapun , Shubho Sengupta , Kavya Srinet , Anuroop Sriram , Haiyuan Tang , Liliang Tang , Chong Wang , Jidong Wang, Kaifu Wang, Yi Wang, Zhijian Wang, Zhiqian Wang, Shuang Wu , Likai Wei , Bo Xiao , Wen Xie , Yan Xie , Dani Yogatama , Bin Yuan , Jun Zhan , and Zhenyao Zhu . 2016 . Deep Speech 2: End-to-End Speech Recognition in English and Mandarin . In Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48 . JMLR.org, 173--182. 
Dario Amodei, Sundaram Ananthanarayanan, Rishita Anubhai, Jingliang Bai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Qiang Cheng, Guoliang Chen, Jie Chen, Jingdong Chen, Zhijie Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Ke Ding, Niandong Du, Erich Elsen, Jesse Engel, Weiwei Fang, Linxi Fan, Christopher Fougner, Liang Gao, Caixia Gong, Awni Hannun, Tony Han, Lappi Vaino Johannes, Bing Jiang, Cai Ju, Billy Jun, Patrick LeGresley, Libby Lin, Junjie Liu, Yang Liu, Weigao Li, Xiangang Li, Dongpeng Ma, Sharan Narang, Andrew Ng, Sherjil Ozair, Yiping Peng, Ryan Prenger, Sheng Qian, Zongfeng Quan, Jonathan Raiman, Vinay Rao, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Kavya Srinet, Anuroop Sriram, Haiyuan Tang, Liliang Tang, Chong Wang, Jidong Wang, Kaifu Wang, Yi Wang, Zhijian Wang, Zhiqian Wang, Shuang Wu, Likai Wei, Bo Xiao, Wen Xie, Yan Xie, Dani Yogatama, Bin Yuan, Jun Zhan, and Zhenyao Zhu. 2016. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin. In Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48. JMLR.org, 173--182."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1504\/IJSN.2015.071829"},{"key":"e_1_3_2_1_5_1","volume-title":"International Conference on Learning Representations.","author":"Baevski Alexei","year":"2020","unstructured":"Alexei Baevski , Steffen Schneider , and Michael Auli . 2020 . vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations . In International Conference on Learning Representations. Alexei Baevski, Steffen Schneider, and Michael Auli. 2020. vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations. In International Conference on Learning Representations."},{"key":"e_1_3_2_1_6_1","volume-title":"Estimating or propagating gradients through stochastic neurons for conditional computation. 
arXiv preprint arXiv:1308.3432","author":"Bengio Yoshua","year":"2013","unstructured":"Yoshua Bengio , Nicholas L\u00e9onard , and Aaron Courville . 2013. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 ( 2013 ). Yoshua Bengio, Nicholas L\u00e9onard, and Aaron Courville. 2013. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013)."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.2909562"},{"key":"e_1_3_2_1_8_1","volume-title":"IEMOCAP: Interactive emotional dyadic motion capture database. Language resources and evaluation 42, 4","author":"Busso Carlos","year":"2008","unstructured":"Carlos Busso , Murtaza Bulut , Chi-Chun Lee , Abe Kazemzadeh , Emily Mower , Samuel Kim , Jeannette N Chang , Sungbok Lee , and Shrikanth S Narayanan . 2008 . IEMOCAP: Interactive emotional dyadic motion capture database. Language resources and evaluation 42, 4 (2008), 335. Carlos Busso, Murtaza Bulut, Chi-Chun Lee, Abe Kazemzadeh, Emily Mower, Samuel Kim, Jeannette N Chang, Sungbok Lee, and Shrikanth S Narayanan. 2008. IEMOCAP: Interactive emotional dyadic motion capture database. Language resources and evaluation 42, 4 (2008), 335."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.21437\/Odyssey.2018-11"},{"key":"e_1_3_2_1_10_1","volume-title":"Proceedings of the 28th USENIX Conference on Security Symposium. USENIX Association, USA.","author":"Carlini Nicholas","year":"2019","unstructured":"Nicholas Carlini , Chang Liu , \u00dalfar Erlingsson , Jernej Kos , and Dawn Song . 2019 . The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks . In Proceedings of the 28th USENIX Conference on Security Symposium. USENIX Association, USA. Nicholas Carlini, Chang Liu, \u00dalfar Erlingsson, Jernej Kos, and Dawn Song. 2019. 
The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks. In Proceedings of the 28th USENIX Conference on Security Symposium. USENIX Association, USA."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472621"},{"key":"e_1_3_2_1_12_1","volume-title":"Infogan: Interpretable representation learning by information maximizing generative adversarial nets.","author":"Chen Xi","year":"2016","unstructured":"Xi Chen , Yan Duan , Rein Houthooft , John Schulman , Ilya Sutskever , and Pieter Abbeel . 2016 . Infogan: Interpretable representation learning by information maximizing generative adversarial nets. Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. 2016. Infogan: Interpretable representation learning by information maximizing generative adversarial nets."},{"key":"#cr-split#-e_1_3_2_1_13_1.1","doi-asserted-by":"crossref","unstructured":"Chung-Cheng Chiu Tara Sainath Yonghui Wu Rohit Prabhavalkar Patrick Nguyen Zhifeng Chen Anjuli Kannan Ron Weiss Kanishka Rao Ekaterina Gonina Navdeep Jaitly Bo Li Jan Chorowski and Michiel Bacchiani. 2018. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models. 4774--4778. https:\/\/doi.org\/10.1109\/ICASSP.2018.8462105 10.1109\/ICASSP.2018.8462105","DOI":"10.1109\/ICASSP.2018.8462105"},{"key":"#cr-split#-e_1_3_2_1_13_1.2","doi-asserted-by":"crossref","unstructured":"Chung-Cheng Chiu Tara Sainath Yonghui Wu Rohit Prabhavalkar Patrick Nguyen Zhifeng Chen Anjuli Kannan Ron Weiss Kanishka Rao Ekaterina Gonina Navdeep Jaitly Bo Li Jan Chorowski and Michiel Bacchiani. 2018. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models. 4774--4778. 
https:\/\/doi.org\/10.1109\/ICASSP.2018.8462105","DOI":"10.1109\/ICASSP.2018.8462105"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00916"},{"key":"e_1_3_2_1_15_1","volume-title":"Soyeon Choe, Chiheon Ham, Sunghwan Jung, Bong-Jin Lee, and Icksang Han.","author":"Chung Joon Son","year":"2020","unstructured":"Joon Son Chung , Jaesung Huh , Seongkyu Mun , Minjae Lee , Hee Soo Heo , Soyeon Choe, Chiheon Ham, Sunghwan Jung, Bong-Jin Lee, and Icksang Han. 2020 . In defence of metric learning for speaker recognition. arXiv preprint arXiv:2003.11982 (2020). Joon Son Chung, Jaesung Huh, Seongkyu Mun, Minjae Lee, Hee Soo Heo, Soyeon Choe, Chiheon Ham, Sunghwan Jung, Bong-Jin Lee, and Icksang Han. 2020. In defence of metric learning for speaker recognition. arXiv preprint arXiv:2003.11982 (2020)."},{"key":"e_1_3_2_1_16_1","volume-title":"Flexibly fair representation learning by disentanglement. arXiv preprint arXiv:1906.02589","author":"Creager Elliot","year":"2019","unstructured":"Elliot Creager , David Madras , J\u00f6rn-Henrik Jacobsen , Marissa A Weis , Kevin Swersky , Toniann Pitassi , and Richard Zemel . 2019. Flexibly fair representation learning by disentanglement. arXiv preprint arXiv:1906.02589 ( 2019 ). Elliot Creager, David Madras, J\u00f6rn-Henrik Jacobsen, Marissa A Weis, Kevin Swersky, Toniann Pitassi, and Richard Zemel. 2019. Flexibly fair representation learning by disentanglement. arXiv preprint arXiv:1906.02589 (2019)."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.2478\/popets-2020-0072"},{"key":"e_1_3_2_1_18_1","volume-title":"Censoring representations with an adversary. arXiv preprint arXiv:1511.05897","author":"Edwards Harrison","year":"2015","unstructured":"Harrison Edwards and Amos Storkey . 2015. Censoring representations with an adversary. arXiv preprint arXiv:1511.05897 ( 2015 ). Harrison Edwards and Amos Storkey. 2015. Censoring representations with an adversary. 
arXiv preprint arXiv:1511.05897 (2015)."},{"key":"e_1_3_2_1_19_1","volume-title":"Latent constraints: Learning to generate conditionally from unconditional generative models. arXiv preprint arXiv:1711.05772","author":"Engel Jesse","year":"2017","unstructured":"Jesse Engel , Matthew Hoffman , and Adam Roberts . 2017. Latent constraints: Learning to generate conditionally from unconditional generative models. arXiv preprint arXiv:1711.05772 ( 2017 ). Jesse Engel, Matthew Hoffman, and Adam Roberts. 2017. Latent constraints: Learning to generate conditionally from unconditional generative models. arXiv preprint arXiv:1711.05772 (2017)."},{"key":"e_1_3_2_1_20_1","unstructured":"Chanho Eom and Bumsub Ham. 2019. Learning Disentangled Representation for Robust Person Re-identification. In Advances in Neural Information Processing Systems. 5298--5309.  Chanho Eom and Bumsub Ham. 2019. Learning Disentangled Representation for Robust Person Re-identification. In Advances in Neural Information Processing Systems. 5298--5309."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243734.3243834"},{"key":"e_1_3_2_1_22_1","volume-title":"Towards learning fine-grained disentangled representations from speech. arXiv preprint arXiv:1808.02939","author":"Gong Yuan","year":"2018","unstructured":"Yuan Gong and Christian Poellabauer . 2018. Towards learning fine-grained disentangled representations from speech. arXiv preprint arXiv:1808.02939 ( 2018 ). Yuan Gong and Christian Poellabauer. 2018. Towards learning fine-grained disentangled representations from speech. arXiv preprint arXiv:1808.02939 (2018)."},{"key":"e_1_3_2_1_23_1","unstructured":"Ian Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in neural information processing systems. 2672--2680.  
Ian Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in neural information processing systems. 2672--2680."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1984.1164317"},{"key":"e_1_3_2_1_25_1","first-page":"565","article-title":"Neural face editing with intrinsic image disentangling","volume":"10","author":"Hadap Sunil","year":"2020","unstructured":"Sunil Hadap , Elya Shechtman , Zhixin Shu , Kalyan Sunkavalli , and Mehmet Yumer . 2020 . Neural face editing with intrinsic image disentangling . US Patent 10,565,758. Sunil Hadap, Elya Shechtman, Zhixin Shu, Kalyan Sunkavalli, and Mehmet Yumer. 2020. Neural face editing with intrinsic image disentangling. US Patent 10,565,758.","journal-title":"US Patent"},{"key":"e_1_3_2_1_26_1","unstructured":"Awni Hannun Carl Case Jared Casper Bryan Catanzaro Greg Diamos Erich Elsen Ryan Prenger Sanjeev Satheesh Shubho Sengupta Adam Coates and Andrew Ng. 2014. DeepSpeech: Scaling up end-to-end speech recognition. (2014).  Awni Hannun Carl Case Jared Casper Bryan Catanzaro Greg Diamos Erich Elsen Ryan Prenger Sanjeev Satheesh Shubho Sengupta Adam Coates and Andrew Ng. 2014. DeepSpeech: Scaling up end-to-end speech recognition. (2014)."},{"key":"e_1_3_2_1_27_1","volume-title":"Proc. Int. Conf. on Auditory-Visual Speech Processing (AVSP'08)","author":"Haq Sanaul","year":"2008","unstructured":"Sanaul Haq , Philip JB Jackson , and James Edge . 2008 . Audio-visual feature selection and reduction for emotion classification . In Proc. Int. Conf. on Auditory-Visual Speech Processing (AVSP'08) , Tangalooma, Australia. Sanaul Haq, Philip JB Jackson, and James Edge. 2008. Audio-visual feature selection and reduction for emotion classification. In Proc. Int. Conf. 
on Auditory-Visual Speech Processing (AVSP'08), Tangalooma, Australia."},{"key":"e_1_3_2_1_28_1","first-page":"6","article-title":"beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework","volume":"2","author":"Higgins Irina","year":"2017","unstructured":"Irina Higgins , Loic Matthey , Arka Pal , Christopher Burgess , Xavier Glorot , Matthew Botvinick , Shakir Mohamed , and Alexander Lerchner . 2017 . beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework . Iclr 2 , 5 (2017), 6 . Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. 2017. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Iclr 2, 5 (2017), 6.","journal-title":"Iclr"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1587\/transinf.2017EDP7165"},{"key":"e_1_3_2_1_30_1","unstructured":"Wei-Ning Hsu Yu Zhang and James Glass. 2017. Unsupervised learning of disentangled and interpretable representations from sequential data. In Advances in neural information processing systems. 1878--1889.  Wei-Ning Hsu Yu Zhang and James Glass. 2017. Unsupervised learning of disentangled and interpretable representations from sequential data. In Advances in neural information processing systems. 1878--1889."},{"key":"e_1_3_2_1_31_1","volume-title":"ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 3267--3271","author":"Hu T.","unstructured":"T. Hu , A. Shrivastava , O. Tuzel , and C. Dhir . 2020. Unsupervised Style and Content Separation by Minimizing Mutual Information for Speech Synthesis . In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 3267--3271 . T. Hu, A. Shrivastava, O. Tuzel, and C. Dhir. 2020. Unsupervised Style and Content Separation by Minimizing Mutual Information for Speech Synthesis. 
In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 3267--3271."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TETCI.2020.2977678"},{"key":"e_1_3_2_1_33_1","volume-title":"Privacy enhanced multimodal neural representations for emotion recognition. arXiv preprint arXiv:1910.13212","author":"Jaiswal Mimansa","year":"2019","unstructured":"Mimansa Jaiswal and Emily Mower Provost . 2019. Privacy enhanced multimodal neural representations for emotion recognition. arXiv preprint arXiv:1910.13212 ( 2019 ). Mimansa Jaiswal and Emily Mower Provost. 2019. Privacy enhanced multimodal neural representations for emotion recognition. arXiv preprint arXiv:1910.13212 (2019)."},{"key":"e_1_3_2_1_34_1","volume-title":"27th USENIX Security Symposium (USENIX Security 18)","author":"Jia Jinyuan","year":"2018","unstructured":"Jinyuan Jia and Neil Zhenqiang Gong . 2018 . AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning . In 27th USENIX Security Symposium (USENIX Security 18) . Jinyuan Jia and Neil Zhenqiang Gong. 2018. AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning. In 27th USENIX Security Symposium (USENIX Security 18)."},{"key":"e_1_3_2_1_35_1","unstructured":"Huafeng Jin and Shuo Wang. 2018. Voice-based determination of physical and emotional characteristics of users.  Huafeng Jin and Shuo Wang. 2018. Voice-based determination of physical and emotional characteristics of users."},{"key":"e_1_3_2_1_36_1","volume-title":"Proceedings of the 35th International Conference on Machine Learning (Proceedings of Machine Learning Research), Jennifer Dy and Andreas Krause (Eds.)","volume":"80","author":"Kalchbrenner Nal","unstructured":"Nal Kalchbrenner , Erich Elsen , Karen Simonyan , Seb Noury , Norman Casagrande , Edward Lockhart , Florian Stimberg , Aaron van den Oord, Sander Dieleman, and Koray Kavukcuoglu. 
2018. Efficient Neural Audio Synthesis . In Proceedings of the 35th International Conference on Machine Learning (Proceedings of Machine Learning Research), Jennifer Dy and Andreas Krause (Eds.) , Vol. 80 . PMLR, 2410--2419. Nal Kalchbrenner, Erich Elsen, Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg, Aaron van den Oord, Sander Dieleman, and Koray Kavukcuoglu. 2018. Efficient Neural Audio Synthesis. In Proceedings of the 35th International Conference on Machine Learning (Proceedings of Machine Learning Research), Jennifer Dy and Andreas Krause (Eds.), Vol. 80. PMLR, 2410--2419."},{"key":"e_1_3_2_1_37_1","volume-title":"exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds. Acoustical science and technology 27, 6","author":"Kawahara Hideki","year":"2006","unstructured":"Hideki Kawahara . 2006. STRAIGHT , exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds. Acoustical science and technology 27, 6 ( 2006 ), 349--353. Hideki Kawahara. 2006. STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds. Acoustical science and technology 27, 6 (2006), 349--353."},{"key":"e_1_3_2_1_38_1","volume-title":"International Conference on Machine Learning. 2649--2658","author":"Kim Hyunjik","year":"2018","unstructured":"Hyunjik Kim and Andriy Mnih . 2018 . Disentangling by Factorising . In International Conference on Machine Learning. 2649--2658 . Hyunjik Kim and Andriy Mnih. 2018. Disentangling by Factorising. In International Conference on Machine Learning. 2649--2658."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.5555\/3305381.3305573"},{"key":"e_1_3_2_1_40_1","unstructured":"Diederik P Kingma and Max Welling. 2013. Auto-encoding variational bayes. (2013).  Diederik P Kingma and Max Welling. 2013. Auto-encoding variational bayes. 
(2013)."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1915768117"},{"key":"e_1_3_2_1_42_1","volume-title":"User-centric Privacy: A Usable and Provider-independent Privacy Infrastructure.","author":"Kolter Jan Paul","year":"2010","unstructured":"Jan Paul Kolter . 2010 . User-centric Privacy: A Usable and Provider-independent Privacy Infrastructure. Vol. 41 . BoD--Books on Demand. Jan Paul Kolter. 2010. User-centric Privacy: A Usable and Provider-independent Privacy Infrastructure. Vol. 41. BoD--Books on Demand."},{"key":"e_1_3_2_1_43_1","unstructured":"Tejas D Kulkarni William F Whitney Pushmeet Kohli and Josh Tenenbaum. 2015. Deep convolutional inverse graphics network. In Advances in neural information processing systems. 2539--2547.  Tejas D Kulkarni William F Whitney Pushmeet Kohli and Josh Tenenbaum. 2015. Deep convolutional inverse graphics network. In Advances in neural information processing systems. 2539--2547."},{"key":"e_1_3_2_1_44_1","unstructured":"Guillaume Lample Neil Zeghidour Nicolas Usunier Antoine Bordes Ludovic Denoyer and Marc-Aurelio Ranzato. 2017. Fader networks: Manipulating images by sliding attributes. In Advances in Neural Information Processing Systems. 5967--5976.  Guillaume Lample Neil Zeghidour Nicolas Usunier Antoine Bordes Ludovic Denoyer and Marc-Aurelio Ranzato. 2017. Fader networks: Manipulating images by sliding attributes. In Advances in Neural Information Processing Systems. 5967--5976."},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2020.2973798"},{"key":"e_1_3_2_1_46_1","volume-title":"Recent Advances, and Future Trends. arXiv preprint arXiv:2001.00378","author":"Latif Siddique","year":"2020","unstructured":"Siddique Latif , Rajib Rana , Sara Khalifa , Raja Jurdak , Junaid Qadir , and Bj\u00f6rn W Schuller . 2020. Deep Representation Learning in Speech Processing: Challenges , Recent Advances, and Future Trends. arXiv preprint arXiv:2001.00378 ( 2020 ). 
Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Junaid Qadir, and Bj\u00f6rn W Schuller. 2020. Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends. arXiv preprint arXiv:2001.00378 (2020)."},{"key":"e_1_3_2_1_47_1","volume-title":"High-Fidelity Synthesis with Disentangled Representation. arXiv preprint arXiv:2001.04296","author":"Lee Wonkwang","year":"2020","unstructured":"Wonkwang Lee , Donggyun Kim , Seunghoon Hong , and Honglak Lee . 2020. High-Fidelity Synthesis with Disentangled Representation. arXiv preprint arXiv:2001.04296 ( 2020 ). Wonkwang Lee, Donggyun Kim, Seunghoon Hong, and Honglak Lee. 2020. High-Fidelity Synthesis with Disentangled Representation. arXiv preprint arXiv:2001.04296 (2020)."},{"key":"e_1_3_2_1_48_1","volume-title":"The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PloS one 13, 5","author":"Livingstone Steven R","year":"2018","unstructured":"Steven R Livingstone and Frank A Russo . 2018. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PloS one 13, 5 ( 2018 ). Steven R Livingstone and Frank A Russo. 2018. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PloS one 13, 5 (2018)."},{"key":"e_1_3_2_1_49_1","unstructured":"Francesco Locatello Gabriele Abbati Thomas Rainforth Stefan Bauer Bernhard Sch\u00f6lkopf and Olivier Bachem. 2019. On the fairness of disentangled representations. In Advances in Neural Information Processing Systems. 14584--14597.  Francesco Locatello Gabriele Abbati Thomas Rainforth Stefan Bauer Bernhard Sch\u00f6lkopf and Olivier Bachem. 2019. On the fairness of disentangled representations. In Advances in Neural Information Processing Systems. 
14584--14597."},{"key":"#cr-split#-e_1_3_2_1_50_1.1","doi-asserted-by":"crossref","unstructured":"Jaime Lorenzo-Trueba Thomas Drugman Javier Latorre Thomas Merritt Bartosz Putrycz Roberto Barra-Chicote Alexis Moinet and Vatsal Aggarwal. 2019. Towards Achieving Robust Universal Neural Vocoding. 181--185. https:\/\/doi.org\/10.21437\/Interspeech.2019-1424 10.21437\/Interspeech.2019-1424","DOI":"10.21437\/Interspeech.2019-1424"},{"key":"#cr-split#-e_1_3_2_1_50_1.2","doi-asserted-by":"crossref","unstructured":"Jaime Lorenzo-Trueba Thomas Drugman Javier Latorre Thomas Merritt Bartosz Putrycz Roberto Barra-Chicote Alexis Moinet and Vatsal Aggarwal. 2019. Towards Achieving Robust Universal Neural Vocoding. 181--185. https:\/\/doi.org\/10.21437\/Interspeech.2019-1424","DOI":"10.21437\/Interspeech.2019-1424"},{"key":"e_1_3_2_1_51_1","volume-title":"Learning Adversarially Fair and Transferable Representations. In International Conference on Machine Learning. 3384--3393","author":"Madras David","year":"2018","unstructured":"David Madras , Elliot Creager , Toniann Pitassi , and Richard Zemel . 2018 . Learning Adversarially Fair and Transferable Representations. In International Conference on Machine Learning. 3384--3393 . David Madras, Elliot Creager, Toniann Pitassi, and Richard Zemel. 2018. Learning Adversarially Fair and Transferable Representations. In International Conference on Machine Learning. 3384--3393."},{"key":"e_1_3_2_1_52_1","volume-title":"Adversarial autoencoders. arXiv preprint arXiv:1511.05644","author":"Makhzani Alireza","year":"2015","unstructured":"Alireza Makhzani , Jonathon Shlens , Navdeep Jaitly , Ian Goodfellow , and Brendan Frey . 2015. Adversarial autoencoders. arXiv preprint arXiv:1511.05644 ( 2015 ). Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, and Brendan Frey. 2015. Adversarial autoencoders. 
arXiv preprint arXiv:1511.05644 (2015)."},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302505.3310068"},{"key":"e_1_3_2_1_54_1","unstructured":"Charles Marx Richard Phillips Sorelle Friedler Carlos Scheidegger and Suresh Venkatasubramanian. 2019. Disentangling Influence: Using disentangled representations to audit model predictions. In Advances in Neural Information Processing Systems 32 H. Wallach H. Larochelle A. Beygelzimer F. d'Alch\u00e9-Buc E. Fox and R. Garnett (Eds.). Curran Associates Inc. 4496--4506.  Charles Marx Richard Phillips Sorelle Friedler Carlos Scheidegger and Suresh Venkatasubramanian. 2019. Disentangling Influence: Using disentangled representations to audit model predictions. In Advances in Neural Information Processing Systems 32 H. Wallach H. Larochelle A. Beygelzimer F. d'Alch\u00e9-Buc E. Fox and R. Garnett (Eds.). Curran Associates Inc. 4496--4506."},{"key":"e_1_3_2_1_55_1","volume-title":"Junbo Zhao, Aditya Ramesh, Pablo Sprechmann, and Yann LeCun.","author":"Mathieu Michael F","year":"2016","unstructured":"Michael F Mathieu , Junbo Jake Zhao , Junbo Zhao, Aditya Ramesh, Pablo Sprechmann, and Yann LeCun. 2016 . Disentangling factors of variation in deep representation using adversarial training. In Advances in neural information processing systems. 5040--5048. Michael F Mathieu, Junbo Jake Zhao, Junbo Zhao, Aditya Ramesh, Pablo Sprechmann, and Yann LeCun. 2016. Disentangling factors of variation in deep representation using adversarial training. In Advances in neural information processing systems. 5040--5048."},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP.2019.00029"},{"key":"e_1_3_2_1_57_1","volume-title":"WORLD: a vocoder-based high-quality speech synthesis system for real-time applications. IEICE TRANSACTIONS on Information and Systems","author":"Morise Masanori","year":"2016","unstructured":"Masanori Morise , Fumiya Yokomori , and Kenji Ozawa . 2016. 
WORLD: a vocoder-based high-quality speech synthesis system for real-time applications. IEICE TRANSACTIONS on Information and Systems ( 2016 ). Masanori Morise, Fumiya Yokomori, and Kenji Ozawa. 2016. WORLD: a vocoder-based high-quality speech synthesis system for real-time applications. IEICE TRANSACTIONS on Information and Systems (2016)."},{"key":"e_1_3_2_1_58_1","unstructured":"Daniel Moyer Shuyang Gao Rob Brekelmans Aram Galstyan and Greg Ver Steeg. 2018. Invariant representations without adversarial training. In Advances in Neural Information Processing Systems. 9084--9093.  Daniel Moyer Shuyang Gao Rob Brekelmans Aram Galstyan and Greg Ver Steeg. 2018. Invariant representations without adversarial training. In Advances in Neural Information Processing Systems. 9084--9093."},{"key":"e_1_3_2_1_59_1","volume-title":"18th Annual Conference of the International Speech Communication Association","author":"Nagrani Arsha","year":"2017","unstructured":"Arsha Nagrani , Joon Son Chung , and Andrew Zisserman. 2017 . VoxCeleb: A Large-Scale Speaker Identification Dataset. In Interspeech 2017 , 18th Annual Conference of the International Speech Communication Association , Stockholm, Sweden , August 20-24, 2017, Francisco Lacerda (Ed.). ISCA, 2616--2620. Arsha Nagrani, Joon Son Chung, and Andrew Zisserman. 2017. VoxCeleb: A Large-Scale Speaker Identification Dataset. In Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, Francisco Lacerda (Ed.). ISCA, 2616--2620."},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP.2019.00065"},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2019.06.001"},{"key":"e_1_3_2_1_62_1","volume-title":"Wavenet: A generative model for raw audio. 
arXiv preprint arXiv:1609.03499","author":"van den Oord Aaron","year":"2016","unstructured":"Aaron van den Oord , Sander Dieleman , Heiga Zen , Karen Simonyan , Oriol Vinyals , Alex Graves , Nal Kalchbrenner , Andrew Senior , and Koray Kavukcuoglu . 2016 . Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499 (2016). Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. 2016. Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499 (2016)."},{"key":"e_1_3_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178964"},{"key":"e_1_3_2_1_64_1","volume-title":"Unsupervised Speech Domain Adaptation Based on Disentangled Representation Learning for Robust Speech Recognition. arXiv preprint arXiv:1904.06086","author":"Park Jong-Hyeon","year":"2019","unstructured":"Jong-Hyeon Park , Myungwoo Oh , and Hyung-Min Park . 2019. Unsupervised Speech Domain Adaptation Based on Disentangled Representation Learning for Robust Speech Recognition. arXiv preprint arXiv:1904.06086 ( 2019 ). Jong-Hyeon Park, Myungwoo Oh, and Hyung-Min Park. 2019. Unsupervised Speech Domain Adaptation Based on Disentangled Representation Learning for Robust Speech Recognition. arXiv preprint arXiv:1904.06086 (2019)."},{"key":"e_1_3_2_1_65_1","unstructured":"Xingchao Peng Zijun Huang Ximeng Sun and Kate Saenko. 2019. Domain Agnostic Learning with Disentangled Representations. In ICML.  Xingchao Peng Zijun Huang Ximeng Sun and Kate Saenko. 2019. Domain Agnostic Learning with Disentangled Representations. 
In ICML."},{"key":"e_1_3_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.21437\/Odyssey.2020-28"},{"key":"e_1_3_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/3274783.3274855"},{"key":"e_1_3_2_1_68_1","volume-title":"Proceedings of the 31st International Conference on International Conference on Machine Learning -","volume":"32","author":"Reed Scott","year":"2014","unstructured":"Scott Reed , Kihyuk Sohn , Yuting Zhang , and Honglak Lee . 2014 . Learning to Disentangle Factors of Variation with Manifold Interaction . In Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32 (ICML'14). JMLR.org, II-1431-II-1439. Scott Reed, Kihyuk Sohn, Yuting Zhang, and Honglak Lee. 2014. Learning to Disentangle Factors of Variation with Manifold Interaction. In Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32 (ICML'14). JMLR.org, II-1431-II-1439."},{"key":"e_1_3_2_1_69_1","doi-asserted-by":"crossref","unstructured":"Mhd Hasan Sarhan Nassir Navab Abouzar Eslami and Shadi Albarqouni. 2020. Fairness by Learning Orthogonal Disentangled Representations. (2020).  Mhd Hasan Sarhan Nassir Navab Abouzar Eslami and Shadi Albarqouni. 2020. Fairness by Learning Orthogonal Disentangled Representations. (2020).","DOI":"10.1007\/978-3-030-58526-6_44"},{"key":"e_1_3_2_1_70_1","doi-asserted-by":"crossref","unstructured":"Steffen Schneider Alexei Baevski Ronan Collobert and Michael Auli. 2019. wav2vec: Unsupervised Pre-training for Speech Recognition. In INTERSPEECH.  Steffen Schneider Alexei Baevski Ronan Collobert and Michael Auli. 2019. wav2vec: Unsupervised Pre-training for Speech Recognition. In INTERSPEECH.","DOI":"10.21437\/Interspeech.2019-1873"},{"key":"e_1_3_2_1_71_1","unstructured":"Bj\u00f6rn W Schuller and Anton M Batliner. [n. d.]. EMOTION AFFECT AND PERSONALITY IN SPEECH AND LANGUAGE PROCESSING. ([n. d.]).  Bj\u00f6rn W Schuller and Anton M Batliner. [n. d.]. 
EMOTION AFFECT AND PERSONALITY IN SPEECH AND LANGUAGE PROCESSING. ([n. d.])."},{"key":"e_1_3_2_1_72_1","volume-title":"Membership Inference Attacks Against Machine Learning Models. In 2017 IEEE Symposium on Security and Privacy (SP). 3--18","author":"Shokri R.","unstructured":"R. Shokri , M. Stronati , C. Song , and V. Shmatikov . 2017 . Membership Inference Attacks Against Machine Learning Models. In 2017 IEEE Symposium on Security and Privacy (SP). 3--18 . R. Shokri, M. Stronati, C. Song, and V. Shmatikov. 2017. Membership Inference Attacks Against Machine Learning Models. In 2017 IEEE Symposium on Security and Privacy (SP). 3--18."},{"key":"e_1_3_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/2382196.2382261"},{"key":"e_1_3_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133956.3134077"},{"key":"e_1_3_2_1_75_1","volume-title":"Overlearning Reveals Sensitive Attributes. In International Conference on Learning Representations.","author":"Song Congzheng","year":"2020","unstructured":"Congzheng Song and Vitaly Shmatikov . 2020 . Overlearning Reveals Sensitive Attributes. In International Conference on Learning Representations. Congzheng Song and Vitaly Shmatikov. 2020. Overlearning Reveals Sensitive Attributes. In International Conference on Learning Representations."},{"key":"e_1_3_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-2845"},{"key":"e_1_3_2_1_77_1","volume-title":"Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6264--6268","author":"Sun G.","unstructured":"G. Sun , Y. Zhang , R. J. Weiss , Y. Cao , H. Zen , and Y. Wu . 2020 . Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6264--6268 . G. Sun, Y. Zhang, R. J. Weiss, Y. Cao, H. 
Zen, and Y. Wu. 2020. Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6264--6268."},{"key":"e_1_3_2_1_78_1","first-page":"376","article-title":"Domain adaptation for structured output via disentangled representations","volume":"16","author":"Tsai Yi-Hsuan","year":"2019","unstructured":"Yi-Hsuan Tsai , Samuel Schulter , Kihyuk Sohn , and Manmohan Chandraker . 2019 . Domain adaptation for structured output via disentangled representations . US Patent App. 16\/400 , 376 . Yi-Hsuan Tsai, Samuel Schulter, Kihyuk Sohn, and Manmohan Chandraker. 2019. Domain adaptation for structured output via disentangled representations. US Patent App. 16\/400,376.","journal-title":"US Patent App."},{"key":"e_1_3_2_1_79_1","unstructured":"Aaron van den Oord Oriol Vinyals et al. 2017. Neural discrete representation learning. In Advances in Neural Information Processing Systems. 6306--6315.  Aaron van den Oord Oriol Vinyals et al. 2017. Neural discrete representation learning. In Advances in Neural Information Processing Systems. 6306--6315."},{"key":"e_1_3_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1145\/3168389"},{"key":"e_1_3_2_1_81_1","volume-title":"International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications.","author":"Wakefield Gregory H","year":"1999","unstructured":"Gregory H Wakefield . 1999 . Chromagram visualization of the singing voice . In International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications. Gregory H Wakefield. 1999. Chromagram visualization of the singing voice. In International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications."},{"key":"e_1_3_2_1_82_1","volume-title":"End-to-end Anchored Speech Recognition. 
In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).","author":"Wang Y.","unstructured":"Y. Wang , X. Fan , I. Chen , Y. Liu , T. Chen , and B. Hoffmeister . 2019 . End-to-end Anchored Speech Recognition. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Y. Wang, X. Fan, I. Chen, Y. Liu, T. Chen, and B. Hoffmeister. 2019. End-to-end Anchored Speech Recognition. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)."},{"key":"e_1_3_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1145\/3274694.3274696"},{"key":"e_1_3_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8683120"},{"key":"e_1_3_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1109\/CSF.2018.00027"},{"key":"e_1_3_2_1_86_1","volume-title":"Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch\u00e9-Buc","author":"Zhang Xueru","unstructured":"Xueru Zhang , Mohammadmahdi Khaliligarekani , Cem Tekin , and Mingyan Liu. 2019. Group Retention when Using Machine Learning in Sequential Decision Making: the Interplay between User Dynamics and Fairness . In Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch\u00e9-Buc , E. Fox, and R. Garnett (Eds.). Curran Associates, Inc. , 15269--15278. Xueru Zhang, Mohammadmahdi Khaliligarekani, Cem Tekin, and Mingyan Liu. 2019. Group Retention when Using Machine Learning in Sequential Decision Making: the Interplay between User Dynamics and Fairness. In Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch\u00e9-Buc, E. Fox, and R. Garnett (Eds.). Curran Associates, Inc., 15269--15278."},{"key":"e_1_3_2_1_87_1","doi-asserted-by":"crossref","unstructured":"Y. Zhang S. Pan L. He and Z. Ling. 2019. 
Learning Latent Representations for Style Control and Transfer in End-to-end Speech Synthesis. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP). 6945--6949.  Y. Zhang S. Pan L. He and Z. Ling. 2019. Learning Latent Representations for Style Control and Transfer in End-to-end Speech Synthesis. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP). 6945--6949.","DOI":"10.1109\/ICASSP.2019.8683623"},{"key":"e_1_3_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33019251"}],"event":{"name":"CCS '20: 2020 ACM SIGSAC Conference on Computer and Communications Security","location":"Virtual Event USA","acronym":"CCS '20","sponsor":["SIGSAC ACM Special Interest Group on Security, Audit, and Control"]},"container-title":["Proceedings of the 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3411495.3421355","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3411495.3421355","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:31:41Z","timestamp":1750195901000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3411495.3421355"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,9]]},"references-count":90,"alternative-id":["10.1145\/3411495.3421355","10.1145\/3411495"],"URL":"https:\/\/doi.org\/10.1145\/3411495.3421355","relation":{},"subject":[],"published":{"date-parts":[[2020,11,9]]},"assertion":[{"value":"2020-11-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}