{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T08:57:53Z","timestamp":1769763473014,"version":"3.49.0"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"14","license":[{"start":{"date-parts":[[2019,7,8]],"date-time":"2019-07-08T00:00:00Z","timestamp":1562544000000},"content-version":"vor","delay-in-days":7,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"publisher","award":["292334"],"award-info":[{"award-number":["292334"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"publisher","award":["294238"],"award-info":[{"award-number":["294238"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"publisher","award":["303815"],"award-info":[{"award-number":["303815"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"publisher","award":["303816"],"award-info":[{"award-number":["303816"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"publisher","award":["313124"],"award-info":[{"award-number":["313124"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,7,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Human genomic datasets often contain sensitive information that limits use and sharing of the data. In particular, simple anonymization strategies fail to provide sufficient level of protection for genomic data, because the data are inherently identifiable. Differentially private machine learning can help by guaranteeing that the published results do not leak too much information about any individual data point. Recent research has reached promising results on differentially private drug sensitivity prediction using gene expression data. Differentially private learning with genomic data is challenging because it is more difficult to guarantee privacy in high dimensions. Dimensionality reduction can help, but if the dimension reduction mapping is learned from the data, then it needs to be differentially private too, which can carry a significant privacy cost. Furthermore, the selection of any hyperparameters (such as the target dimensionality) needs to also avoid leaking private information.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We study an approach that uses a large public dataset of similar type to learn a compact representation for differentially private learning. We compare three representation learning methods: variational autoencoders, principal component analysis and random projection. We solve two machine learning tasks on gene expression of cancer cell lines: cancer type classification, and drug sensitivity prediction. The experiments demonstrate significant benefit from all representation learning methods with variational autoencoders providing the most accurate predictions most often. Our results significantly improve over previous state-of-the-art in accuracy of differentially private drug sensitivity prediction.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Code used in the experiments is available at https:\/\/github.com\/DPBayes\/dp-representation-transfer.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz373","type":"journal-article","created":{"date-parts":[[2019,5,9]],"date-time":"2019-05-09T11:20:03Z","timestamp":1557400803000},"page":"i218-i224","source":"Crossref","is-referenced-by-count":9,"title":["Representation transfer for differentially private drug sensitivity prediction"],"prefix":"10.1093","volume":"35","author":[{"given":"Teppo","family":"Niinim\u00e4ki","sequence":"first","affiliation":[{"name":"Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland"}]},{"given":"Mikko A","family":"Heikkil\u00e4","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland"}]},{"given":"Antti","family":"Honkela","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland"},{"name":"Department of Public Health, University of Helsinki, Helsinki, Finland"},{"name":"Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland"}]},{"given":"Samuel","family":"Kaski","sequence":"additional","affiliation":[{"name":"Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland"}]}],"member":"286","published-online":{"date-parts":[[2019,7,5]]},"reference":[{"key":"2023062712332605600_btz373-B1","first-page":"308","author":"Abadi","year":"2016"},{"key":"2023062712332605600_btz373-B2","doi-asserted-by":"crossref","first-page":"1109","DOI":"10.1109\/TKDE.2018.2855136","article-title":"Differentially private mixture of generative neural networks","volume":"31","author":"Acs","year":"2019","journal-title":"IEEE Trans. Knowl. Data Eng"},{"key":"2023062712332605600_btz373-B3","first-page":"245","author":"Bingham","year":"2001"},{"key":"2023062712332605600_btz373-B4","first-page":"289","article-title":"Privacy-preserving logistic regression","author":"Chaudhuri","year":"2009","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023062712332605600_btz373-B5","first-page":"989","article-title":"Near-optimal differentially private principal components","volume":"25","author":"Chaudhuri","year":"2012","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023062712332605600_btz373-B6","doi-asserted-by":"crossref","first-page":"1202","DOI":"10.1038\/nbt.2877","article-title":"A community effort to assess and improve drug sensitivity prediction algorithms","volume":"32","author":"Costello","year":"2014","journal-title":"Nat. Biotechnol"},{"key":"2023062712332605600_btz373-B7","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1561\/0400000042","article-title":"The algorithmic foundations of differential privacy","volume":"9","author":"Dwork","year":"2013","journal-title":"Found. Trends Theor. Comput. Sci"},{"key":"2023062712332605600_btz373-B8","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1007\/11681878_14","volume-title":"Theory of Cryptography (TCC 2006)","author":"Dwork","year":"2006"},{"key":"2023062712332605600_btz373-B9","first-page":"11","volume-title":"Proceedings of the 46th Annual ACM Symposium on Theory of Computing (STOC 2014)","author":"Dwork","year":"2014"},{"key":"2023062712332605600_btz373-B10","first-page":"192","volume-title":"Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI 2016)","author":"Foulds","year":"2016"},{"key":"2023062712332605600_btz373-B11","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1145\/1401890.1401926","volume-title":"Proceedings 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2008)","author":"Ganta","year":"2008"},{"key":"2023062712332605600_btz373-B12","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1126\/science.1229566","article-title":"Identifying personal genomes by surname inference","volume":"339","author":"Gymrek","year":"2013","journal-title":"Science"},{"key":"2023062712332605600_btz373-B13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pgen.1000167","article-title":"Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays","volume":"4","author":"Homer","year":"2008","journal-title":"PLoS Genet"},{"key":"2023062712332605600_btz373-B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13062-017-0203-4","article-title":"Efficient differentially private learning improves drug sensitivity prediction","volume":"13","author":"Honkela","year":"2018","journal-title":"Biol. Direct"},{"key":"2023062712332605600_btz373-B15","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1007\/s10994-013-5396-x","article-title":"Differential privacy based on importance weighting","volume":"93","author":"Ji","year":"2013","journal-title":"Mach. Learn"},{"key":"2023062712332605600_btz373-B16","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1090\/conm\/026\/737400","volume-title":"Conference in Modern Analysis and Probability (New Haven, Conn., 1982), Volume","author":"Johnson","year":"1984"},{"key":"2023062712332605600_btz373-B17","first-page":"488","volume-title":"Proceedings of the 33rd International Conference on Machine Learning (ICML 2016)","author":"Kasiviswanathan","year":"2016"},{"key":"2023062712332605600_btz373-B18","first-page":"25.1","volume-title":"Proceedings of the 25th Annual Conference on Learning Theory (COLT 2012)","author":"Kifer","year":"2012"},{"key":"2023062712332605600_btz373-B19","volume-title":"Proceedings of the 3rd International Conference for Learning Representations (ICLR 2015)","author":"Kingma","year":"2015"},{"key":"2023062712332605600_btz373-B20","author":"Kingma","year":"2014"},{"key":"2023062712332605600_btz373-B21","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1109\/ICDE.2007.367856","volume-title":"IEEE 23rd International Conference on Data Engineering (ICDE 2007)","author":"Li","year":"2007"},{"key":"2023062712332605600_btz373-B22","doi-asserted-by":"crossref","DOI":"10.1145\/1217299.1217302","article-title":"l-diversity: privacy beyond k-anonymity","volume":"1","author":"Machanavajjhala","year":"2007","journal-title":"ACM Trans. Knowl. Discov. Data"},{"key":"2023062712332605600_btz373-B23","volume-title":"Proceedings of the 5th International Conference on Learning Representations (ICLR 2017)","author":"Papernot","year":"2017"},{"key":"2023062712332605600_btz373-B24","author":"Paszke","year":"2017"},{"key":"2023062712332605600_btz373-B25","author":"Raina","year":"2007"},{"key":"2023062712332605600_btz373-B26","doi-asserted-by":"crossref","first-page":"10787","DOI":"10.1073\/pnas.191368598","article-title":"Chemosensitivity prediction by transcriptional profiling","volume":"98","author":"Staunton","year":"2001","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062712332605600_btz373-B27","doi-asserted-by":"crossref","first-page":"557","DOI":"10.1142\/S0218488502001648","article-title":"k-anonymity: a model for protecting privacy","volume":"10","author":"Sweeney","year":"2002","journal-title":"Int. J. Uncertainty Fuzziness Knowl. Based Syst"},{"key":"2023062712332605600_btz373-B28","year":"2016"},{"key":"2023062712332605600_btz373-B29","year":"2016"},{"key":"2023062712332605600_btz373-B30","doi-asserted-by":"crossref","first-page":"811","DOI":"10.1007\/978-3-030-10928-8_48","volume-title":"Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2018), Volume 11052 of Lecture Notes in Computer Science","author":"Wang","year":"2019"},{"key":"2023062712332605600_btz373-B31","author":"Xie","year":"2018"},{"key":"2023062712332605600_btz373-B32","doi-asserted-by":"crossref","first-page":"D955","DOI":"10.1093\/nar\/gks1111","article-title":"Genomics of drug sensitivity in cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells","volume":"41","author":"Yang","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023062712332605600_btz373-B33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3134428","article-title":"PrivBayes: private data release via Bayesian networks","volume":"42","author":"Zhang","year":"2017","journal-title":"ACM Trans. Database Syst"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/14\/i218\/50720352\/bioinformatics_35_14_i218.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/14\/i218\/50720352\/bioinformatics_35_14_i218.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T12:34:22Z","timestamp":1687869262000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/14\/i218\/5529143"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,7]]},"references-count":33,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2019,7,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz373","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,7]]},"published":{"date-parts":[[2019,7]]}}}