{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,16]],"date-time":"2025-12-16T12:51:30Z","timestamp":1765889490887,"version":"3.41.2"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2025,5,27]],"date-time":"2025-05-27T00:00:00Z","timestamp":1748304000000},"content-version":"vor","delay-in-days":26,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"National Institute Of General Medical Sciences of the National Institutes of Health","award":["R01GM134307"],"award-info":[{"award-number":["R01GM134307"]}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Pharmacogenomics studies are attracting an increasing amount of interest from researchers in precision medicine. The advances in high-throughput experiments and multiplexed approaches allow the large-scale quantification of drug sensitivities in molecularly characterized cancer cell lines (CCLs), resulting in a number of open drug sensitivity datasets for drug biomarker discovery. However, a significant inconsistency in drug sensitivity values among these datasets has been noted. Such inconsistency indicates the presence of substantial noise, subsequently hindering downstream analyses. To address the noise in drug sensitivity data, we introduce a robust and scalable deep learning framework, Residual Thresholded Deep Matrix Factorization (RT-DMF). This method takes a single drug sensitivity data matrix as its sole input and outputs a corrected and imputed matrix. 
Deep matrix factorization (DMF) excels at uncovering subtle patterns, due to its minimal reliance on data structure assumptions. This attribute significantly boosts DMF\u2019s ability to identify complex hidden patterns among nuisance effects in the data, thereby facilitating the detection of signals that are therapeutically relevant. Furthermore, RT-DMF incorporates an iterative residual thresholding procedure, which plays a crucial role in retaining signals more likely to hold therapeutic importance. Validation using simulated datasets and real pharmacogenomics datasets demonstrates the effectiveness of our approach in correcting noise and imputing missing data in drug sensitivity datasets (open-source package available at https:\/\/github.com\/tomwhoooo\/rtdmf).<\/jats:p>","DOI":"10.1093\/bib\/bbaf226","type":"journal-article","created":{"date-parts":[[2025,5,27]],"date-time":"2025-05-27T04:36:56Z","timestamp":1748320616000},"source":"Crossref","is-referenced-by-count":1,"title":["Large-scale information retrieval and correction of noisy pharmacogenomic datasets through residual thresholded deep matrix factorization"],"prefix":"10.1093","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6056-7629","authenticated-orcid":false,"given":"Zhiyue","family":"Tom Hu","sequence":"first","affiliation":[{"name":"Division of Biostatistics , University of California Berkeley, Berkeley, CA 94720,","place":["United States"]}]},{"given":"Yaodong","family":"Yu","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Science , University of California Berkeley, Berkeley, CA 94720,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9797-1046","authenticated-orcid":false,"given":"Ruoqiao","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Pharmacology and Toxicology , Michigan State University, MI 48824,","place":["United 
States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7006-6396","authenticated-orcid":false,"given":"Shan-Ju","family":"Yeh","sequence":"additional","affiliation":[{"name":"School of Medicine , National Tsing Hua University, Hsinchu 300044, Taiwan R.O.C"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8858-874X","authenticated-orcid":false,"given":"Bin","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Pharmacology and Toxicology , Michigan State University, MI 48824,","place":["United States"]},{"name":"Department of Pediatrics and Human Development , Michigan State University, MI 48824,","place":["United States"]}]},{"given":"Haiyan","family":"Huang","sequence":"additional","affiliation":[{"name":"Department of Statistics , University of California Berkeley, Berkeley, CA 94720,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2025,5,27]]},"reference":[{"key":"2025052702074403300_ref1","doi-asserted-by":"publisher","first-page":"793","DOI":"10.1056\/NEJMp1500523","article-title":"A new initiative on precision medicine","volume":"372","author":"Collins","year":"2015","journal-title":"N Engl J Med"},{"key":"2025052702074403300_ref2","doi-asserted-by":"publisher","first-page":"1901","DOI":"10.1056\/NEJMp1600894","article-title":"Aiming high\u2014changing the trajectory for cancer","volume":"374","author":"Lowy","year":"2016","journal-title":"N Engl J Med"},{"key":"2025052702074403300_ref3","doi-asserted-by":"publisher","first-page":"285","DOI":"10.1002\/cpt.318","article-title":"Leveraging big data to transform target selection and drug discovery","volume":"99","author":"Chen","year":"2016","journal-title":"Clin Pharmacol Ther"},{"key":"2025052702074403300_ref4","doi-asserted-by":"publisher","first-page":"4512","DOI":"10.1200\/JCO.2012.47.3116","article-title":"Validation of the 12-gene colon cancer recurrence score in NSABP C-07 as a predictor of recurrence in patients with stage II and III colon cancer treated 
with fluorouracil and leucovorin (FU\/LV) and FU\/LV plus oxaliplatin","volume":"31","author":"Yothers","year":"2013","journal-title":"J Clin Oncol"},{"key":"2025052702074403300_ref5","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1038\/nrclinonc.2014.202","article-title":"Pragmatic issues in biomarker evaluation for targeted therapies in cancer","volume":"12","author":"de Gramont","year":"2015","journal-title":"Nat Rev Clin Oncol"},{"key":"2025052702074403300_ref6","doi-asserted-by":"publisher","first-page":"570","DOI":"10.1038\/nature11005","article-title":"Systematic identification of genomic markers of drug sensitivity in cancer cells","volume":"483","author":"Garnett","year":"2012","journal-title":"Nature"},{"key":"2025052702074403300_ref7","doi-asserted-by":"publisher","first-page":"603","DOI":"10.1038\/nature11003","article-title":"The cancer cell line encyclopedia enables predictive modelling of anticancer drug sensitivity","volume":"483","author":"Barretina","year":"2012","journal-title":"Nature"},{"key":"2025052702074403300_ref8","doi-asserted-by":"publisher","first-page":"1151","DOI":"10.1016\/j.cell.2013.08.003","article-title":"An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules","volume":"154","author":"Basu","year":"2013","journal-title":"Cell"},{"key":"2025052702074403300_ref9","doi-asserted-by":"publisher","first-page":"740","DOI":"10.1016\/j.cell.2016.06.017","article-title":"A landscape of pharmacogenomic interactions in cancer","volume":"166","author":"Iorio","year":"2016","journal-title":"Cell"},{"key":"2025052702074403300_ref10","doi-asserted-by":"publisher","first-page":"ra84","DOI":"10.1126\/scisignal.2004379","article-title":"Profiles of basal and stimulated receptor signaling networks predict drug response in breast cancer lines","volume":"6","author":"Niepel","year":"2013","journal-title":"Sci 
Signal"},{"key":"2025052702074403300_ref11","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1038\/nchembio.1986","article-title":"Correlating chemical sensitivity and basal gene expression reveals mechanism of action","volume":"12","author":"Rees","year":"2016","journal-title":"Nat Chem Biol"},{"key":"2025052702074403300_ref12","doi-asserted-by":"publisher","first-page":"1210","DOI":"10.1158\/2159-8290.CD-15-0235","article-title":"Harnessing connectivity in a large-scale small-molecule sensitivity dataset","volume":"5","author":"Seashore-Ludlow","year":"2015","journal-title":"Cancer Discov"},{"key":"2025052702074403300_ref13","doi-asserted-by":"publisher","first-page":"389","DOI":"10.1038\/nature12831","article-title":"Inconsistency in large pharmacogenomic studies","volume":"504","author":"Haibe-Kains","year":"2013","journal-title":"Nature"},{"key":"2025052702074403300_ref14","doi-asserted-by":"crossref","first-page":"2333","DOI":"10.12688\/f1000research.9611.1","article-title":"Revisiting inconsistency in large pharmacogenomic studies","volume":"5","author":"Safikhani","year":"2016","journal-title":"F1000Res"},{"key":"2025052702074403300_ref15","doi-asserted-by":"crossref","first-page":"E5 EP","DOI":"10.1038\/nature20171","article-title":"Consistency in drug response profiling","volume":"540","author":"Mpindi","year":"2016","journal-title":"Nature"},{"key":"2025052702074403300_ref16","doi-asserted-by":"crossref","first-page":"E9 EP","DOI":"10.1038\/nature20580","article-title":"Drug response consistency in CCLE and CGP","volume":"540","author":"Bouhaddou","year":"2016","journal-title":"Nature"},{"key":"2025052702074403300_ref17","doi-asserted-by":"publisher","first-page":"521","DOI":"10.1038\/nmeth.3853","article-title":"Growth rate inhibition metrics correct for confounders in measuring sensitivity to cancer drugs","volume":"13","author":"Hafner","year":"2016","journal-title":"Nat 
Methods"},{"key":"2025052702074403300_ref18","first-page":"248","article-title":"AICM: a genuine framework for correcting inconsistency between large pharmacogenomics","author":"Hu","year":"2019","journal-title":"Datasets"},{"key":"2025052702074403300_ref19","doi-asserted-by":"publisher","first-page":"1740","DOI":"10.1038\/s41467-021-21997-5","article-title":"Deep generative neural network for accurate drug response imputation","volume":"12","author":"Jia","year":"2021","journal-title":"Nat Commun"},{"key":"2025052702074403300_ref20","doi-asserted-by":"publisher","first-page":"1384","DOI":"10.1109\/JBHI.2021.3102186","article-title":"Predicting drug response based on multi-omics fusion and graph convolution","volume":"26","author":"Peng","year":"2022","journal-title":"IEEE J Biomed Health Inform"},{"key":"2025052702074403300_ref21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pcbi.1012012","article-title":"DBDNMF: a dual branch deep neural matrix factorization method for drug response prediction","volume":"20","author":"Liu","year":"2024","journal-title":"PLoS Comput Biol"},{"key":"2025052702074403300_ref22","article-title":"Implicit regularization in deep learning","author":"Neyshabur","year":"2017","journal-title":"CoRR"},{"volume-title":"Implicit Regularization in Deep Matrix Factorization","year":"2019","author":"Arora","key":"2025052702074403300_ref23"},{"key":"2025052702074403300_ref24","doi-asserted-by":"publisher","first-page":"717","DOI":"10.1007\/s10208-009-9045-5","article-title":"Exact matrix completion via convex optimization","volume":"9","author":"Cand\u00e8s","year":"2009","journal-title":"Found Comput Math"},{"key":"2025052702074403300_ref25","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/1970392.1970395","article-title":"Robust principal component analysis?","volume":"58","author":"Cand\u00e8s","year":"2011","journal-title":"J 
ACM"},{"key":"2025052702074403300_ref26","doi-asserted-by":"publisher","first-page":"578","DOI":"10.1080\/01621459.1972.10481251","article-title":"Significance testing of the spearman rank correlation coefficient","volume":"67","author":"Zar","year":"1972","journal-title":"J Am Stat Assoc"},{"key":"2025052702074403300_ref27","doi-asserted-by":"publisher","first-page":"1927","DOI":"10.1158\/1538-7445.AM2022-1927","article-title":"TransCell: in silico characterization of genomic landscape and cellular responses from gene expressions through a two-step transfer learning","volume":"82","author":"Yeh","year":"2022","journal-title":"Cancer Res"},{"key":"2025052702074403300_ref28","doi-asserted-by":"publisher","first-page":"qzad008","DOI":"10.1093\/gpbjnl\/qzad008","article-title":"TransCell: in silico characterization of genomic landscape and cellular responses by deep transfer learning","volume":"22","author":"Yeh","year":"2024","journal-title":"Genomics Proteomics Bioinformatics"},{"key":"2025052702074403300_ref29","article-title":"Identification of pathways associated with chemosensitivity through network embedding","volume":"15","author":"Wang","journal-title":"PLoS Comput Biol"},{"key":"2025052702074403300_ref30","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1109\/TETC.2014.2330519","article-title":"A survey of clustering algorithms for big data: taxonomy and empirical analysis","volume":"2","author":"Fahad","year":"2014","journal-title":"IEEE Trans Emerg Top Comput"},{"key":"2025052702074403300_ref31","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"2025052702074403300_ref32","first-page":"1","article-title":"CVXPY: a python-embedded modeling language for convex optimization","volume":"17","author":"Diamond","year":"2016","journal-title":"J Mach Learn 
Res"},{"key":"2025052702074403300_ref33","doi-asserted-by":"publisher","first-page":"3920","DOI":"10.1073\/pnas.1901326117","article-title":"Veridical data science","volume":"117","author":"Yu","year":"2020","journal-title":"Proc Natl Acad Sci"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/3\/bbaf226\/63362772\/bbaf226.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/3\/bbaf226\/63362772\/bbaf226.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,27]],"date-time":"2025-05-27T06:07:50Z","timestamp":1748326070000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaf226\/8150964"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,1]]},"references-count":33,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaf226","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"type":"print","value":"1467-5463"},{"type":"electronic","value":"1477-4054"}],"subject":[],"published-other":{"date-parts":[[2025,5]]},"published":{"date-parts":[[2025,5,1]]},"article-number":"bbaf226"}}