{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:04:08Z","timestamp":1760123048001,"version":"3.37.3"},"reference-count":51,"publisher":"Oxford University Press (OUP)","issue":"17","license":[{"start":{"date-parts":[[2021,3,3]],"date-time":"2021-03-03T00:00:00Z","timestamp":1614729600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"National Science Foundation Graduate Research Fellowship Program","award":["DGE 1106400"],"award-info":[{"award-number":["DGE 1106400"]}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01 GM134307-01"],"award-info":[{"award-number":["R01 GM134307-01"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,9,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>In pharmacogenomic studies, the biological context of cell lines influences the predictive ability of drug-response models and the discovery of biomarkers. Thus, similar cell lines are often studied together based on prior knowledge of biological annotations. However, this selection approach is not scalable with the number of annotations, and the relationship between gene\u2013drug association patterns and biological context may not be obvious.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We present a procedure to compare cell lines based on their gene\u2013drug association patterns. Starting with a grouping of cell lines from biological annotation, we model gene\u2013drug association patterns for each group as a bipartite graph between genes and drugs. This is accomplished by applying sparse canonical correlation analysis (SCCA) to extract the gene\u2013drug associations, and using the canonical vectors to construct the edge weights. Then, we introduce a nuclear norm-based dissimilarity measure to compare the bipartite graphs. Accompanying our procedure is a permutation test to evaluate the significance of similarity of cell line groups in terms of gene\u2013drug associations. In the pharmacogenomic datasets CTRP2, GDSC2 and CCLE, hierarchical clustering of carcinoma groups based on this dissimilarity measure uniquely reveals clustering patterns driven by carcinoma subtype rather than primary site. Next, we show that the top associated drugs or genes from SCCA can be used to characterize the clustering patterns of haematopoietic and lymphoid malignancies. Finally, we confirm by simulation that when drug responses are linearly dependent on expression, our approach is the only one that can effectively infer the true hierarchy compared to existing approaches.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>Bipartite graph-based hierarchical clustering is implemented in R and can be obtained from CRAN: https:\/\/CRAN.R-project.org\/package=hierBipartite. The source code is available at https:\/\/github.com\/CalvinTChi\/hierBipartite. The datasets were derived from sources in the public domain, which are the Cancer Cell Line Encyclopedia (https:\/\/portals.broadinstitute.org\/ccle), the Cancer Therapeutics Response Portal (https:\/\/portals.broadinstitute.org\/ctrp.v2.1\/?page=#ctd2BodyHome), and the Genomics of Drug Sensitivity in Cancer (https:\/\/www.cancerrxgene.org\/). These datasets can be downloaded using the PharmacoGx R package (https:\/\/bioconductor.org\/packages\/release\/bioc\/html\/PharmacoGx.html).<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab143","type":"journal-article","created":{"date-parts":[[2021,3,1]],"date-time":"2021-03-01T12:36:16Z","timestamp":1614602176000},"page":"2617-2626","source":"Crossref","is-referenced-by-count":14,"title":["Bipartite graph-based approach for clustering of cell lines by gene expression\u2013drug response associations"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4757-0559","authenticated-orcid":false,"given":"Calvin","family":"Chi","sequence":"first","affiliation":[{"name":"Center of Computational Biology, College of Engineering, University of California, Berkeley, CA 94720, USA"}]},{"given":"Yuting","family":"Ye","sequence":"additional","affiliation":[{"name":"Division of Biostatistics, University of California , Berkeley, CA 94720, USA"}]},{"given":"Bin","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Pediatrics and Human Development, Michigan State University , Grand Rapids, MI 48912, USA"},{"name":"Department of Pharmacology and Toxicology, Michigan State University , Grand Rapids, MI 48824, USA"}]},{"given":"Haiyan","family":"Huang","sequence":"additional","affiliation":[{"name":"Center of Computational Biology, College of Engineering, University of California, Berkeley, CA 94720, USA"},{"name":"Department of Statistics, University of California , Berkeley, CA 94720, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,3,3]]},"reference":[{"volume-title":"Abeloff\u2019s Clinical Oncology E-Book","year":"2008","author":"Abeloff","key":"2023051609203130000_btab143-B1"},{"key":"2023051609203130000_btab143-B2","doi-asserted-by":"crossref","first-page":"i413","DOI":"10.1093\/bioinformatics\/btw449","article-title":"Tandem: a two-stage approach to maximize interpretability of drug response models based on multiple molecular data types","volume":"32","author":"Aben","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051609203130000_btab143-B3","first-page":"1","article-title":"Machine learning approaches to drug response prediction: challenges and recent progress","volume":"4","author":"Adam","year":"2020","journal-title":"NPJ Precision Oncol"},{"key":"2023051609203130000_btab143-B4","doi-asserted-by":"crossref","first-page":"714","DOI":"10.4049\/jimmunol.1700884","article-title":"Evidence for the existence of a cxcl17 receptor distinct from gpr35","volume":"201","author":"Amir","year":"2018","journal-title":"J. Immunol"},{"year":"2013","author":"Andrew","first-page":"1247","key":"2023051609203130000_btab143-B5"},{"key":"2023051609203130000_btab143-B6","doi-asserted-by":"crossref","first-page":"e1004663","DOI":"10.1371\/journal.pgen.1004663","article-title":"Methylation QTLS are associated with coordinated changes in transcription factor binding, histone modifications, and gene expression levels","volume":"10","author":"Banovich","year":"2014","journal-title":"PLoS Genet"},{"key":"2023051609203130000_btab143-B7","doi-asserted-by":"crossref","first-page":"603","DOI":"10.1038\/nature11003","article-title":"The cancer cell line encyclopedia enables predictive modelling of anticancer drug sensitivity","volume":"483","author":"Barretina","year":"2012","journal-title":"Nature"},{"key":"2023051609203130000_btab143-B8","doi-asserted-by":"crossref","first-page":"R10","DOI":"10.1186\/gb-2011-12-1-r10","article-title":"DNA methylation patterns associate with genetic and gene expression variation in hapmap cell lines","volume":"12","author":"Bell","year":"2011","journal-title":"Genome Biol"},{"key":"2023051609203130000_btab143-B9","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1002\/hon.916","article-title":"Burkitt lymphoma versus diffuse large b-cell lymphoma: a practical approach","volume":"28","author":"Bellan","year":"2010","journal-title":"Hematol. Oncol"},{"key":"2023051609203130000_btab143-B10","doi-asserted-by":"crossref","first-page":"e0133850","DOI":"10.1371\/journal.pone.0133850","article-title":"Context sensitive modeling of cancer drug sensitivity","volume":"10","author":"Chen","year":"2015","journal-title":"PLoS One"},{"key":"2023051609203130000_btab143-B11","doi-asserted-by":"crossref","first-page":"e441","DOI":"10.1038\/bcj.2016.50","article-title":"Acute myeloid leukemia: a comprehensive review and 2016 update","volume":"6","author":"De Kouchkovsky","year":"2016","journal-title":"Blood Cancer J"},{"year":"2001","author":"DeVita Junior","first-page":"1518","key":"2023051609203130000_btab143-B12"},{"year":"2001","author":"Fazel","first-page":"4734","key":"2023051609203130000_btab143-B13"},{"key":"2023051609203130000_btab143-B14","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1038\/s41586-019-1186-3","article-title":"Next-generation characterization of the cancer cell line encyclopedia","volume":"569","author":"Ghandi","year":"2019","journal-title":"Nature"},{"key":"2023051609203130000_btab143-B15","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1038\/nature12831","article-title":"Inconsistency in large pharmacogenomic studies","volume":"504","author":"Haibe-Kains","year":"2013","journal-title":"Nature"},{"key":"2023051609203130000_btab143-B16","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1093\/biomet\/28.3-4.321","article-title":"Relations between two sets of variates","volume":"28","author":"Harold","year":"1936","journal-title":"Biometrika"},{"key":"2023051609203130000_btab143-B17","doi-asserted-by":"crossref","first-page":"929","DOI":"10.1016\/j.cell.2014.06.049","article-title":"Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin","volume":"158","author":"Hoadley","year":"2014","journal-title":"Cell"},{"volume-title":"Hematology: Basic Principles and Practice","year":"2013","author":"Hoffman","key":"2023051609203130000_btab143-B18"},{"key":"2023051609203130000_btab143-B19","doi-asserted-by":"crossref","first-page":"740","DOI":"10.1016\/j.cell.2016.06.017","article-title":"A landscape of pharmacogenomic interactions in cancer","volume":"166","author":"Iorio","year":"2016","journal-title":"Cell"},{"key":"2023051609203130000_btab143-B20","doi-asserted-by":"crossref","first-page":"1041","DOI":"10.1073\/pnas.1213021110","article-title":"Adar1 promotes malignant progenitor reprogramming in chronic myeloid leukemia","volume":"110","author":"Jiang","year":"2013","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051609203130000_btab143-B21","doi-asserted-by":"crossref","first-page":"1619","DOI":"10.3324\/haematol.2011.049551","article-title":"Phase i and pharmacological study of cytarabine and tanespimycin in relapsed and refractory acute leukemia","volume":"96","author":"Kaufmann","year":"2011","journal-title":"Haematologica"},{"year":"2012","author":"Klami","article-title":"Bayesian exponential family projections for coupled data sources","key":"2023051609203130000_btab143-B22"},{"key":"2023051609203130000_btab143-B23","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1186\/1756-8722-2-28","article-title":"Riz1 is potential cml tumor suppressor that is down-regulated during disease progression","volume":"2","author":"Lakshmikuttyamma","year":"2009","journal-title":"J. Hematol. Oncol"},{"key":"2023051609203130000_btab143-B24","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1038\/ejhg.2013.69","article-title":"Kernel canonical correlation analysis for assessing gene\u2013gene interactions and application to ovarian cancer","volume":"22","author":"Larson","year":"2014","journal-title":"Eur. J. Hum. Genet"},{"key":"2023051609203130000_btab143-B25","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2202\/1544-6115.1638","article-title":"Sparse canonical covariance analysis for high-throughput data","volume":"10","author":"Lee","year":"2011","journal-title":"Stat. Appl. Genet. Mol. Biol"},{"key":"2023051609203130000_btab143-B26","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1016\/j.ajhg.2014.02.011","article-title":"GEMES, clusters of Dna methylation under genetic control, can inform genetic and epigenetic analysis of disease","volume":"94","author":"Liu","year":"2014","journal-title":"Am. J. Hum. Genet"},{"key":"2023051609203130000_btab143-B27","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1186\/s12920-019-0519-2","article-title":"A systematic analysis of genomics-based modeling approaches for prediction of drug response to cytotoxic chemotherapies","volume":"12","author":"Mannheimer","year":"2019","journal-title":"BMC Med. Genomics"},{"key":"2023051609203130000_btab143-B28","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1309\/AJCP3FEPX5BEEKGX","article-title":"Differentiating between Burkitt lymphoma and CD10+ diffuse large B-cell lymphoma: the role of commonly used flow cytometry cell markers and the application of a multiparameter scoring system","volume":"137","author":"McGowan","year":"2012","journal-title":"Am. J. Clin. Pathol"},{"key":"2023051609203130000_btab143-B29","doi-asserted-by":"crossref","first-page":"4245","DOI":"10.1073\/pnas.1208949110","article-title":"Pattern discovery and cancer gene identification in integrated cancer genomic data","volume":"110","author":"Mo","year":"2013","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051609203130000_btab143-B30","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1038\/nature10166","article-title":"Integrated genomic analyses of ovarian carcinoma","volume":"474","author":"Network","year":"2011","journal-title":"Nature"},{"key":"2023051609203130000_btab143-B31","doi-asserted-by":"crossref","first-page":"630","DOI":"10.3324\/haematol.2019.236745","article-title":"The clinical and biological characteristics of nup98-kdm5a in pediatric acute myeloid leukemia","volume":"106","author":"Noort","year":"2020","journal-title":"Haematologica"},{"key":"2023051609203130000_btab143-B32","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-019-50720-0","article-title":"Modeling cancer drug response through drug-specific informative genes","volume":"9","author":"Parca","year":"2019","journal-title":"Sci. Rep"},{"key":"2023051609203130000_btab143-B33","doi-asserted-by":"crossref","first-page":"1586","DOI":"10.1038\/sj.onc.1209959","article-title":"Riz1 repression is associated with insulin-like growth factor-1 signaling activation in chronic myeloid leukemia cell lines","volume":"26","author":"Pastural","year":"2007","journal-title":"Oncogene"},{"key":"2023051609203130000_btab143-B34","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1038\/nchembio.1986","article-title":"Correlating chemical sensitivity and basal gene expression reveals mechanism of action","volume":"12","author":"Rees","year":"2016","journal-title":"Nat. Chem. Biol"},{"key":"2023051609203130000_btab143-B35","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1038\/73432","article-title":"Systematic variation in gene expression patterns in human cancer cell lines","volume":"24","author":"Ross","year":"2000","journal-title":"Nat. Genet"},{"key":"2023051609203130000_btab143-B36","doi-asserted-by":"crossref","first-page":"940","DOI":"10.1046\/j.1365-2141.2002.03972.x","article-title":"Altered expression of retinoblastoma protein-interacting zinc finger gene, RIZ, in human leukaemia","volume":"119","author":"Sasaki","year":"2002","journal-title":"Br. J. Haematol"},{"key":"2023051609203130000_btab143-B37","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1111\/ejh.12491","article-title":"Phase 2 study of dovitinib in patients with relapsed or refractory multiple myeloma with or without t (4; 14) translocation","volume":"95","author":"Scheid","year":"2015","journal-title":"Eur. J. Haematol"},{"key":"2023051609203130000_btab143-B38","doi-asserted-by":"crossref","first-page":"1210","DOI":"10.1158\/2159-8290.CD-15-0235","article-title":"Harnessing connectivity in a large-scale small-molecule sensitivity dataset","volume":"5","author":"Seashore-Ludlow","year":"2015","journal-title":"Cancer Discov"},{"key":"2023051609203130000_btab143-B39","doi-asserted-by":"crossref","first-page":"2906","DOI":"10.1093\/bioinformatics\/btp543","article-title":"Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis","volume":"25","author":"Shen","year":"2009","journal-title":"Bioinformatics"},{"key":"2023051609203130000_btab143-B40","doi-asserted-by":"crossref","first-page":"813","DOI":"10.1038\/nrc1951","article-title":"The nci60 human tumour cell line anticancer drug screen","volume":"6","author":"Shoemaker","year":"2006","journal-title":"Nat. Rev. Cancer"},{"year":"2019","author":"Solari","article-title":"Sparse canonical correlation analysis via concave minimization","key":"2023051609203130000_btab143-B41"},{"key":"2023051609203130000_btab143-B42","doi-asserted-by":"crossref","first-page":"1540","DOI":"10.1093\/bioinformatics\/btl117","article-title":"Pvclust: an r package for assessing the uncertainty in hierarchical clustering","volume":"22","author":"Suzuki","year":"2006","journal-title":"Bioinformatics"},{"key":"2023051609203130000_btab143-B43","doi-asserted-by":"crossref","first-page":"4886","DOI":"10.1093\/bioinformatics\/btz381","article-title":"A bayesian two-way latent structure model for genomic data integration reveals few pan-genomic cluster subtypes in a breast cancer cohort","volume":"35","author":"Swanson","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051609203130000_btab143-B44","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B (Methodological)"},{"key":"2023051609203130000_btab143-B45","doi-asserted-by":"crossref","first-page":"551","DOI":"10.1007\/s00180-011-0276-y","article-title":"Generalized canonical correlation analysis with missing values","volume":"27","author":"Van de Velden","year":"2012","journal-title":"Comput. Stat"},{"key":"2023051609203130000_btab143-B46","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1214\/14-AOAS792","article-title":"Inferring gene\u2013gene interactions and functional modules using sparse canonical correlation analysis","volume":"9","author":"Wang","year":"2015","journal-title":"Ann. Appl. Stat"},{"key":"2023051609203130000_btab143-B47","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1080\/01621459.1963.10500845","article-title":"Hierarchical grouping to optimize an objective function","volume":"58","author":"Ward","year":"1963","journal-title":"J. Am. Stat. Assoc"},{"key":"2023051609203130000_btab143-B48","doi-asserted-by":"crossref","first-page":"2866","DOI":"10.1016\/j.celrep.2019.08.012","article-title":"AML subtype is a major determinant of the association between prognostic gene expression signatures and their clinical significance","volume":"28","author":"Wiggers","year":"2019","journal-title":"Cell Rep"},{"key":"2023051609203130000_btab143-B49","doi-asserted-by":"crossref","first-page":"515","DOI":"10.1093\/biostatistics\/kxp008","article-title":"A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis","volume":"10","author":"Witten","year":"2009","journal-title":"Biostatistics"},{"key":"2023051609203130000_btab143-B50","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1093\/jamia\/ocx062","article-title":"Tissue specificity of in vitro drug sensitivity","volume":"25","author":"Yao","year":"2018","journal-title":"J. Am. Med. Inf. Assoc"},{"key":"2023051609203130000_btab143-B51","doi-asserted-by":"crossref","first-page":"e1004498","DOI":"10.1371\/journal.pcbi.1004498","article-title":"Predicting anticancer drug responses using a dual-layer integrated cell line-drug network model","volume":"11","author":"Zhang","year":"2015","journal-title":"PLoS Comput. Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab143\/38924164\/btab143.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/17\/2617\/50338976\/btab143.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/17\/2617\/50338976\/btab143.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,25]],"date-time":"2024-08-25T02:00:42Z","timestamp":1724551242000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/17\/2617\/6158036"}},"subtitle":[],"editor":[{"given":"Anthony","family":"Mathelier","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,3,3]]},"references-count":51,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2021,9,9]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab143","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2021,9,1]]},"published":{"date-parts":[[2021,3,3]]}}}