{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T00:29:45Z","timestamp":1773275385371,"version":"3.50.1"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2315,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Tumorigenesis is an evolutionary process by which tumor cells acquire sequences of mutations leading to increased growth, invasiveness and eventually metastasis. It is hoped that by identifying the common patterns of mutations underlying major cancer sub-types, we can better understand the molecular basis of tumor development and identify new diagnostics and therapeutic targets. This goal has motivated several attempts to apply evolutionary tree reconstruction methods to assays of tumor state. Inference of tumor evolution is in principle aided by the fact that tumors are heterogeneous, retaining remnant populations of different stages along their development along with contaminating healthy cell populations. In practice, though, this heterogeneity complicates interpretation of tumor data because distinct cell types are conflated by common methods for assaying the tumor state. 
We previously proposed a method to computationally infer cell populations from measures of tumor-wide gene expression through a geometric interpretation of mixture type separation, but this approach deals poorly with noisy and outlier data.<\/jats:p>\n               <jats:p>Results: In the present work, we propose a new method to perform tumor mixture separation efficiently and robustly to experimental error. The method builds on the prior geometric approach but uses a novel objective function allowing for robust fits that greatly reduces the sensitivity to noise and outliers. We further develop an efficient gradient optimization method to optimize this \u2018soft geometric unmixing\u2019 objective for measurements of tumor DNA copy numbers assessed by array comparative genomic hybridization (aCGH) data. We show, on a combination of semi-synthetic and real data, that the method yields fast and accurate separation of tumor states.<\/jats:p>\n               <jats:p>Conclusions: We have shown a novel objective function and optimization method for the robust separation of tumor sub-types from aCGH data and have shown that the method provides fast, accurate reconstruction of tumor states from mixed samples. 
Better solutions to this problem can be expected to improve our ability to accurately identify genetic abnormalities in primary tumor samples and to infer patterns of tumor evolution.<\/jats:p>\n               <jats:p>Contact: \u00a0tolliver@cs.cmu.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq213","type":"journal-article","created":{"date-parts":[[2010,6,7]],"date-time":"2010-06-07T07:28:13Z","timestamp":1275895693000},"page":"i106-i114","source":"Crossref","is-referenced-by-count":28,"title":["Robust unmixing of tumor states in array comparative genomic hybridization data"],"prefix":"10.1093","volume":"26","author":[{"given":"David","family":"Tolliver","sequence":"first","affiliation":[{"name":"1 Computer Science Department, Carnegie Mellon University, 2 Machine Learning Department, Carnegie Mellon University, 3 Department of Biological Sciences, Carnegie Mellon University, 4 Departments of Human Oncology and Human Genetics, Drexel University and 5 Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh PA 15213, USA"}]},{"given":"Charalampos","family":"Tsourakakis","sequence":"additional","affiliation":[{"name":"1 Computer Science Department, Carnegie Mellon University, 2 Machine Learning Department, Carnegie Mellon University, 3 Department of Biological Sciences, Carnegie Mellon University, 4 Departments of Human Oncology and Human Genetics, Drexel University and 5 Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh PA 15213, USA"}]},{"given":"Ayshwarya","family":"Subramanian","sequence":"additional","affiliation":[{"name":"1 Computer Science Department, Carnegie Mellon University, 2 Machine Learning Department, Carnegie Mellon University, 3 Department of Biological Sciences, Carnegie Mellon University, 4 Departments of Human Oncology and Human Genetics, Drexel University and 5 Lane 
Center for Computational Biology, Carnegie Mellon University, Pittsburgh PA 15213, USA"}]},{"given":"Stanley","family":"Shackney","sequence":"additional","affiliation":[{"name":"1 Computer Science Department, Carnegie Mellon University, 2 Machine Learning Department, Carnegie Mellon University, 3 Department of Biological Sciences, Carnegie Mellon University, 4 Departments of Human Oncology and Human Genetics, Drexel University and 5 Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh PA 15213, USA"}]},{"given":"Russell","family":"Schwartz","sequence":"additional","affiliation":[{"name":"1 Computer Science Department, Carnegie Mellon University, 2 Machine Learning Department, Carnegie Mellon University, 3 Department of Biological Sciences, Carnegie Mellon University, 4 Departments of Human Oncology and Human Genetics, Drexel University and 5 Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh PA 15213, USA"}]}],"member":"286","published-online":{"date-parts":[[2010,6,1]]},"reference":[{"key":"2023012508053492000_B1","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1038\/nrc900","article-title":"From the analyst's couch: selective anticancer drugs","volume":"2","author":"Atkins","year":"2002","journal-title":"Nat. Rev. 
Cancer"},{"key":"2023012508053492000_B2","doi-asserted-by":"crossref","first-page":"2106","DOI":"10.1093\/bioinformatics\/bti274","article-title":"Mtreemix: a software package for learning and using mixture models of mutagenetic trees","volume":"21","author":"Beerenwinkel","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012508053492000_B3","doi-asserted-by":"crossref","first-page":"735","DOI":"10.1038\/nrc1976","article-title":"Opinion: linking oncogenic pathways with therapeutic opportunities","volume":"6","author":"Bild","year":"2006","journal-title":"Nat. Rev. Cancer"},{"key":"2023012508053492000_B4","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511804441","volume-title":"Convex Optimization.","author":"Boyd","year":"2004"},{"key":"2023012508053492000_B5","doi-asserted-by":"crossref","first-page":"4418","DOI":"10.1109\/TSP.2009.2025802","article-title":"A convex analysis based minimum-volume enclosing simplex algorithm for hyperspectral unmixing","volume":"57","author":"Chan","year":"2009","journal-title":"IEEE Trans. Signal Proc."},{"key":"2023012508053492000_B6","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1016\/0165-1684(94)90029-9","article-title":"Independent component analysis","volume":"36","author":"Comon","year":"1994","journal-title":"Signal Proc."},{"key":"2023012508053492000_B7","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1089\/cmb.1999.6.37","article-title":"Inferring tree models for oncogenesis from comparative genome hybridization data","volume":"6","author":"Desper","year":"1999","journal-title":"J. Comp. 
Biol."},{"key":"2023012508053492000_B8","first-page":"33","article-title":"Sorting out geology \u2014 unmixing mixtures","volume-title":"Use and Abuse of Statistical Methods in the Earth Sciences","author":"Ehrlich","year":"1987"},{"key":"2023012508053492000_B9","doi-asserted-by":"crossref","first-page":"1040","DOI":"10.1158\/1055-9965.EPI-04-0584","article-title":"Analyzing patterns of staining in immunohistochemical studies: application to a study of prostate cancer recurrence","volume":"14","author":"Etzioni","year":"2005","journal-title":"Cancer Epidemiol Biomarkers Prev."},{"key":"2023012508053492000_B10","doi-asserted-by":"crossref","first-page":"2809","DOI":"10.1093\/bioinformatics\/btp505","article-title":"Quantifying cancer progression with conjunctive Bayesian networks","volume":"25","author":"Gerstung","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012508053492000_B11","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1126\/science.286.5439.531","article-title":"Molecular classification of cancer: class discovery and class prediction by gene expression monitoring","volume":"286","author":"Golub","year":"1999","journal-title":"Science"},{"key":"2023012508053492000_B12","volume-title":"Bayesian hidden Markov modeling of array CGH data. Paper 24.","author":"Guha","year":"2006"},{"key":"2023012508053492000_B13","doi-asserted-by":"crossref","DOI":"10.1002\/gcc.1129","article-title":"Multivariate analyses of genomic imbalances in solid tumors reveal distinct and converging pathways of karyotypic evolution","author":"H\u00f6glund","year":"2001","journal-title":"Genes Chromosomes Cancer."},{"key":"2023012508053492000_B14","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1038\/nrd2155","article-title":"Why is cancer drug discovery so difficult?","volume":"6","author":"Kamb","year":"2007","journal-title":"Nat. Rev. 
Drug Discov."},{"key":"2023012508053492000_B15","doi-asserted-by":"crossref","first-page":"434","DOI":"10.1186\/1471-2105-8-434","article-title":"A hidden Markov model to estimate population mixture and allelic copy-numbers in cancers using Affymetrix SNP arrays","volume":"8","author":"Lamy","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023012508053492000_B16","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"2023012508053492000_B17","doi-asserted-by":"crossref","first-page":"1971","DOI":"10.1093\/bioinformatics\/btl185","article-title":"Distance-based clustering of CGH data","volume":"22","author":"Liu","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012508053492000_B18","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1101\/gr.099622.109","article-title":"Inferring tumor progression from genomic heterogeneity","volume":"20","author":"Navin","year":"2010","journal-title":"Genome Res."},{"key":"2023012508053492000_B19","doi-asserted-by":"crossref","first-page":"557","DOI":"10.1093\/biostatistics\/kxh008","article-title":"Circular binary segmentation for the analysis of array-based DNA copy number data","volume":"5","author":"Olshen","year":"2004","journal-title":"Biostatistics"},{"key":"2023012508053492000_B20","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1007\/s00454-002-0745-8","article-title":"NP-hardness of largest contained and smallest containing simplices for v- and h-polytopes","volume":"28","author":"Packer","year":"2002","journal-title":"Discrete Comput. Geom."},{"key":"2023012508053492000_B21","doi-asserted-by":"crossref","first-page":"1409","DOI":"10.1056\/NEJMc0801440","article-title":"Her2 status and benefit from adjuvant trastuzumab in breast cancer","volume":"358","author":"Paik","year":"2008","journal-title":"N. Engl. J. 
Med."},{"key":"2023012508053492000_B22","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1080\/14786440109462720","article-title":"On lines and planes of closest fit to systems of points in space","volume":"2","author":"Pearson","year":"1901","journal-title":"Philos. Mag."},{"key":"2023012508053492000_B23","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1007\/978-1-4757-3147-7_4","article-title":"The molecular and cellular biology of her2\/neu gene amplification\/overexpression and the clinical development of herceptin (trastuzumab) therapy for breast cancer","volume":"103","author":"Pegram","year":"2000","journal-title":"Cancer Treat. Res."},{"key":"2023012508053492000_B24","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1142\/S021972000700259X","article-title":"Reconstructing tumor phylogenies from single-cell data","volume":"5","author":"Pennington","year":"2007","journal-title":"J. Bioinform. Comput. Biol."},{"key":"2023012508053492000_B25","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1038\/35021093","article-title":"Molecular portraits of human breast tumors","volume":"406","author":"Perou","year":"2000","journal-title":"Nature"},{"key":"2023012508053492000_B26","volume-title":"Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond.","author":"Sch\u00f6lkopf","year":"2002"},{"key":"2023012508053492000_B27","doi-asserted-by":"crossref","first-page":"1299","DOI":"10.1162\/089976698300017467","article-title":"Nonlinear component analysis as a kernel eigenvalue problem","volume":"10","author":"Sch\u00f6lkopf","year":"1998","journal-title":"Neural Comput."},{"key":"2023012508053492000_B28","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1186\/1471-2105-11-42","article-title":"Applying unmixing to gene expression data for tumor phylogeny inference","volume":"11","author":"Schwartz","year":"2010","journal-title":"BMC 
Bioinformatics"},{"key":"2023012508053492000_B29","doi-asserted-by":"crossref","first-page":"3042","DOI":"10.1158\/1078-0432.CCR-0401-3","article-title":"Intracellular patterns of Her-2\/neu, ras, and ploidy abnormalities in primary human breast cancers predict postoperative clinical disease-free survival","volume":"10","author":"Shackney","year":"2004","journal-title":"Clin. Cancer Res."},{"key":"2023012508053492000_B30","doi-asserted-by":"crossref","first-page":"10869","DOI":"10.1073\/pnas.191367098","article-title":"Gene expression profiles of breast carcinomas distinguish tumor subclasses with clinical implications","volume":"98","author":"Sorlie","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012508053492000_B31","doi-asserted-by":"crossref","first-page":"8418","DOI":"10.1073\/pnas.0932692100","article-title":"Repeated observation of breast tumor subtypes in independent gene expression data sets","volume":"100","author":"Sorlie","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012508053492000_B32","doi-asserted-by":"crossref","first-page":"10393","DOI":"10.1073\/pnas.1732912100","article-title":"Breast cancer classification and prognosis based on gene expression profiles from a population-based study","volume":"100","author":"Sotiriou","year":"2003","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012508053492000_B33","first-page":"500","article-title":"Algorithms for minimum volume enclosing simplex in R3","author":"Zhou","year":"2000","journal-title":"Proceedings of the Eleventh Annual ACM\/SIAM Symposium on Discrete Algorithms"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/12\/i106\/48858411\/bioinformatics_26_12_i106.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/12\/i106\/48858411\/bioinformatics_26_12_i106.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T08:09:24Z","timestamp":1674634164000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/12\/i106\/285904"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,6,1]]},"references-count":33,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2010,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq213","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,6,15]]},"published":{"date-parts":[[2010,6,1]]}}}