{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,5]],"date-time":"2025-02-05T03:40:04Z","timestamp":1738726804369,"version":"3.37.0"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2860,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,5,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: The heterogeneity of cancer cannot always be recognized by tumor morphology, but may be reflected by the underlying genetic aberrations. Array comparative genome hybridization (array-CGH) methods provide high-throughput data on genetic copy numbers, but determining the clinically relevant copy number changes remains a challenge. Conventional classification methods for linking recurrent alterations to clinical outcome ignore sequential correlations in selecting relevant features. Conversely, existing sequence classification methods can only model overall copy number instability, without regard to any particular position in the genome.<\/jats:p><jats:p>Results: Here, we present the heterogeneous hidden conditional random field, a new integrated array-CGH analysis method for jointly classifying tumors, inferring copy numbers and identifying clinically relevant positions in recurrent alteration regions. By capturing the sequentiality as well as the locality of changes, our integrated model provides better noise reduction, and achieves more relevant gene retrieval and more accurate classification than existing methods. We provide an efficient L1-regularized discriminative training algorithm, which notably selects a small set of candidate genes most likely to be clinically relevant and driving the recurrent amplicons of importance. Our method thus provides unbiased starting points in deciding which genomic regions and which genes in particular to pursue for further examination. Our experiments on synthetic data and real genomic cancer prediction data show that our method is superior, both in prediction accuracy and relevant feature discovery, to existing methods. We also demonstrate that it can be used to generate novel biological hypotheses for breast cancer.<\/jats:p><jats:p>Contact: \u00a0ogt@cs.princeton.edu<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn585","type":"journal-article","created":{"date-parts":[[2008,12,4]],"date-time":"2008-12-04T01:47:06Z","timestamp":1228355226000},"page":"1307-1313","source":"Crossref","is-referenced-by-count":2,"title":["Aneuploidy prediction and tumor classification with heterogeneous hidden conditional random fields"],"prefix":"10.1093","volume":"25","author":[{"given":"Zafer","family":"Barutcuoglu","sequence":"first","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Princeton, NJ 08544, USA and 3Institute of Community Medicine, Tromso University, Tromso, Norway"}]},{"given":"Edoardo M.","family":"Airoldi","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Princeton, NJ 08544, USA and 3Institute of Community Medicine, Tromso University, Tromso, Norway"},{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Princeton, NJ 08544, USA and 3Institute of Community Medicine, Tromso University, Tromso, Norway"}]},{"given":"Vanessa","family":"Dumeaux","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Princeton, NJ 08544, USA and 3Institute of Community Medicine, Tromso University, Tromso, Norway"}]},{"given":"Robert E.","family":"Schapire","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Princeton, NJ 08544, USA and 3Institute of Community Medicine, Tromso University, Tromso, Norway"}]},{"given":"Olga G.","family":"Troyanskaya","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Princeton, NJ 08544, USA and 3Institute of Community Medicine, Tromso University, Tromso, Norway"},{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Princeton, NJ 08544, USA and 3Institute of Community Medicine, Tromso University, Tromso, Norway"}]}],"member":"286","published-online":{"date-parts":[[2008,12,3]]},"reference":[{"key":"2023013110285452600_B1","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1038\/75985","article-title":"Quantitative mapping of amplicon structure by array CGH identifies CYP24 as a candidate oncogene","volume":"25","author":"Albertson","year":"2000","journal-title":"Nat. Genet."},{"key":"2023013110285452600_B2","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1016\/j.tig.2006.06.007","article-title":"Gene amplification in cancer","volume":"22","author":"Albertson","year":"2006","journal-title":"Trends Genet."},{"key":"2023013110285452600_B3","doi-asserted-by":"crossref","first-page":"792","DOI":"10.1038\/emboj.2008.13","article-title":"p73 poses a barrier to malignant transformation by limiting anchorage-independent growth","volume":"27","author":"Beitzinger","year":"2008","journal-title":"EMBO J."},{"key":"2023013110285452600_B4","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1016\/j.ygyno.2005.08.026","article-title":"Amplification of EMSY, a novel oncogene on 11q13, in high grade ovarian surface epithelial carcinomas","volume":"100","author":"Brown","year":"2006","journal-title":"Gynecol. Oncol."},{"key":"2023013110285452600_B5","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1016\/j.ccr.2006.10.009","article-title":"Genomic and transcriptional aberrations linked to breast cancer pathophysiologies","volume":"10","author":"Chin","year":"2006","journal-title":"Cancer Cell"},{"key":"2023013110285452600_B6","doi-asserted-by":"crossref","first-page":"4035","DOI":"10.1038\/sj.onc.1206610","article-title":"VDUP1 upregulated by TGF-beta1 and 1,25-dihydorxyvitamin D3 inhibits tumor cell growth by blocking cell-cycle progression","volume":"22","author":"Han","year":"2003","journal-title":"Oncogene"},{"key":"2023013110285452600_B7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/S0065-230X(08)60209-2","article-title":"Primary chromosome abnormalities in human neoplasia","volume":"52","author":"Heim","year":"1989","journal-title":"Adv. Cancer Res."},{"key":"2023013110285452600_B8","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1002\/gcc.20459","article-title":"Genomic and functional evidence for an ARID1A tumor suppressor role","volume":"46","author":"Huang","year":"2007","journal-title":"Genes Chromosomes Cancer"},{"key":"2023013110285452600_B9","doi-asserted-by":"crossref","first-page":"7612","DOI":"10.1158\/0008-5472.CAN-05-0570","article-title":"Distinct genomic profiles in hereditary breast tumors identified by array-based comparative genomic hybridization","volume":"65","author":"Jonsson","year":"2005","journal-title":"Cancer Res."},{"key":"2023013110285452600_B10","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1145\/1015330.1015364","article-title":"Gradient LASSO for feature selection","volume-title":"ICML '04: Proceedings of the 21st International Conference on Machine Learning.","author":"Kim","year":"2004"},{"key":"2023013110285452600_B11","doi-asserted-by":"crossref","first-page":"3763","DOI":"10.1093\/bioinformatics\/bti611","article-title":"Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data","volume":"21","author":"Lai","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013110285452600_B12","doi-asserted-by":"crossref","DOI":"10.2202\/1544-6115.1248","article-title":"Sparse logistic regression with Lp penalty for biomarker identification","volume":"6","author":"Liu","year":"2007","journal-title":"Stat. Appl. Genet. Mol. Biol."},{"key":"2023013110285452600_B13","doi-asserted-by":"crossref","first-page":"3533","DOI":"10.1093\/bioinformatics\/bth440","article-title":"Accurate detection of aneuploidies in array CGH and gene expression microarray data","volume":"20","author":"Myers","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013110285452600_B14","doi-asserted-by":"crossref","first-page":"8152","DOI":"10.1158\/0008-5472.CAN-04-2598","article-title":"Cul4A physically associates with MDM2 and participates in the proteolysis of p53","volume":"64","author":"Nag","year":"2004","journal-title":"Cancer Res."},{"key":"2023013110285452600_B15","doi-asserted-by":"crossref","first-page":"773","DOI":"10.1090\/S0025-5718-1980-0572855-7","article-title":"Updating quasi-Newton matrices with limited storage","volume":"35","author":"Nocedal","year":"1980","journal-title":"Math. Comput."},{"key":"2023013110285452600_B16","doi-asserted-by":"crossref","first-page":"12963","DOI":"10.1073\/pnas.162471999","article-title":"Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors","volume":"99","author":"Pollack","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110285452600_B17","first-page":"269","article-title":"Bayesian conditional random fields","volume-title":"Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, Jan 6\u20138, 2005.","author":"Qi","year":"2005"},{"key":"2023013110285452600_B18","doi-asserted-by":"crossref","first-page":"i375","DOI":"10.1093\/bioinformatics\/btn188","article-title":"Classification of arrayCGH data using a fused SVM","volume":"24","author":"Rapaport","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013110285452600_B19","doi-asserted-by":"crossref","first-page":"557","DOI":"10.1089\/106652701753307485","article-title":"A model for measurement error for gene expression arrays","volume":"8","author":"Rocke","year":"2001","journal-title":"J. Comput. Biol."},{"key":"2023013110285452600_B20","doi-asserted-by":"crossref","first-page":"e122","DOI":"10.1371\/journal.pcbi.0030122","article-title":"Flexible and accurate detection of genomic copy-number changes from aCGH","volume":"3","author":"Rueda","year":"2007","journal-title":"PLoS. Comput. Biol."},{"key":"2023013110285452600_B21","doi-asserted-by":"crossref","first-page":"i450","DOI":"10.1093\/bioinformatics\/btm221","article-title":"Modeling recurrent DNA copy number alterations in array CGH data","volume":"23","author":"Shah","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013110285452600_B22","doi-asserted-by":"crossref","first-page":"4232","DOI":"10.1038\/sj.onc.1208601","article-title":"Rare amplicons implicate frequent deregulation of cell fate specification pathways in oral squamous cell carcinoma","volume":"24","author":"Snijders","year":"2005","journal-title":"Oncogene"},{"key":"2023013110285452600_B23","first-page":"51","article-title":"Max-margin Markov networks","volume":"16","author":"Taskar","year":"2004","journal-title":"Adv. Neu. Infor. Proc. Sys."},{"key":"2023013110285452600_B24","doi-asserted-by":"crossref","first-page":"1566","DOI":"10.1198\/016214506000000302","article-title":"Hierarchical Dirichlet processes","volume":"101","author":"Teh","year":"2006","journal-title":"J. Am. Stat. Assoc."},{"key":"2023013110285452600_B25","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"2023013110285452600_B26","doi-asserted-by":"crossref","first-page":"210","DOI":"10.1186\/bcr1510","article-title":"Array-CGH and breast cancer","volume":"8","author":"van Beers","year":"2006","journal-title":"Breast Cancer Res."},{"key":"2023013110285452600_B27","first-page":"7110","article-title":"Molecular classification of breast carcinomas by comparative genomic hybridization: a specific somatic genetic profile for BRCA1 tumors","volume":"62","author":"Wessels","year":"2002","journal-title":"Cancer Res."},{"key":"2023013110285452600_B28","doi-asserted-by":"crossref","first-page":"1535","DOI":"10.2353\/ajpath.2007.060478","article-title":"Loss of fibulin-2 expression is associated with breast cancer progression","volume":"170","author":"Yi","year":"2007","journal-title":"Am. J. Pathol."},{"key":"2023013110285452600_B29","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1111\/j.1467-9868.2005.00503.x","article-title":"Regularization and variable selection via the elastic net","volume":"67","author":"Zou","year":"2005","journal-title":"J. R. Stat. Soc. Ser. B"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/10\/1307\/48989683\/bioinformatics_25_10_1307.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/10\/1307\/48989683\/bioinformatics_25_10_1307.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,5]],"date-time":"2025-02-05T02:52:52Z","timestamp":1738723972000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/10\/1307\/269596"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,12,3]]},"references-count":29,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2009,5,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn585","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"type":"electronic","value":"1367-4811"},{"type":"print","value":"1367-4803"}],"subject":[],"published-other":{"date-parts":[[2009,5,15]]},"published":{"date-parts":[[2008,12,3]]}}}