{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T12:48:47Z","timestamp":1774442927387,"version":"3.50.1"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"20","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Patients with identical cancer diagnoses often progress differently. The disparity we see in disease progression and treatment response can be attributed to the idea that two histologically similar cancers may be completely different diseases on the molecular level. Methods for identifying cancer subtypes associated with patient survival have the capacity to be powerful instruments for understanding the biochemical processes that underlie disease progression as well as providing an initial step toward more personalized therapy for cancer patients. We propose a method called semi-supervised recursively partitioned mixture models (SS-RPMM) that utilizes array-based genetic and patient-level clinical data for finding cancer subtypes that are associated with patient survival.<\/jats:p><jats:p>Results: In the proposed SS-RPMM, cancer subtypes are identified using a selected subset of genes that are associated with survival time. Since survival information is used in the gene selection step, this method is semi-supervised. Unlike other semi-supervised clustering classification methods, SS-RPMM does not require specification of the number of cancer subtypes, which is often unknown. In a simulation study, our proposed method compared favorably with other competing semi-supervised methods, including: semi-supervised clustering and supervised principal components analysis. 
Furthermore, an analysis of mesothelioma cancer data using SS-RPMM revealed at least two distinct methylation profiles that are informative for survival.<\/jats:p><jats:p>Availability: The analyses implemented in this article were carried out using R (http:\/\/www.r-project.org\/).<\/jats:p><jats:p>Contact: \u00a0devin_koestler@brown.edu; e_andres_houseman@brown.edu<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq470","type":"journal-article","created":{"date-parts":[[2010,9,12]],"date-time":"2010-09-12T00:24:27Z","timestamp":1284251067000},"page":"2578-2585","source":"Crossref","is-referenced-by-count":57,"title":["Semi-supervised recursively partitioned mixture models for identifying cancer subtypes"],"prefix":"10.1093","volume":"26","author":[{"given":"Devin C.","family":"Koestler","sequence":"first","affiliation":[{"name":"1 Department of Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Carmen J.","family":"Marsit","sequence":"additional","affiliation":[{"name":"1 Department of Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard 
Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Brock C.","family":"Christensen","sequence":"additional","affiliation":[{"name":"1 Department of Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA"},{"name":"1 Department of Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Margaret R.","family":"Karagas","sequence":"additional","affiliation":[{"name":"1 Department of Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Raphael","family":"Bueno","sequence":"additional","affiliation":[{"name":"1 Department of Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David J.","family":"Sugarbaker","sequence":"additional","affiliation":[{"name":"1 Department of Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Karl T.","family":"Kelsey","sequence":"additional","affiliation":[{"name":"1 Department of Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA"},{"name":"1 Department of 
Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"E. Andres","family":"Houseman","sequence":"additional","affiliation":[{"name":"1 Department of Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA"},{"name":"1 Department of Community Health, Section for Biostatistics, 2Department of Pathology and Laboratory Medicine, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, 4Department of Community and Family Medicine, Dartmouth Medical School, Hanover, NH 03756, 5Division of Thoracic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115 and 6Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2010,8,16]]},"reference":[{"key":"2023012507544187100_B1","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1038\/35000501","article-title":"Distinct types of diffuse large b-cell lymphoma identified by 
gene expression profiling","volume":"403","author":"Alizadeh","year":"2000","journal-title":"Nature"},{"key":"2023012507544187100_B2","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1186\/1471-2407-10-227","article-title":"Comprehensive profiling of DNA methylation in colorectal cancer reveals subgroups with distinct clinicopathological and molecular features","volume":"10","author":"Ang","year":"2010","journal-title":"BMC Cancer"},{"key":"2023012507544187100_B3","doi-asserted-by":"crossref","first-page":"E108","DOI":"10.1371\/journal.pbio.0020108","article-title":"Semi-supervised methods to predict patient survival from gene expression data","volume":"2","author":"Bair","year":"2004","journal-title":"PLoS Biol."},{"key":"2023012507544187100_B4","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1038\/nm733","article-title":"Gene-expression profiles predict survival of patients with lung adenocarcinoma","volume":"8","author":"Beer","year":"2002","journal-title":"Nat. Med."},{"key":"2023012507544187100_B5","doi-asserted-by":"crossref","first-page":"1605","DOI":"10.1056\/NEJMoa031046","article-title":"Use of gene-expression profiling to identify prognostic subclasses in adult acute myeloid leukemia","volume":"350","author":"Bullinger","year":"2004","journal-title":"N. Engl. J. Med."},{"key":"2023012507544187100_B6","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1214\/aos\/1176324464","article-title":"Optimal rate of convergence for finite mixture models","volume":"23","author":"Chen","year":"1995","journal-title":"Ann. 
Stat."},{"key":"2023012507544187100_B7","doi-asserted-by":"crossref","first-page":"e1000602","DOI":"10.1371\/journal.pgen.1000602","article-title":"Aging and environmental exposures alter tissue-specific DNA methylation dependent upon CpG island context","volume":"5","author":"Christensen","year":"2009","journal-title":"PLoS Genet."},{"key":"2023012507544187100_B8","doi-asserted-by":"crossref","first-page":"6315","DOI":"10.1158\/0008-5472.CAN-09-1073","article-title":"Differentiation of lung adenocarcinoma, pleural mesothelioma, and nonmalignant pulmonary tissues using DNA methylation profiles","volume":"69","author":"Christensen","year":"2009","journal-title":"Cancer Res."},{"key":"2023012507544187100_B9","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1158\/0008-5472.CAN-08-2586","article-title":"Epigenetic profiles distinguish pleural mesothelioma from normal pleura and predict lung asbestos burden and clinical outcome","volume":"69","author":"Christensen","year":"2009","journal-title":"Cancer Res."},{"key":"2023012507544187100_B10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm","volume":"39","author":"Dempster","year":"1977","journal-title":"J. R. Stat. Soc. B"},{"key":"2023012507544187100_B11","doi-asserted-by":"crossref","first-page":"932","DOI":"10.1038\/leu.2010.41","article-title":"Gene-specific and global methylation patterns predict outcome in patients with acute myeloid leukemia","volume":"24","author":"Deneberg","year":"2010","journal-title":"Leukemia"},{"key":"2023012507544187100_B12","doi-asserted-by":"crossref","first-page":"14863","DOI":"10.1073\/pnas.95.25.14863","article-title":"Cluster analysis and display of genome-wide expression patterns","volume":"95","author":"Eisen","year":"1998","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012507544187100_B13","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1198\/016214502760047131","article-title":"Model-based clustering, discriminant analysis and density estimation","volume":"97","author":"Fraley","year":"2002","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012507544187100_B14","doi-asserted-by":"crossref","first-page":"1062","DOI":"10.1111\/j.1541-0420.2006.00566.x","article-title":"Feature-specific penalized latent class analysis for genomic data","volume":"62","author":"Houseman","year":"2006","journal-title":"Biometrics"},{"key":"2023012507544187100_B15","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1186\/1471-2105-9-365","article-title":"Model-based clustering of DNA methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distributions","volume":"9","author":"Houseman","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012507544187100_B16","doi-asserted-by":"crossref","first-page":"e10312","DOI":"10.1371\/journal.pone.0010312","article-title":"Gene expression-based classification of non-small cell lung carcinomas and survival prediction","volume":"5","author":"Hou","year":"2010","journal-title":"PLoS One"},{"key":"2023012507544187100_B17","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1158\/1078-0432.CCR-07-0523","article-title":"Association of microRNA expression in hepatocellular carcinomas with hepatitis infection, cirrhosis, and patient survival","volume":"14","author":"Jiang","year":"2008","journal-title":"Clin. 
Cancer Res."},{"key":"2023012507544187100_B18","doi-asserted-by":"crossref","DOI":"10.1002\/9780470316801","volume-title":"Finding Groups in Data: An Introduction to Cluster Analysis.","author":"Kaufman","year":"1990"},{"key":"2023012507544187100_B19","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1093\/bioinformatics\/btm563","article-title":"Defining clusters from a hierarchical cluster tree: the dynamic tree cut package for R","volume":"24","author":"Langfelder","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012507544187100_B20","doi-asserted-by":"crossref","first-page":"811","DOI":"10.1073\/pnas.0304146101","article-title":"Gene expression profiling identifies clinically relevant subtypes of prostate cancer","volume":"101","author":"Lapointe","year":"2004","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012507544187100_B21","doi-asserted-by":"crossref","first-page":"6672","DOI":"10.1038\/sj.onc.1207881","article-title":"Expression of the secreted frizzled-related protein gene family is downregulated in human mesothelioma","volume":"23","author":"Lee","year":"2004","journal-title":"Oncogene"},{"key":"2023012507544187100_B22","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1080\/01621459.1991.10475008","article-title":"Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis","volume":"86","author":"Lindsay","year":"1991","journal-title":"J. Am. Stat. 
Assoc."},{"key":"2023012507544187100_B23","doi-asserted-by":"crossref","first-page":"416","DOI":"10.1093\/carcin\/bgp006","article-title":"Epigenetic profiling reveals etiologically distinct patterns of DNA methylation in head and neck squamous cell carcinoma","volume":"30","author":"Marsit","year":"2009","journal-title":"Carcinogenesis"},{"key":"2023012507544187100_B24","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1080\/01621459.1971.10482356","article-title":"Objective criteria for the evaluation of clustering methods","volume":"66","author":"Rand","year":"1971","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012507544187100_B25","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1093\/biomet\/77.1.216","article-title":"The explained variation in proportional hazards regression","volume":"77","author":"Schemper","year":"1990","journal-title":"Biometrika"},{"key":"2023012507544187100_B26","doi-asserted-by":"crossref","first-page":"8418","DOI":"10.1073\/pnas.0932692100","article-title":"Repeated observation of breast tumor subtypes in independent gene expression data sets","volume":"100","author":"Sorlie","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012507544187100_B27","doi-asserted-by":"crossref","first-page":"602","DOI":"10.1198\/016214504000001565","article-title":"Bayesian variable selection in clustering high-dimensional data","volume":"100","author":"Tadesse","year":"2005","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012507544187100_B28","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1214\/ss\/1056397488","article-title":"Class prediction by nearest shrunken centroids, with applications to DNA microarrays","volume":"18","author":"Tibshirani","year":"2003","journal-title":"Stat. 
Sci."},{"key":"2023012507544187100_B29","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1016\/S0378-3758(02)00388-9","article-title":"A new algorithm for hybrid hierarchical clustering with visualization and the bootstrap","volume":"117","author":"van der Laan","year":"2003","journal-title":"J. Stat. Plan. Inference"},{"key":"2023012507544187100_B30","doi-asserted-by":"crossref","first-page":"1999","DOI":"10.1056\/NEJMoa021967","article-title":"A gene-expression signature as a predictor of survival in breast cancer","volume":"347","author":"van de Vijver","year":"2002","journal-title":"N. Engl. J. Med."},{"key":"2023012507544187100_B31","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1038\/415530a","article-title":"Gene expression profiling predicts clinical outcome of breast cancer","volume":"415","author":"van't Veer","year":"2002","journal-title":"Nature"},{"key":"2023012507544187100_B32","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1593\/neo.07859","article-title":"A transcriptional fingerprint of estrogen in human breast cancer predicts patient survival","volume":"10","author":"Yu","year":"2008","journal-title":"Neoplasia"},{"key":"2023012507544187100_B33","doi-asserted-by":"crossref","first-page":"e13","DOI":"10.1371\/journal.pmed.0030013","article-title":"Gene expression profiling predicts survival in conventional renal cell carcinoma","volume":"3","author":"Zhao","year":"2006","journal-title":"PLoS 
Med."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/20\/2578\/48853574\/bioinformatics_26_20_2578.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/20\/2578\/48853574\/bioinformatics_26_20_2578.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,25]],"date-time":"2025-02-25T18:32:01Z","timestamp":1740508321000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/20\/2578\/193857"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,8,16]]},"references-count":33,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2010,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq470","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,10,15]]},"published":{"date-parts":[[2010,8,16]]}}}