{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,8]],"date-time":"2025-07-08T05:11:53Z","timestamp":1751951513276,"version":"3.32.0"},"reference-count":28,"publisher":"Oxford University Press (OUP)","issue":"18","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2006,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Elucidating the molecular taxonomy of cancers and finding biological and clinical markers from microarray experiments is problematic due to the large number of variables being measured. Feature selection methods that can identify relevant classifiers or that can remove likely false positives prior to supervised analysis are therefore desirable.<\/jats:p><jats:p>Results: We present a novel feature selection procedure based on a mixture model and a non-gaussianity measure of a gene's expression profile. The method can be used to find genes that define either small outlier subgroups or major subdivisions, depending on the sign of kurtosis. The method can also be used as a filtering step, prior to supervised analysis, in order to reduce the false discovery rate. We validate our methodology using six independent datasets by rediscovering major classifiers in ER negative and ER positive breast cancer and in prostate cancer. 
Furthermore, our method finds two novel subtypes within the basal subgroup of ER negative breast tumours, associated with apoptotic and immune response functions respectively, and with statistically different clinical outcome.<\/jats:p><jats:p>Availability: An R-function pack that implements the methods used here has been added to vabayelMix, available from ().<\/jats:p><jats:p>Contact: aet21@cam.ac.uk<\/jats:p><jats:p>Supplementary information: Supplementary information is available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btl174","type":"journal-article","created":{"date-parts":[[2006,5,9]],"date-time":"2006-05-09T01:39:44Z","timestamp":1147138784000},"page":"2269-2275","source":"Crossref","is-referenced-by-count":56,"title":["PACK: Profile Analysis using Clustering and Kurtosis to find molecular classifiers in cancer"],"prefix":"10.1093","volume":"22","author":[{"given":"Andrew E.","family":"Teschendorff","sequence":"first","affiliation":[{"name":"Cancer Genomics Program, Department of Oncology, University of Cambridge, Hutchison-MRC Research Centre, Hills Road, Cambridge CB2 2XZ, UK"}]},{"given":"Ali","family":"Naderi","sequence":"additional","affiliation":[{"name":"Cancer Genomics Program, Department of Oncology, University of Cambridge, Hutchison-MRC Research Centre, Hills Road, Cambridge CB2 2XZ, UK"}]},{"given":"Nuno L.","family":"Barbosa-Morais","sequence":"additional","affiliation":[{"name":"Cancer Genomics Program, Department of Oncology, University of Cambridge, Hutchison-MRC Research Centre, Hills Road, Cambridge CB2 2XZ, UK"},{"name":"Institute of Molecular Medicine, Faculty of Medicine, University of Lisbon, 1649-028 Lisbon, Portugal"}]},{"given":"Carlos","family":"Caldas","sequence":"additional","affiliation":[{"name":"Cancer Genomics Program, Department of Oncology, University of Cambridge, 
Hutchison-MRC Research Centre, Hills Road, Cambridge CB2 2XZ, UK"}]}],"member":"286","published-online":{"date-parts":[[2006,5,8]]},"reference":[{"key":"2023012409220121400_b1","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1016\/S1535-6108(02)00097-1","article-title":"Targeting ligand-activated erbb2 signaling inhibits breast and prostate tumor growth","volume":"2","author":"Agus","year":"2002","journal-title":"Cancer Cell"},{"key":"2023012409220121400_b2","first-page":"21","article-title":"Inferring parameters and structure of latent variable models by variational bayes","author":"Attias","year":"1999"},{"key":"2023012409220121400_b3","doi-asserted-by":"crossref","first-page":"E108","DOI":"10.1371\/journal.pbio.0020108","article-title":"Semi-supervised methods to predict patient survival from gene expression data","volume":"2","author":"Bair","year":"2004","journal-title":"PLoS Biol."},{"key":"2023012409220121400_b4","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1080\/00031305.1988.10475539","article-title":"Kurtosis: a critical review","volume":"42","author":"Balanda and MacGillivray","year":"1988","journal-title":"Am. Stat."},{"key":"2023012409220121400_b5","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1080\/07853890410026098","article-title":"Platelet-derived vegf, flt-1, angiopoietin-1 and p-selectin in breast and prostate cancer: further evidence for a role of platelets in tumour angiogenesis","volume":"36","author":"Caine","year":"2004","journal-title":"Ann. Med."},{"key":"2023012409220121400_b6","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1016\/S1525-1578(10)60455-2","article-title":"Analysis of microarray data using Z score transformation","volume":"5","author":"Cheadle","year":"2003","journal-title":"J. Mol. 
Diagn."},{"key":"2023012409220121400_b7","doi-asserted-by":"crossref","first-page":"565","DOI":"10.1007\/s10585-005-5376-z","article-title":"Muc1, muc2, muc4, muc5ac and muc6 expression in the progression of prostate cancer","volume":"22","author":"Cozzi","year":"2005","journal-title":"Clin. Exp. Metastasis"},{"key":"2023012409220121400_b8","doi-asserted-by":"crossref","first-page":"e147","DOI":"10.1093\/nar\/gnh146","article-title":"Hypervariable genes\u2013experimental error or hidden dynamics","volume":"32","author":"Dozmorov","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023012409220121400_b9","doi-asserted-by":"crossref","first-page":"4660","DOI":"10.1038\/sj.onc.1208561","article-title":"Identification of molecular apocrine breast tumours by microarray analysis","volume":"24","author":"Farmer","year":"2005","journal-title":"Oncogene"},{"key":"2023012409220121400_b10","first-page":"199","article-title":"The influence of cd44v3-v10 on adhesion, invasion and mmp-14 expression in prostate cancer cells","volume":"15","author":"Harrison","year":"2006","journal-title":"Oncol. Rep."},{"key":"2023012409220121400_b11","doi-asserted-by":"crossref","first-page":"1059","DOI":"10.1016\/j.ymthe.2004.08.024","article-title":"Adenovirus-mediated flt1-targeted proapoptotic gene therapy of human prostate cancer","volume":"10","author":"Kaliberov","year":"2004","journal-title":"Mol. Ther."},{"key":"2023012409220121400_b12","doi-asserted-by":"crossref","first-page":"811","DOI":"10.1073\/pnas.0304146101","article-title":"Gene expression profiling identifies clinically relevant subtypes of prostate cancer","volume":"101","author":"Lapointe","year":"2004","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409220121400_b13","first-page":"191","article-title":"Developments in probabilistic modelling with neural networks-ensemble learning","volume-title":"Neural Networks: Artificial Intelligence and Industrial Applications. 
Proceedings of the 3rd Annual Symposium on Neural Networks. Nijmegen","author":"MacKay","year":"1995"},{"key":"2023012409220121400_b14","first-page":"461","article-title":"Estimating the dimension of a model","volume":"6","author":"Schwarz","year":"1978","journal-title":"Annls. Stat."},{"key":"2023012409220121400_b15","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1002\/pros.20372","article-title":"Aberrant expression of transmembrane mucins, muc1 and muc4, in human prostate carcinomas","volume":"66","author":"Singh","year":"2006","journal-title":"Prostate"},{"key":"2023012409220121400_b16","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1016\/S1535-6108(02)00030-2","article-title":"Gene expression correlates of clinical prostate cancer behavior","volume":"1","author":"Singh","year":"2002","journal-title":"Cancer Cell"},{"volume-title":"Statistical Methods","year":"1967","author":"Snedecor","key":"2023012409220121400_b17"},{"key":"2023012409220121400_b18","doi-asserted-by":"crossref","first-page":"8418","DOI":"10.1073\/pnas.0932692100","article-title":"Repeated observation of breast tumor subtypes in independent gene expression data sets","volume":"100","author":"Sorlie","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409220121400_b19","doi-asserted-by":"crossref","first-page":"10393","DOI":"10.1073\/pnas.1732912100","article-title":"Breast cancer classification and prognosis based on gene expression profiles from a population-based study","volume":"100","author":"Sotiriou","year":"2003","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012409220121400_b20","doi-asserted-by":"crossref","first-page":"3025","DOI":"10.1093\/bioinformatics\/bti466","article-title":"A variational bayesian mixture modelling framework for cluster analysis of gene-expression data","volume":"21","author":"Teschendorff","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012409220121400_b21","doi-asserted-by":"crossref","first-page":"644","DOI":"10.1126\/science.1117679","article-title":"Recurrent fusion of tmprss2 and ets transcription factor genes in prostate cancer","volume":"310","author":"Tomlins","year":"2005","journal-title":"Science"},{"key":"2023012409220121400_b22","doi-asserted-by":"crossref","first-page":"490","DOI":"10.1080\/02841860510029888","article-title":"Prostate cancer cell lines lack amplification: overexpression of her2","volume":"44","author":"Ullen","year":"2005","journal-title":"Acta. Oncol."},{"key":"2023012409220121400_b23","doi-asserted-by":"crossref","first-page":"1999","DOI":"10.1056\/NEJMoa021967","article-title":"A gene-expression signature as a predictor of survival in breast cancer","volume":"347","author":"van de Vijver","year":"2002","journal-title":"N Engl. J. 
Med."},{"key":"2023012409220121400_b24","doi-asserted-by":"crossref","first-page":"671","DOI":"10.1016\/S0140-6736(05)17947-1","article-title":"Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer","volume":"365","author":"Wang","year":"2005","journal-title":"Lancet"},{"key":"2023012409220121400_b25","first-page":"5974","article-title":"Analysis of gene expression identifies candidate markers and pharmacological targets in prostate cancer","volume":"61","author":"Welsh","year":"2001","journal-title":"Cancer Res."},{"key":"2023012409220121400_b26","doi-asserted-by":"crossref","first-page":"977","DOI":"10.1093\/bioinformatics\/17.10.977","article-title":"Model-based clustering and data transformations for gene expression data","volume":"17","author":"Yeung","year":"2001","journal-title":"Bioinformatics"},{"key":"2023012409220121400_b27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-5-1","article-title":"Gotree machine (gotm): a web-based platform for interpreting sets of interesting genes using gene ontology hierarchies","volume":"5","author":"Zhang","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023012409220121400_b28","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1081\/BIP-200025654","article-title":"Identification of differentially expressed genes with multivariate outlier analysis","volume":"14","author":"Zhao","year":"2004","journal-title":"J. Biopharm. 
Stat."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/18\/2269\/48841911\/bioinformatics_22_18_2269.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/18\/2269\/48841911\/bioinformatics_22_18_2269.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,8]],"date-time":"2025-01-08T21:58:13Z","timestamp":1736373493000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/22\/18\/2269\/316364"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,5,8]]},"references-count":28,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2006,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btl174","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"type":"electronic","value":"1367-4811"},{"type":"print","value":"1367-4803"}],"subject":[],"published-other":{"date-parts":[[2006,9,15]]},"published":{"date-parts":[[2006,5,8]]}}}