{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,6,6]],"date-time":"2024-06-06T22:34:31Z","timestamp":1717713271678},"reference-count":12,"publisher":"Oxford University Press (OUP)","issue":"7","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The classification of samples using gene expression profiles is an important application in areas such as cancer research and environmental health studies. However, the classification is usually based on a small number of samples, and each sample is a long vector of thousands of gene expression levels. An important issue in parametric modeling for so many gene expression levels is the control of the number of nuisance parameters in the model. Large models often lead to intensive or even intractable computation, while small models may be inadequate for complex data.<\/jats:p>\n               <jats:p>Methodology: We propose a two-step empirical Bayes classification method as a solution to this issue. At the first step, we use the model-based cluster algorithm with a non-traditional purpose of assigning gene expression levels to form abundance groups. At the second step, by assuming the same variance for all the genes in the same group, we substantially reduce the number of nuisance parameters in our statistical model.<\/jats:p>\n               <jats:p>Results: The proposed model is more parsimonious, which leads to efficient computation under an empirical Bayes estimation procedure. We consider two real examples and simulate data using our method. 
Desired low classification error rates are obtained even when a large number of genes are pre-selected for class prediction.<\/jats:p>\n               <jats:p>Availability: Supplemental materials including technical details are available at \u201chttp:\/\/odin.mdacc.tmc.edu\/~yuanj\/papers\/sup.pdf\u201d. An R program for computation is available upon request by email to Yuan Ji (yuanji@mdanderson.org).<\/jats:p>\n               <jats:p>Contact: yuanji@mdanderson.org<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti092","type":"journal-article","created":{"date-parts":[[2004,10,29]],"date-time":"2004-10-29T00:51:45Z","timestamp":1099011105000},"page":"1055-1061","source":"Crossref","is-referenced-by-count":5,"title":["A novel means of using gene clusters in a two-step empirical Bayes method for predicting classes of samples"],"prefix":"10.1093","volume":"21","author":[{"given":"Yuan","family":"Ji","sequence":"first","affiliation":[]},{"given":"Kam-Wah","family":"Tsui","sequence":"additional","affiliation":[]},{"given":"KyungMann","family":"Kim","sequence":"additional","affiliation":[]}],"member":"286","published-online":{"date-parts":[[2004,10,28]]},"reference":[{"key":"2023013107270934200_B1","doi-asserted-by":"crossref","unstructured":"Ambroise, C. and McLachlan, G. (2002) Selection bias in gene extraction on the basis of microarray gene-expression data. Proc. Natl Acad. Sci. USA, 99, 6562\u20136566","DOI":"10.1073\/pnas.102102699"},{"key":"2023013107270934200_B2","unstructured":"Dempster, A., Laird, N., Rubin, D. (1977) Maximum likelihood estimation from incomplete data via the EM algorithm (with discussion). J. R. Statist. Soc. B, 39, 1\u201338"},{"key":"2023013107270934200_B3","unstructured":"Dudoit, S., Fridlyand, J., Speed, T. (2002) Comparison of discrimination methods for the classification of tumors using gene expression data. J. Am. Statist. Assoc., 97, 77\u201387"},{"key":"2023013107270934200_B4","unstructured":"Eisen, M., Spellman, P., Brown, P., Botstein, D. 
(1998) Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA, 95, 14863\u201314868"},{"key":"2023013107270934200_B5","unstructured":"Fraley, C. and Raftery, A. (1999) MCLUST: software for model-based cluster analysis. J. Classif., 16, 297\u2013306"},{"key":"2023013107270934200_B6","doi-asserted-by":"crossref","unstructured":"Golub, T., Slonim, D., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J., Coller, H., Loh, M., Downing, J., Caligiuri, M., Bloomfield, C., Lander, E. (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286, 531\u2013537","DOI":"10.1126\/science.286.5439.531"},{"key":"2023013107270934200_B7","unstructured":"Keller, A., Schummer, M., Hood, L., Ruzzo, W. (2000) Bayesian classification of DNA array expression data. Technical Report UW-CSE-2000-08-01, Washington, DC: Department of Computer Science and Engineering, University of Washington"},{"key":"2023013107270934200_B8","doi-asserted-by":"crossref","unstructured":"McLachlan, G. and Peel, D. (2000) Finite Mixture Models. New York: John Wiley & Sons","DOI":"10.1002\/0471721182"},{"key":"2023013107270934200_B9","doi-asserted-by":"crossref","unstructured":"Tamayo, P., Slonim, D., Mesirov, J., Zhu, Q., Kitareewar, S., Dmitrovsky, E., Lander, E., Golub, T. (1999) Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc. Natl Acad. Sci., 96, 2907\u20132912","DOI":"10.1073\/pnas.96.6.2907"},{"key":"2023013107270934200_B10","doi-asserted-by":"crossref","unstructured":"Thomas, R., Rank, D., Penn, S., Zastrow, G., Hayes, K., Pande, K., Glover, E., Silander, T., Craven, M., Reddy, J., Jovanovich, S., Bradfield, C. (2001) Identification of toxicologically predictive gene sets using cDNA microarrays. Mol. 
Pharmacol., 60, 1189\u20131194","DOI":"10.1124\/mol.60.6.1189"},{"key":"2023013107270934200_B11","unstructured":"West, M., Blanchette, C., Holly, D., Huang, E., Ishida, S., Spang, R., Zuzan, H., Olson, J., Jr, Marks, J., Nevins, J. (2001) Predicting the clinical status of human breast cancer by using gene expression profiles. Proc. Natl Acad. Sci. USA, 98, 11462\u201311467"},{"key":"2023013107270934200_B12","doi-asserted-by":"crossref","unstructured":"Yeung, K., Fraley, C., Murua, A., Raftery, A., Ruzzo, W. (2001) Model-based clustering and data transformations for gene expression data. Bioinformatics, 17, 977\u2013987","DOI":"10.1093\/bioinformatics\/17.10.977"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/7\/1055\/48966699\/bioinformatics_21_7_1055.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/7\/1055\/48966699\/bioinformatics_21_7_1055.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T09:59:49Z","timestamp":1675159189000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/21\/7\/1055\/268860"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,10,28]]},"references-count":12,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2005,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti092","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2005,4,1]]},"published":{"date-parts":[[2004,10,28]]}}}