{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T10:12:29Z","timestamp":1761559949262,"version":"3.30.2"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,2,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Identification of differentially expressed genes is a major issue in gene expression data analysis, and selection of marker genes is critical in tumor classification using gene expression data. In this paper, we propose a semiparametric two-sample test both to identify differentially expressed genes and to select marker genes for sample classification.<\/jats:p><jats:p>Results: A simulation study shows that the proposed method is more robust and powerful than generally used methods, such as t-tests and non-parametric rank-sum tests, when the sample size is small. 
Cross-validation shows that the sample classification based on genes selected using this semiparametric method has lower misclassification rates.<\/jats:p><jats:p>Contact: \u00a0hongyu.zhao@yale.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti032","type":"journal-article","created":{"date-parts":[[2004,9,17]],"date-time":"2004-09-17T00:13:37Z","timestamp":1095380017000},"page":"529-536","source":"Crossref","is-referenced-by-count":18,"title":["A semiparametric approach for marker gene selection based on gene expression data"],"prefix":"10.1093","volume":"21","author":[{"given":"Zhong","family":"Guan","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hongyu","family":"Zhao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2004,9,16]]},"reference":[{"key":"2023013107235085100_B1","doi-asserted-by":"crossref","unstructured":"Albert, A. and Anderson, J.A. 1984On the existence of maximum likelihood estimates in logistic regression models. Biometrika711\u201310","DOI":"10.1093\/biomet\/71.1.1"},{"key":"2023013107235085100_B2","unstructured":"Breiman, L. 1996Bagging predictors. Mach. Learn.24123\u2013140"},{"key":"2023013107235085100_B3","unstructured":"Breiman, L. 1998Arcing classifier. Ann. Stat.26801\u2013824"},{"key":"2023013107235085100_B4","unstructured":"Dudoit, S., Fridly, J., Speed, T.P. 2002Comparison of discrimination methods for the classification of tumors using gene expression data. J. Am. Stat. Assoc.9777\u201387"},{"key":"2023013107235085100_B5","unstructured":"Dudoit, S., Yang, Y.H., Callow, M.J., Speed, T.P. 2002Statistical methods for identifying differentially expressed genes in replicated c{DNA} microarray experiments. Stat. Sinica12111\u2013139"},{"key":"2023013107235085100_B6","doi-asserted-by":"crossref","unstructured":"Efron, B. 
1975The efficiency of logistic regression compared to normal discriminant analysis. J. Am. Stat. Assoc.70892\u2013898","DOI":"10.2307\/2285453"},{"key":"2023013107235085100_B7","unstructured":"Technical Report. Fix, E. and Hodges, J. 1951Discriminatory analysis, nonparametric discrimination: consistency properties. , Randolph Field, TX USAF School of Aviation Medicine"},{"key":"2023013107235085100_B8","unstructured":"Freund, Y. and Schapire, R.E. 1997A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Sys. Sci.55119\u2013139"},{"key":"2023013107235085100_B9","unstructured":"Freund, Y. and Schapire, R.E. 1998Comment on \u2018{A}rcing classifiers\u2019. Ann. Stat.26824\u2013832"},{"key":"2023013107235085100_B10","doi-asserted-by":"crossref","unstructured":"Golub, T.R., Slonim, D.K., Tamayo, P., Huard, C., Gaasen, M., Mesirov, J.P., Coller, H., Loh, M.L., Dowing, J.R., Caligiuri, M.A., Bloomfield, C.D., Lander, E.S. 1999Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science286531\u2013537","DOI":"10.1126\/science.286.5439.531"},{"key":"2023013107235085100_B11","unstructured":"Guan, Z. 2004A semiparametric changepoint model. Biometrika91849\u2013862"},{"key":"2023013107235085100_B12","unstructured":"Halperin, M., Blackwelder, W.C., Verter, J.I. 1971Estimation of the multivariate logistic risk function: a comparison of the discriminant function and maximum likelihood approaches. J. Chronic Dis.24125\u2013158"},{"key":"2023013107235085100_B13","doi-asserted-by":"crossref","unstructured":"Huber, W., von Heydebreck, A., Sultmann, H., Poustka, A., Vingron, M. 2002Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics18S96\u2013S104","DOI":"10.1093\/bioinformatics\/18.suppl_1.S96"},{"key":"2023013107235085100_B14","unstructured":"Jeffreys, H. 
1946An invariant form for the prior probability in estimation problems. Proc. R. Soc. Lond., Ser. A186453\u2013461"},{"key":"2023013107235085100_B15","unstructured":"Technical Report. Kerr, M.K., Martin, M., Churchill, G.A. 2000Analysis of variance for gene expression microarray data. , Bar Harbor, ME The Jackson Laboratory"},{"key":"2023013107235085100_B16","doi-asserted-by":"crossref","unstructured":"Lesaffre, E. and Albert, A. 1989Multiple-group logistic regression diagnostics. J. R. Stat. Soc. Ser. C38425\u2013440","DOI":"10.2307\/2347731"},{"key":"2023013107235085100_B17","unstructured":"Lesaffre, E. and Albert, A. 1989Partial separation in logistic discrimination. J. R. Statist. Soc. Ser. B51109\u2013116"},{"key":"2023013107235085100_B18","unstructured":"Long, A., Mangalam, H.J., Chan, B.Y., Tolleri, L., Hatfield, G.W., Baldi, P. 2001Improved statistical inference from DNA microarray data using analysis of variance and a Bayesian statistical framework. Analysis of global gene expression profiling in Escherichia coli K12. J. Biol. Chem.27619937\u201319944"},{"key":"2023013107235085100_B19","doi-asserted-by":"crossref","unstructured":"Mantel, N. and Brown, C. 1974Alternative tests for comparing normal distribution parameters based on logistic regression. Biometrics30485\u2013497","DOI":"10.2307\/2529202"},{"key":"2023013107235085100_B20","doi-asserted-by":"crossref","unstructured":"Nguyen, D.V. and Rocke, D.M. 2002Multi-class cancer classification via partial least squares with gene expression profiles. Bioinformatics181216\u20131226","DOI":"10.1093\/bioinformatics\/18.9.1216"},{"key":"2023013107235085100_B21","doi-asserted-by":"crossref","unstructured":"O'Brien, P.C. 1988Comparing two samples: extensions of the t, rank-sum, and log-rank tests. J. Am. Stat. Assoc.8352\u201361","DOI":"10.2307\/2288918"},{"key":"2023013107235085100_B22","doi-asserted-by":"crossref","unstructured":"Qin, J. and Zhang, B. 
1997A goodness-of-fit test for logistic regression models based on case\u2013control data. Biometrika84609\u2013618","DOI":"10.1093\/biomet\/84.3.609"},{"key":"2023013107235085100_B23","doi-asserted-by":"crossref","unstructured":"Santner, T.J. and Duffy, D.E. 1986A note on A. Albert and J. A. Anderson's conditions for the existence of maximum likelihood estimates in logistic regression models. Biometrika73755\u2013758","DOI":"10.1093\/biomet\/73.3.755"},{"key":"2023013107235085100_B24","unstructured":"Tibshirani, R. 1988Estimating transformations for regression via additivity and variance stabilization. J. Am. Stat. Assoc.83394\u2013405"},{"key":"2023013107235085100_B25","unstructured":"Tibshirani, R. 1988Variance stabilization and the bootstrap. Biometrika75433\u2013444"},{"key":"2023013107235085100_B26","doi-asserted-by":"crossref","unstructured":"Tusher, V.G., Tibshirani, R., Chu, G. 2001Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. USA985116\u20135121","DOI":"10.1073\/pnas.091062498"},{"key":"2023013107235085100_B27","doi-asserted-by":"crossref","unstructured":"Wu, B., Abbott, T., Fishman, D., McMurray, W., Mor, G., Stone, K., Ward, D., Williams, K., Zhao, H. 2003Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data. Bioinformatics191636\u20131643","DOI":"10.1093\/bioinformatics\/btg210"},{"key":"2023013107235085100_B28","doi-asserted-by":"crossref","unstructured":"Zhang, B. 1999A chi-squared goodness-of-fit test for logistic regression models based on case\u2013control data. Biometrika86531\u2013539","DOI":"10.1093\/biomet\/86.3.531"},{"key":"2023013107235085100_B29","unstructured":"Zhang, B. 2001An information matrix test for logistic regression models based on case\u2013control data. Biometrika88921\u2013932"},{"key":"2023013107235085100_B30","unstructured":"Zhang, B. 
2002Assessing goodness-of-fit of generalized logit models based on case\u2013control data. J. Multivariate Anal.8217\u201338"},{"key":"2023013107235085100_B31","unstructured":"Zhang, B. 2002An EM algorithm for a semiparametric finite mixture model. J. Stat. Comput. Simul.72791\u2013802"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/4\/529\/48965126\/bioinformatics_21_4_529.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/4\/529\/48965126\/bioinformatics_21_4_529.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,18]],"date-time":"2024-12-18T16:58:09Z","timestamp":1734541089000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/21\/4\/529\/203337"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2004,9,16]]},"references-count":31,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2005,2,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti032","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"type":"electronic","value":"1367-4811"},{"type":"print","value":"1367-4803"}],"subject":[],"published-other":{"date-parts":[[2005,2,15]]},"published":{"date-parts":[[2004,9,16]]}}}