{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T23:34:53Z","timestamp":1773272093778,"version":"3.50.1"},"reference-count":28,"publisher":"Oxford University Press (OUP)","issue":"24","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,12,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: An important application of microarrays is to discover genomic biomarkers, among tens of thousands of genes assayed, for disease classification. Thus there is a need for developing statistical methods that can efficiently use such high-throughput genomic data, select biomarkers with discriminant power and construct classification rules. The ROC (receiver operator characteristic) technique has been widely used in disease classification with low-dimensional biomarkers because (1) it does not assume a parametric form of the class probability as required for example in the logistic regression method; (2) it accommodates case\u2013control designs and (3) it allows treating false positives and false negatives differently. However, due to computational difficulties, the ROC-based classification has not been used with microarray data. Moreover, the standard ROC technique does not incorporate built-in biomarker selection.<\/jats:p><jats:p>Results: We propose a novel method for biomarker selection and classification using the ROC technique for microarray data. The proposed method uses a sigmoid approximation to the area under the ROC curve as the objective function for classification and the threshold gradient descent regularization method for estimation and biomarker selection. Tuning parameter selection based on the V-fold cross validation and predictive performance evaluation are also investigated. The proposed approach is demonstrated with a simulation study, the Colon data and the Estrogen data. 
The proposed approach yields parsimonious models with excellent classification performance.<\/jats:p><jats:p>Availability: R code is available upon request.<\/jats:p><jats:p>Contact: jian@stat.uiowa.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti724","type":"journal-article","created":{"date-parts":[[2005,10,19]],"date-time":"2005-10-19T03:38:37Z","timestamp":1129693117000},"page":"4356-4362","source":"Crossref","is-referenced-by-count":134,"title":["Regularized ROC method for disease classification and biomarker selection with microarray data"],"prefix":"10.1093","volume":"21","author":[{"given":"Shuangge","family":"Ma","sequence":"first","affiliation":[{"name":"Department of Biostatistics, University of Washington, Washington, USA"}]},{"given":"Jian","family":"Huang","sequence":"additional","affiliation":[{"name":"Department of Statistics and Actuarial Science, Program in Public Health Genetics, University of Iowa, Iowa City, IA, USA"}]}],"member":"286","published-online":{"date-parts":[[2005,10,18]]},"reference":[{"key":"2023061007214985000_b1","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1016\/S0165-1765(98)00255-9","article-title":"Computation of the maximum rank correlation estimator","volume":"62","author":"Abrevaya","year":"1999","journal-title":"Econ. Lett."},{"key":"2023061007214985000_b2","doi-asserted-by":"crossref","first-page":"6745","DOI":"10.1073\/pnas.96.12.6745","article-title":"Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays","volume":"96","author":"Alon","year":"1999","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023061007214985000_b3","doi-asserted-by":"crossref","first-page":"6562","DOI":"10.1073\/pnas.102102699","article-title":"Selection bias in gene extraction on the basis of microarray gene-expression data","volume":"99","author":"Ambroise","year":"2002","journal-title":"Proc. 
Natl Acad. Sci. USA"},{"key":"2023061007214985000_b4","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1089\/106652700750050943","article-title":"Tissue classification with gene expression profiles","volume":"7","author":"Ben-Dor","year":"2000","journal-title":"J. Comput. Biol."},{"key":"2023061007214985000_b5","first-page":"59","article-title":"Improved statistical tests for differential gene expression by shrinking variance components estimates","volume":"6","author":"Cui","year":"2005","journal-title":"Bioinformatics"},{"key":"2023061007214985000_b6","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1093\/bioinformatics\/btf867","article-title":"Boosting for tumor classification with gene expression data","volume":"9","author":"Dettling","year":"2003","journal-title":"Bioinformatics"},{"key":"2023061007214985000_b7","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1198\/016214502753479248","article-title":"Comparison of discrimination methods for tumor classification based on microarray data","volume":"97","author":"Dudoit","year":"2002","journal-title":"J. Am. Stat. Assoc."},{"key":"2023061007214985000_b8","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1214\/009053604000000067","article-title":"Least angle regression","volume":"32","author":"Efron","year":"2004","journal-title":"Ann. Stat."},{"key":"2023061007214985000_b9","article-title":"Gradient directed regularization for linear regression and classification","volume-title":"Technical report","author":"Friedman","year":"2004"},{"key":"2023061007214985000_b10","volume-title":"Computational Learning and Probabilistic Reasoning","author":"Gammerman","year":"1996"},{"key":"2023061007214985000_b11","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1155\/JBB.2005.147","article-title":"Classification and selection of biomarkers in genomic data using LASSO","volume":"2","author":"Ghosh","year":"2005","journal-title":"J. Biomed. 
Biotechnol."},{"key":"2023061007214985000_b12","article-title":"Threshold gradient descent method for censored data regression with applications in pharmacogenomics","author":"Gui","year":"2005"},{"key":"2023061007214985000_b13","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1016\/0304-4076(87)90030-3","article-title":"Non-parametric analysis of a generalized regression model","volume":"35","author":"Han","year":"1987","journal-title":"J. Econometrics"},{"key":"2023061007214985000_b14","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-21606-5","volume-title":"The Elements of Statistical Learning","author":"Hastie","year":"2001"},{"key":"2023061007214985000_b15","doi-asserted-by":"crossref","first-page":"505","DOI":"10.2307\/2951582","article-title":"A smoothed maximum score estimator for the binary response model","volume":"60","author":"Horowitz","year":"1992","journal-title":"Econometrica"},{"key":"2023061007214985000_b16","article-title":"Additive risk models for survival data with high dimensional covariates","volume-title":"Biometrics","author":"Ma","year":"2005"},{"key":"2023061007214985000_b17","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1177\/0272989X9901900110","article-title":"Three-way ROCs","volume":"19","author":"Mossman","year":"1999","journal-title":"Med. Decis. 
Making"},{"key":"2023061007214985000_b18","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1093\/bioinformatics\/18.1.39","article-title":"Tumor classification by partial least squares using microarray gene expression data","volume":"18","author":"Nguyen","year":"2002","journal-title":"Bioinformatics"},{"key":"2023061007214985000_b19","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198509844.001.0001","volume-title":"The Statistical Evaluation of Medical Tests for Classification and Prediction","author":"Pepe","year":"2003"},{"key":"2023061007214985000_b20","doi-asserted-by":"crossref","first-page":"882","DOI":"10.1093\/aje\/kwh101","article-title":"Limitations of the odds ratio in gauging the performance of a diagnostic, prognostic, or screening marker","volume":"159","author":"Pepe","year":"2004","journal-title":"Am. J. Epidemiol."},{"key":"2023061007214985000_b21","article-title":"Combining predictors for classification using the area under the ROC curve","author":"Pepe","year":"2005","journal-title":"Biometrics"},{"key":"2023061007214985000_b22","doi-asserted-by":"crossref","first-page":"3185","DOI":"10.1093\/bioinformatics\/bth383","article-title":"Systematic benchmarking of microarray data classification: assessing the role of non-linearity and dimensionality reduction","volume":"17","author":"Pochet","year":"2004","journal-title":"Bioinformatics"},{"key":"2023061007214985000_b23","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1023\/A:1024099825458","article-title":"Tree induction for probability based rankings","volume":"52","author":"Provost","year":"2003","journal-title":"Mach. 
Learning"},{"key":"2023061007214985000_b24","article-title":"R: a language and environment for statistical computing","author":"R Development Core Team","year":"2005"},{"key":"2023061007214985000_b25","article-title":"Prediction and uncertainty in the analysis of gene expression profiles","author":"Spang","year":"2001"},{"key":"2023061007214985000_b26","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. B"},{"key":"2023061007214985000_b27","doi-asserted-by":"crossref","DOI":"10.1137\/1.9781611970128","article-title":"Spline models for observational data","author":"Wahba","year":"1990"},{"key":"2023061007214985000_b28","doi-asserted-by":"crossref","first-page":"11562","DOI":"10.1073\/pnas.201162998","article-title":"Predicting the clinical status of human breast cancer by using gene expression profiles","volume":"98","author":"West","year":"2001","journal-title":"Proc. Natl Acad. Sci. 
USA"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/24\/4356\/50566187\/bioinformatics_21_24_4356.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/24\/4356\/50566187\/bioinformatics_21_24_4356.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,5]],"date-time":"2025-01-05T01:35:47Z","timestamp":1736040947000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/21\/24\/4356\/180325"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,10,18]]},"references-count":28,"journal-issue":{"issue":"24","published-print":{"date-parts":[[2005,12,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti724","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2005,12,15]]},"published":{"date-parts":[[2005,10,18]]}}}