{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,6]],"date-time":"2025-05-06T13:06:30Z","timestamp":1746536790784},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"S13","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2011,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>The collection of gene expression profiles from DNA microarrays and their analysis with pattern recognition algorithms is a powerful technology applied to several biological problems. Common pattern recognition systems classify samples assigning them to a set of known classes. However, in a clinical diagnostics setup, novel and unknown classes (new pathologies) may appear and one must be able to reject those samples that do not fit the trained model. The problem of implementing a rejection option in a multi-class classifier has not been widely addressed in the statistical literature. Gene expression profiles represent a critical case study since they suffer from the curse of dimensionality problem that negatively reflects on the reliability of both traditional rejection models and also more recent approaches such as one-class classifiers.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>This paper presents a set of empirical decision rules that can be used to implement a rejection option in a set of multi-class classifiers widely used for the analysis of gene expression profiles. In particular, we focus on the classifiers implemented in the R Language and Environment for Statistical Computing (R for short in the remaining of this paper). 
The main contribution of the proposed rules is their simplicity, which enables an easy integration with available data analysis environments. Since in the definition of a rejection model tuning of the involved parameters is often a complex and delicate task, in this paper we exploit an evolutionary strategy to automate this process. This allows the final user to maximize the rejection accuracy with minimum manual intervention.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>This paper shows how the use of simple decision rules can be used to help the use of complex machine learning algorithms in real experimental setups. The proposed approach is almost completely automated and therefore a good candidate for being integrated in data analysis flows in labs where the machine learning expertise required to tune traditional classifiers might not be available.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-12-s13-s3","type":"journal-article","created":{"date-parts":[[2011,12,2]],"date-time":"2011-12-02T05:01:57Z","timestamp":1322802117000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Building gene expression profile classifiers with a simple and efficient rejection option in R"],"prefix":"10.1186","volume":"12","author":[{"given":"Alfredo","family":"Benso","sequence":"first","affiliation":[]},{"given":"Stefano","family":"Di 
Carlo","sequence":"additional","affiliation":[]},{"given":"Gianfranco","family":"Politano","sequence":"additional","affiliation":[]},{"given":"Alessandro","family":"Savino","sequence":"additional","affiliation":[]},{"given":"Hafeez","family":"Hafeezurrehman","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2011,11,30]]},"reference":[{"key":"4919_CR1","doi-asserted-by":"publisher","first-page":"189","DOI":"10.1186\/1471-2105-12-189","volume":"12","author":"D Ko","year":"2011","unstructured":"Ko D, Windle B: Enriching for correct prediction of biological processes using a combination of diverse classifiers. BMC Bioinformatics 2011, 12: 189. 10.1186\/1471-2105-12-189","journal-title":"BMC Bioinformatics"},{"issue":"3","key":"4919_CR2","doi-asserted-by":"publisher","first-page":"95","DOI":"10.6026\/97320630006095","volume":"6","author":"S Selvaraj","year":"2011","unstructured":"Selvaraj S, Natarajan J: Microarray data analysis and mining tools. Bioinformation 2011, 6(3):95\u20139. 10.6026\/97320630006095","journal-title":"Bioinformation"},{"key":"4919_CR3","volume-title":"Bioinformatics","author":"L Tolo\u015fi","year":"2011","unstructured":"Tolo\u015fi L, Lengauer T: Classification with correlated features: unreliability of feature ranking and solutions. Bioinformatics 2011."},{"key":"4919_CR4","volume-title":"Bioinformatics","author":"LA Dalton","year":"2011","unstructured":"Dalton LA, Dougherty ER: Application of the Bayesian MMSE estimator for classification error to gene-expression microarray data. Bioinformatics 2011."},{"key":"4919_CR5","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1186\/1756-0381-4-12","volume":"4","author":"L Vanneschi","year":"2011","unstructured":"Vanneschi L, Farinaccio A, Mauri G, Antoniotti M, Provero P, Giacobini M: A comparison of machine learning techniques for survival prediction in breast cancer. BioData Min 2011, 4: 12. 
10.1186\/1756-0381-4-12","journal-title":"BioData Min"},{"key":"4919_CR6","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1371\/journal.pbio.0000015","volume":"1","author":"G Gibson","year":"2003","unstructured":"Gibson G: Microarray analysis. PLoS Biology 2003, 1: 28\u201329.","journal-title":"PLoS Biology"},{"key":"4919_CR7","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1093\/bib\/bbk007","volume":"7","author":"P Larranaga","year":"2006","unstructured":"Larranaga P, Calvo B, Santana R, Bielza C, Galdiano J, Inza I, Lozano JA, Armananzas R, Santafe ad Perez AG, Robles V: Machine learning in bioinformatics. Briefings in Bioinformatics 2006, 7: 86\u2013112. 10.1093\/bib\/bbk007","journal-title":"Briefings in Bioinformatics"},{"issue":"8","key":"4919_CR8","doi-asserted-by":"publisher","first-page":"E41","DOI":"10.1093\/nar\/29.8.e41","volume":"29","author":"H Yue","year":"2009","unstructured":"Yue H, Eastman P, Wang B, Minor J, Doctolero M, Nuttall R, Stack R, Becker J, Montgomery J, Vainer M, Johnston R: An evaluation of the performance of cDNA microarrays for detecting changes in global mRNA expression. Nucl. Acids Res 2009, 29(8):E41\u20131.","journal-title":"Nucl. Acids Res"},{"key":"4919_CR9","doi-asserted-by":"publisher","first-page":"319","DOI":"10.1186\/1471-2105-9-319","volume":"9","author":"A Statnikov","year":"2008","unstructured":"Statnikov A, Wang L, Aliferis C: A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification. BMC Bioinformatics 2008, 9: 319. 10.1186\/1471-2105-9-319","journal-title":"BMC Bioinformatics"},{"issue":"3","key":"4919_CR10","doi-asserted-by":"publisher","first-page":"577","DOI":"10.1109\/TCBB.2010.90","volume":"8","author":"A Benso","year":"2011","unstructured":"Benso A, Di Carlo S, Politano G: A cDNA microarray gene expression data classifier for clinical diagnostics based on graph theory. 
IEEE\/ACM Transactions on Computational Biology and Bioinformatics 2011, 8(3):577\u2013591.","journal-title":"IEEE\/ACM Transactions on Computational Biology and Bioinformatics"},{"key":"4919_CR11","volume-title":"R: A Language and Environment for Statistical Computing","author":"R Development Core Team","year":"2010","unstructured":"R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria; 2010.http:\/\/www.R-project.org . ISBN 3-900051-07-0"},{"key":"4919_CR12","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1109\/TIT.1970.1054406","volume":"16","author":"C Chow","year":"1970","unstructured":"Chow C: On optimum recognition error and reject tradeoff. IEEE Transactions on Information Theory 1970, 16: 41\u201346. 10.1109\/TIT.1970.1054406","journal-title":"IEEE Transactions on Information Theory"},{"key":"4919_CR13","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511812651","volume-title":"Pattern Recognition and Neural Networks","author":"BD Ripley","year":"1996","unstructured":"Ripley BD: Pattern Recognition and Neural Networks. Cambridge University Press; 1996."},{"key":"4919_CR14","doi-asserted-by":"publisher","first-page":"1698","DOI":"10.1214\/009053604000000058","volume":"32","author":"Y Freund","year":"2004","unstructured":"Freund Y, Mansour Y, Schapire RE: Generalization bounds for averaged classifiers. The annals of statistics 2004, 32: 1698\u20131722. 10.1214\/009053604000000058","journal-title":"The annals of statistics"},{"key":"4919_CR15","first-page":"1823","volume":"9","author":"PL Bartlett","year":"2008","unstructured":"Bartlett PL, Wegkamp MH: Classification with a reject option using a hinge loss. J. Mach. Learn. Res 2008, 9: 1823\u20131840.","journal-title":"J. Mach. Learn. 
Res"},{"issue":"10","key":"4919_CR16","doi-asserted-by":"publisher","first-page":"1565","DOI":"10.1016\/j.patrec.2008.03.010","volume":"29","author":"D Tax","year":"2008","unstructured":"Tax D, Duin R: Growing a multi-class classifier with a reject option. Pattern Recognition Letters 2008, 29(10):1565\u20131570. 10.1016\/j.patrec.2008.03.010","journal-title":"Pattern Recognition Letters"},{"key":"4919_CR17","volume-title":"Pattern Classification","author":"RO Duda","year":"2000","unstructured":"Duda RO, Hart PE, Stork DG: Pattern Classification. 2nd edition. Wiley-Interscience; 2000.","edition":"2"},{"key":"4919_CR18","doi-asserted-by":"publisher","first-page":"2099","DOI":"10.1016\/S0031-3203(00)00059-5","volume":"33","author":"G Fumera","year":"2000","unstructured":"Fumera G, Roli F, Giacinto G: Reject option with multiple thresholds. Pattern Recognition 2000, 33: 2099\u20132101. 10.1016\/S0031-3203(00)00059-5","journal-title":"Pattern Recognition"},{"key":"4919_CR19","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1007\/11532323_7","volume":"3594\/2005","author":"E Spinosa","year":"2005","unstructured":"Spinosa E, de Carvalho A: Combining one-class classifiers for robust novelty detection in gene expression data. Advances in Bioinformatics and Computational Biology 2005, 3594\/2005: 54\u201364.","journal-title":"Advances in Bioinformatics and Computational Biology"},{"issue":"5","key":"4919_CR20","doi-asserted-by":"publisher","first-page":"1392","DOI":"10.1021\/ci049726v","volume":"45","author":"X Yun","year":"2005","unstructured":"Yun X, Brereton RG: Diagnostic pattern recognition on gene-expression profile data by using one-class classification. Journal of chemical information and modeling 2005, 45(5):1392\u20131401. 
10.1021\/ci049726v","journal-title":"Journal of chemical information and modeling"},{"issue":"7-9","key":"4919_CR21","doi-asserted-by":"publisher","first-page":"1859","DOI":"10.1016\/j.neucom.2008.05.003","volume":"72","author":"P Juszczak","year":"2009","unstructured":"Juszczak P, Tax DM, Kalska EP, Duin RP: Minimum spanning tree based one-class classifier. Neurocomputing 2009, 72(7\u20139):1859\u20131869. 10.1016\/j.neucom.2008.05.003","journal-title":"Neurocomputing"},{"key":"4919_CR22","doi-asserted-by":"publisher","first-page":"747","DOI":"10.1007\/978-3-540-85567-5_93","volume-title":"KES \u201908: Proceedings of the 12th international conference on Knowledge-Based Intelligent Information and Engineering Systems Part III","author":"V Ges\u00f9","year":"2008","unstructured":"Ges\u00f9 V, Bosco G, Pinello L: A one class classifier for signal identification: a biological case study. In KES \u201908: Proceedings of the 12th international conference on Knowledge-Based Intelligent Information and Engineering Systems Part III. Berlin, Heidelberg: Springer-Verlag; 2008:747\u2013754."},{"key":"4919_CR23","doi-asserted-by":"publisher","first-page":"2481","DOI":"10.1016\/j.sigpro.2003.07.018","volume":"83","author":"M Markou","year":"2003","unstructured":"Markou M, Singh S: Novelty detection: a review - part 1: statistical approaches. Signal Processing 2003, 83: 2481\u20132497. 
10.1016\/j.sigpro.2003.07.018","journal-title":"Signal Processing"},{"key":"4919_CR24","unstructured":"cDNA Stanford\u2019s Microarray database[http:\/\/smd.stanford.edu\/]"},{"issue":"6769","key":"4919_CR25","doi-asserted-by":"publisher","first-page":"503","DOI":"10.1038\/35000501","volume":"403","author":"AA Alizadeh","year":"2000","unstructured":"Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, Powell JI, Yang L, Marti GE, Moore T, Hudson JJ, Lu L, Lewis DB, Tibshirani R, Sherlock G, Chan WC, Greiner TC, Weisenburger DD, Armitage JO, Warnke R, Levy R, Wilson W, Grever MR, Byrd JC, Botstein D, Brown PO, Staudt LM: Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000, 403(6769):503\u2013511. 10.1038\/35000501","journal-title":"Nature"},{"key":"4919_CR26","volume-title":"BMC Genomics","author":"C Palmer","year":"2006","unstructured":"Palmer C, Diehn M, Alizadeh A, Browncorresponding PO: Cell-type specific gene expression profiles of leukocytes in human peripheral blood. BMC Genomics 2006., 7(115):","edition":"7"},{"issue":"4","key":"4919_CR27","doi-asserted-by":"publisher","first-page":"1926","DOI":"10.1073\/pnas.0437875100","volume":"100","author":"SP Bohen","year":"2003","unstructured":"Bohen SP, Troyanskaya OG, Alter O, Warnke R, Botstein D, Brown PO, Levy R: Variation in gene expression patterns in follicular lymphoma and the response to rituximab. Proc Natl Acad Sci U S A 2003, 100(4):1926\u20131930. 
10.1073\/pnas.0437875100","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"4","key":"4919_CR28","doi-asserted-by":"publisher","first-page":"1291","DOI":"10.1182\/blood-2006-10-049783","volume":"110","author":"L Bullinger","year":"2007","unstructured":"Bullinger L, Rucker FG, Kurz S, Du J, Scholl C, Sander S, Corbacioglu A, Lottaz C, Krauter J, Frohling S, Ganser A, Schlenk RF, Dohner K, Pollack JR, Dohner H: Gene-expression profiling identifies distinct subclasses of core binding factor acute myeloid leukemia. Blood 2007, 110(4):1291\u20131300. 10.1182\/blood-2006-10-049783","journal-title":"Blood"},{"key":"4919_CR29","unstructured":"Tax DMJ: DDTools, the data description toolbox for matlab.[http:\/\/prlab.tudelft.nl\/david-tax\/dd_tools.html]"},{"issue":"5","key":"4919_CR30","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v028.i05","volume":"28","author":"M Kuhn","year":"2008","unstructured":"Kuhn M: Building predictive models in R using the caret package. Journal of Statistical Software 2008, 28(5):1\u201326.","journal-title":"Journal of Statistical Software"},{"key":"4919_CR31","first-page":"312","volume-title":"Proceedings 6th International Conference on Genetic Algorithms","author":"N Hansen","year":"1995","unstructured":"Hansen N, Ostermeier A, Gawelczyk A: On the adaptation of arbitrary normal mutation distributions in evolution strategies: the generating set adaptation. In Proceedings 6th International Conference on Genetic Algorithms. Morgan Kaufmann; 1995:312\u2013317."},{"key":"4919_CR32","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1162\/106365601750190398","volume":"9","author":"N Hansen","year":"2001","unstructured":"Hansen N, Ostermeier A: Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation 2001, 9: 159\u2013195. 
10.1162\/106365601750190398","journal-title":"Evolutionary Computation"},{"key":"4919_CR33","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1162\/106365603321828970","volume":"11","author":"N Hansen","year":"2003","unstructured":"Hansen N, M\u00fcller SD, Petrosnf PK: Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES). Evolutionary Computation 2003, 11: 1\u201318. 10.1162\/106365603321828970","journal-title":"Evolutionary Computation"},{"key":"4919_CR34","first-page":"1769","volume":"2","author":"A Auger","year":"2005","unstructured":"Auger A, Hansen N: A restart CMA evolution strategy with increasing population size. Proc. IEEE Congress Evolutionary Computation 2005, 2: 1769\u20131776.","journal-title":"Proc. IEEE Congress Evolutionary Computation"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-12-S13-S3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T17:37:38Z","timestamp":1630517858000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-12-S13-S3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,11,30]]},"references-count":34,"journal-issue":{"issue":"S13","published-print":{"date-parts":[[2011,12]]}},"alternative-id":["4919"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-12-s13-s3","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,11,30]]},"assertion":[{"value":"30 November 2011","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S3"}}