{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T20:50:23Z","timestamp":1767991823713,"version":"3.49.0"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2011,5,11]],"date-time":"2011-05-11T00:00:00Z","timestamp":1305072000000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BioData Mining"],"published-print":{"date-parts":[[2011,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>The ability to accurately classify cancer patients into risk classes, i.e. to predict the outcome of the pathology on an individual basis, is a key ingredient in making therapeutic decisions. In recent years gene expression data have been successfully used to complement the clinical and histological criteria traditionally used in such prediction. Many \"gene expression signatures\" have been developed, i.e. sets of genes whose expression values in a tumor can be used to predict the outcome of the pathology. Here we investigate the use of several machine learning techniques to classify breast cancer patients using one of such signatures, the well established <jats:italic>70-gene signature<\/jats:italic>.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We show that Genetic Programming performs significantly better than Support Vector Machines, Multilayered Perceptrons and Random Forests in classifying patients from the NKI breast cancer dataset, and comparably to the scoring-based method originally proposed by the authors of the 70-gene signature. Furthermore, Genetic Programming is able to perform an automatic feature selection.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>Since the performance of Genetic Programming is likely to be improvable compared to the out-of-the-box approach used here, and given the biological insight potentially provided by the Genetic Programming solutions, we conclude that Genetic Programming methods are worth further investigation as a tool for cancer patient classification based on gene expression data.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1756-0381-4-12","type":"journal-article","created":{"date-parts":[[2011,5,12]],"date-time":"2011-05-12T06:54:38Z","timestamp":1305183278000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":43,"title":["A comparison of machine learning techniques for survival prediction in breast cancer"],"prefix":"10.1186","volume":"4","author":[{"given":"Leonardo","family":"Vanneschi","sequence":"first","affiliation":[]},{"given":"Antonella","family":"Farinaccio","sequence":"additional","affiliation":[]},{"given":"Giancarlo","family":"Mauri","sequence":"additional","affiliation":[]},{"given":"Marco","family":"Antoniotti","sequence":"additional","affiliation":[]},{"given":"Paolo","family":"Provero","sequence":"additional","affiliation":[]},{"given":"Mario","family":"Giacobini","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2011,5,11]]},"reference":[{"issue":"8","key":"44_CR1","doi-asserted-by":"publisher","first-page":"601","DOI":"10.1038\/nrg2137","volume":"8","author":"JR Nevins","year":"2007","unstructured":"Nevins JR, Potti A: Mining gene expression profiles: expression signatures as cancer phenotypes. Nat Rev Genet. 2007, 8 (8): 601-609.","journal-title":"Nat Rev Genet"},{"issue":"6871","key":"44_CR2","doi-asserted-by":"publisher","first-page":"530","DOI":"10.1038\/415530a","volume":"415","author":"LJ van 't Veer","year":"2002","unstructured":"van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AAM, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH: Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002, 415 (6871): 530-536. 10.1038\/415530a.","journal-title":"Nature"},{"issue":"6","key":"44_CR3","doi-asserted-by":"publisher","first-page":"475","DOI":"10.1142\/S0129065705000396","volume":"15","author":"F Chu","year":"2005","unstructured":"Chu F, Wang L: Applications of support vector machines to cancer classification with microarray data. Int J Neural Syst. 2005, 15 (6): 475-484. 10.1142\/S0129065705000396.","journal-title":"Int J Neural Syst"},{"issue":"1-2","key":"44_CR4","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1016\/S0303-2647(03)00138-2","volume":"72","author":"K Deb","year":"2003","unstructured":"Deb K, Reddy AR: Reliable classification of two-class cancer data using evolutionary algorithms. Biosystems. 2003, 72 (1-2): 111-129. 10.1016\/S0303-2647(03)00138-2.","journal-title":"Biosystems"},{"key":"44_CR5","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1093\/bioinformatics\/19.1.45","volume":"19","author":"JM Deutsch","year":"2003","unstructured":"Deutsch JM: Evolutionary algorithms for finding optimal gene sets in microarray prediction. Bioinformatics. 2003, 19: 45-52. 10.1093\/bioinformatics\/19.1.45.","journal-title":"Bioinformatics"},{"issue":"3","key":"44_CR6","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1023\/B:GENP.0000030196.55525.f7","volume":"5","author":"WB Langdon","year":"2004","unstructured":"Langdon WB, Buxton BF: Genetic Programming for Mining DNA Chip Data from Cancer Patients. Genetic Programming and Evolvable Machines. 2004, 5 (3): 251-257.","journal-title":"Genetic Programming and Evolvable Machines"},{"issue":"3","key":"44_CR7","doi-asserted-by":"publisher","first-page":"208","DOI":"10.1016\/j.biosystems.2005.07.003","volume":"82","author":"TK Paul","year":"2005","unstructured":"Paul TK, Iba H: Gene selection for classification of cancers using probabilistic model building genetic algorithm. Biosystems. 2005, 82 (3): 208-225. 10.1016\/j.biosystems.2005.07.003.","journal-title":"Biosystems"},{"issue":"4","key":"44_CR8","doi-asserted-by":"publisher","first-page":"292","DOI":"10.1593\/neo.07121","volume":"9","author":"J Yu","year":"2007","unstructured":"Yu J, Yu J, Almal AA, Dhanasekaran SM, Ghosh D, Worzel WP, Chinnaiyan AM: Feature Selection and Molecular Classification of Cancer Using Genetic Programming. Neoplasia. 2007, 9 (4): 292-303. 10.1593\/neo.07121.","journal-title":"Neoplasia"},{"issue":"25","key":"44_CR9","doi-asserted-by":"publisher","first-page":"1999","DOI":"10.1056\/NEJMoa021967","volume":"347","author":"MJ van de Vijver","year":"2002","unstructured":"van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AAM, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, Parrish M, Atsma D, Witteveen A, Glas A, Delahaye L, van der Velde T, Bartelink H, Rodenhuis S, Rutgers ET, Friend SH, Bernards R: A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med. 2002, 347 (25): 1999-2009. 10.1056\/NEJMoa021967.","journal-title":"N Engl J Med"},{"issue":"4","key":"44_CR10","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1016\/S0306-4379(02)00072-8","volume":"28","author":"Y Lu","year":"2003","unstructured":"Lu Y, Han J: Cancer classification using gene expression data. Inf Syst. 2003, 28 (4): 243-268. 10.1016\/S0306-4379(02)00072-8.","journal-title":"Inf Syst"},{"key":"44_CR11","volume-title":"Machine learning, neural and statistical classification","author":"D Michie","year":"1994","unstructured":"Michie D, Spiegelhalter D, Taylor C: Machine learning, neural and statistical classification. 1994, Prentice Hall"},{"key":"44_CR12","doi-asserted-by":"publisher","first-page":"6745","DOI":"10.1073\/pnas.96.12.6745","volume":"96","author":"U Alon","year":"1999","unstructured":"Alon U, Barkai N, Notterman D, Gish K, Ybarra S, Mack D, Levine AJ: Broad patterns of gene expression revealed by clustering analysis of tumour and normal colon tissues probed by oligonucleotide arrays. Proc Nat Acad Sci USA. 1999, 96: 6745-6750. 10.1073\/pnas.96.12.6745.","journal-title":"Proc Nat Acad Sci USA"},{"issue":"16","key":"44_CR13","doi-asserted-by":"publisher","first-page":"2131","DOI":"10.1093\/bioinformatics\/btg296","volume":"19","author":"A Hsu","year":"2003","unstructured":"Hsu A, Tang S, Halgamuge S: An unsupervised hierarchical dynamic self-organizing approach to cancer class discovery and marker gene identification in microarray data. Bioinformatics. 2003, 19 (16): 2131-40. 10.1093\/bioinformatics\/btg296.","journal-title":"Bioinformatics"},{"key":"44_CR14","doi-asserted-by":"publisher","first-page":"389","DOI":"10.1023\/A:1012487302797","volume":"46","author":"I Guyon","year":"2002","unstructured":"Guyon I, Weston J, Barnhill S, Vapnik V: Gene selection for cancer classification using support vector machines. Machine Learning. 2002, 46: 389-422. 10.1023\/A:1012487302797.","journal-title":"Machine Learning"},{"key":"44_CR15","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1007\/978-3-540-71783-6_9","volume":"4447","author":"JCH Hernandez","year":"2007","unstructured":"Hernandez JCH, Duval B, Hao J: A genetic embedded approach for gene selection and classification of microarray data. Lecture Notes in Computer Science. 2007, 4447: 90-101. 10.1007\/978-3-540-71783-6_9.","journal-title":"Lecture Notes in Computer Science"},{"key":"44_CR16","doi-asserted-by":"publisher","first-page":"601","DOI":"10.1089\/106652700750050961","volume":"7","author":"N Friedman","year":"2000","unstructured":"Friedman N, Linial M, Nachmann I, Peer D: Using Bayesian Networks to Analyze Expression Data. J Computational Biology. 2000, 7: 601-620. 10.1089\/106652700750050961.","journal-title":"J Computational Biology"},{"key":"44_CR17","volume-title":"Adaptation in Natural and Artificial Systems","author":"JH Holland","year":"1975","unstructured":"Holland JH: Adaptation in Natural and Artificial Systems. 1975, Ann Arbor, Michigan: The University of Michigan Press"},{"key":"44_CR18","volume-title":"Genetic Algorithms in Search, Optimization and Machine Learning","author":"DE Goldberg","year":"1989","unstructured":"Goldberg DE: Genetic Algorithms in Search, Optimization and Machine Learning. 1989, Addison-Wesley"},{"key":"44_CR19","doi-asserted-by":"publisher","first-page":"2691","DOI":"10.1093\/bioinformatics\/bti419","volume":"21","author":"J Liu","year":"2005","unstructured":"Liu J, Cutler G, Li W, Pan Z, Peng S, Hoey T, Chen L, Ling XB: Multiclass cancer classification and biomarker discovery using GA-based algorithms. Bioinformatics. 2005, 21: 2691-2697. 10.1093\/bioinformatics\/bti419.","journal-title":"Bioinformatics"},{"key":"44_CR20","first-page":"372","volume":"2167","author":"J Moore","year":"2001","unstructured":"Moore J, Parker J, Hahn L: Symbolic discriminant analysis for mining gene expression patterns. Lecture Notes in Artificial Intelligence. 2001, 2167: 372-381.","journal-title":"Lecture Notes in Artificial Intelligence"},{"key":"44_CR21","volume-title":"Genetic Programming based DNA Microarray Analysis for classification of tumour tissues","author":"M Rosskopf","year":"2007","unstructured":"Rosskopf M, Schmidt H, Feldkamp U, Banzhaf W: Genetic Programming based DNA Microarray Analysis for classification of tumour tissues. 2007, Tech. Rep. Technical Report 2007-03, Memorial University of Newfoundland"},{"key":"44_CR22","volume-title":"Proceedings Intelligent Data Analysis in Medicine and Pharmacology","author":"C Bojarczuk","year":"2001","unstructured":"Bojarczuk C, Lopes H, Freitas A: Data mining with constrained-syntax genetic programming: applications to medical data sets. Proceedings Intelligent Data Analysis in Medicine and Pharmacology. 2001, 1:"},{"key":"44_CR23","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1016\/j.artmed.2005.06.002","volume":"36","author":"J Hong","year":"2006","unstructured":"Hong J, Cho S: The classification of cancer based on DNA microarray data that uses diverse ensemble genetic programming. Artif Intell Med. 2006, 36: 43-58. 10.1016\/j.artmed.2005.06.002.","journal-title":"Artif Intell Med"},{"issue":"38","key":"44_CR24","doi-asserted-by":"publisher","first-page":"13550","DOI":"10.1073\/pnas.0506230102","volume":"102","author":"LD Miller","year":"2005","unstructured":"Miller LD, Smeds J, George J, Vega VB, Vergara L, Ploner A, Pawitan Y, Hall P, Klaar S, Liu ET, Bergh J: An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proc Natl Acad Sci USA. 2005, 102 (38): 13550-13555. 10.1073\/pnas.0506230102.","journal-title":"Proc Natl Acad Sci USA"},{"key":"44_CR25","doi-asserted-by":"crossref","unstructured":"Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R: NCBI GEO: mining tens of millions of expression profiles-database and tools update. Nucleic Acids Res. 2007, D760-D765. 35 Database","DOI":"10.1093\/nar\/gkl887"},{"key":"44_CR26","volume-title":"Genetic Programming","author":"JR Koza","year":"1992","unstructured":"Koza JR: Genetic Programming. 1992, Cambridge, Massachusetts: The MIT Press"},{"key":"44_CR27","unstructured":"Poli R, Langdon WB, McPhee NF: A field guide to genetic programming. Published via http:\/\/lulu.com and freely available at http:\/\/www.gp-field-guide.org.uk 2008. [(With contributions by J. R. Koza)]"},{"key":"44_CR28","volume-title":"Ph.D. thesis, Faculty of Sciences","author":"L Vanneschi","year":"2004","unstructured":"Vanneschi L: Theory and Practice for Efficient Genetic Programming. Ph.D. thesis, Faculty of Sciences. 2004, University of Lausanne, Switzerland"},{"key":"44_CR29","volume-title":"On the Origin of Species by Means of Natural Selection","author":"C Darwin","year":"1859","unstructured":"Darwin C: On the Origin of Species by Means of Natural Selection. 1859, John Murray"},{"key":"44_CR30","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1145\/1143997.1144042","volume-title":"Proceedings of the 8th annual conference on Genetic and Evolutionary Computation","author":"F Archetti","year":"2006","unstructured":"Archetti F, Lanzeni S, Messina E, Vanneschi L: Genetic programming for human oral bioavailability of drugs. Proceedings of the 8th annual conference on Genetic and Evolutionary Computation. Edited by: Cattolico M et al. 2006, Seattle, Washington, USA, 255-262."},{"key":"44_CR31","first-page":"11","volume-title":"Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. Proceedings of the Fifth European Conference, EvoBIO 2007, Lecture Notes in Computer Science, LNCS 4447","author":"F Archetti","year":"2007","unstructured":"Archetti F, Messina E, Lanzeni S, Vanneschi L: Genetic Programming and other Machine Learning approaches to predict Median Oral Lethal Dose (LD50) and Plasma Protein Binding levels (%PPB) of drugs. Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. Proceedings of the Fifth European Conference, EvoBIO 2007, Lecture Notes in Computer Science, LNCS 4447. Edited by: Marchiori E et al. 2007, Springer, Berlin, Heidelberg, New York, 11-23."},{"issue":"4","key":"44_CR32","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1007\/s10710-007-9040-z","volume":"8","author":"F Archetti","year":"2007","unstructured":"Archetti F, Messina E, Lanzeni S, Vanneschi L: Genetic Programming for Computational Pharmacokinetics in Drug Discovery and Development. Genetic Programming and Evolvable Machines. 2007, 8 (4): 17-26.","journal-title":"Genetic Programming and Evolvable Machines"},{"key":"44_CR33","volume-title":"GPLAB - A Genetic Programming Toolbox for MATLAB, version 3.0","author":"S Silva","year":"2007","unstructured":"Silva S: GPLAB - A Genetic Programming Toolbox for MATLAB, version 3.0. 2007"},{"key":"44_CR34","volume-title":"Statistical Learning Theory","author":"V Vapnik","year":"1998","unstructured":"Vapnik V: Statistical Learning Theory. 1998, Wiley, New York, NY"},{"key":"44_CR35","volume-title":"Advances in Kernel Methods - Support Vector Learning","author":"J Platt","year":"1998","unstructured":"Platt J: Fast Training of Support Vector Machines using Sequential Minimal Optimization. Advances in Kernel Methods - Support Vector Learning. 1998"},{"key":"44_CR36","volume-title":"A multi-task machine learning software developed by Waikato University","author":"Weka","year":"2006","unstructured":"Weka: A multi-task machine learning software developed by Waikato University. 2006, [See http:\/\/www.cs.waikato.ac.nz\/ml\/weka]"},{"key":"44_CR37","volume-title":"Neural Networks: a comprehensive foundation","author":"S Haykin","year":"1999","unstructured":"Haykin S: Neural Networks: a comprehensive foundation. 1999, Prentice Hall, London"},{"key":"44_CR38","volume-title":"Classification and Regression Trees","author":"L Breiman","year":"1984","unstructured":"Breiman L, Friedman J, Olshen R, Stone C: Classification and Regression Trees. 1984, Belmont, California, Wadsworth International Group"},{"key":"44_CR39","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L: Random Forests. Machine Learning. 2001, 45: 5-32. 10.1023\/A:1010933404324.","journal-title":"Machine Learning"}],"container-title":["BioData Mining"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/1756-0381-4-12.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/1756-0381-4-12\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/1756-0381-4-12","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1756-0381-4-12.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T13:57:12Z","timestamp":1630504632000},"score":1,"resource":{"primary":{"URL":"https:\/\/biodatamining.biomedcentral.com\/articles\/10.1186\/1756-0381-4-12"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,5,11]]},"references-count":39,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2011,12]]}},"alternative-id":["44"],"URL":"https:\/\/doi.org\/10.1186\/1756-0381-4-12","relation":{},"ISSN":["1756-0381"],"issn-type":[{"value":"1756-0381","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,5,11]]},"assertion":[{"value":"3 June 2010","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 May 2011","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 May 2011","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"12"}}