{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T16:42:59Z","timestamp":1775666579719,"version":"3.50.1"},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"S10","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2011,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Genomic biomarkers play an increasing role in both preclinical and clinical application. Development of genomic biomarkers with microarrays is an area of intensive investigation. However, despite sustained and continuing effort, developing microarray-based predictive models (i.e., genomics biomarkers) capable of reliable prediction for an observed or measured outcome (i.e., endpoint) of unknown samples in preclinical and clinical practice remains a considerable challenge. No straightforward guidelines exist for selecting a single model that will perform best when presented with unknown samples. In the second phase of the MicroArray Quality Control (MAQC-II) project, 36 analysis teams produced a large number of models for 13 preclinical and clinical endpoints. Before external validation was performed, each team nominated one model per endpoint (referred to here as 'nominated models') from which MAQC-II experts selected 13 'candidate models' to represent the best model for each endpoint. Both the nominated and candidate models from MAQC-II provide benchmarks to assess other methodologies for developing microarray-based predictive models.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Methods<\/jats:title>\n            <jats:p>We developed a simple ensemble method by taking a number of the top performing models from cross-validation and developing an ensemble model for each of the MAQC-II endpoints. We compared the ensemble models with both nominated and candidate models from MAQC-II using blinded external validation.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>For 10 of the 13 MAQC-II endpoints originally analyzed by the MAQC-II data analysis team from the National Center for Toxicological Research (NCTR), the ensemble models achieved equal or better predictive performance than the NCTR nominated models. Additionally, the ensemble models had performance comparable to the MAQC-II candidate models. Most ensemble models also had better performance than the nominated models generated by five other MAQC-II data analysis teams that analyzed all 13 endpoints.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>Our findings suggest that an ensemble method can often attain a higher average predictive performance in an external validation set than a corresponding \u201coptimized\u201d model method. Using an ensemble method to determine a final model is a potentially important supplement to the good modeling practices recommended by the MAQC-II project for developing microarray-based genomic biomarkers.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-12-s10-s3","type":"journal-article","created":{"date-parts":[[2011,10,21]],"date-time":"2011-10-21T07:00:14Z","timestamp":1319180414000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Selecting a single model or combining multiple models for microarray-based classifier development? \u2013 A comparative analysis based on large and diverse datasets generated from the MAQC-II project"],"prefix":"10.1186","volume":"12","author":[{"given":"Minjun","family":"Chen","sequence":"first","affiliation":[]},{"given":"Leming","family":"Shi","sequence":"additional","affiliation":[]},{"given":"Reagan","family":"Kelly","sequence":"additional","affiliation":[]},{"given":"Roger","family":"Perkins","sequence":"additional","affiliation":[]},{"given":"Hong","family":"Fang","sequence":"additional","affiliation":[]},{"given":"Weida","family":"Tong","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2011,10,18]]},"reference":[{"issue":"1-3","key":"4843_CR1","doi-asserted-by":"publisher","first-page":"359","DOI":"10.1016\/S0378-4274(01)00267-3","volume":"120","author":"JF Waring","year":"2001","unstructured":"Waring JF, Ciurlionis R, Jolly RA, Heindel M, Ulrich RG: Microarray analysis of hepatotoxins in vitro reveals a correlation between gene expression profiles and mechanisms of toxicity. Toxicol Lett 2001, 120(1\u20133):359\u2013368. 10.1016\/S0378-4274(01)00267-3","journal-title":"Toxicol Lett"},{"issue":"1","key":"4843_CR2","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1093\/toxsci\/60.1.6","volume":"60","author":"MR Fielden","year":"2001","unstructured":"Fielden MR, Zacharewski TR: Challenges and limitations of gene expression profiling in mechanistic and predictive toxicology. Toxicol Sci 2001, 60(1):6\u201310. 10.1093\/toxsci\/60.1.6","journal-title":"Toxicol Sci"},{"issue":"5439","key":"4843_CR3","doi-asserted-by":"publisher","first-page":"531","DOI":"10.1126\/science.286.5439.531","volume":"286","author":"TR Golub","year":"1999","unstructured":"Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, et al.: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999, 286(5439):531\u2013537. 10.1126\/science.286.5439.531","journal-title":"Science"},{"issue":"9","key":"4843_CR4","doi-asserted-by":"publisher","first-page":"1540","DOI":"10.1038\/sj.bjc.6604329","volume":"98","author":"N Moniaux","year":"2008","unstructured":"Moniaux N, Chakraborty S, Yalniz M, Gonzalez J, Shostrom VK, Standop J, Lele SM, Ouellette M, Pour PM, Sasson AR, et al.: Early diagnosis of pancreatic cancer: neutrophil gelatinase-associated lipocalin as a marker of pancreatic intraepithelial neoplasia. Br J Cancer 2008, 98(9):1540\u20131547. 10.1038\/sj.bjc.6604329","journal-title":"Br J Cancer"},{"issue":"5","key":"4843_CR5","doi-asserted-by":"publisher","first-page":"2226","DOI":"10.1158\/0008-5472.CAN-06-3633","volume":"67","author":"F Huang","year":"2007","unstructured":"Huang F, Reeves K, Han X, Fairchild C, Platero S, Wong TW, Lee F, Shaw P, Clark E: Identification of candidate molecular markers predicting sensitivity in solid tumors to dasatinib: rationale for patient selection. Cancer Res 2007, 67(5):2226\u20132238. 10.1158\/0008-5472.CAN-06-3633","journal-title":"Cancer Res"},{"issue":"25","key":"4843_CR6","doi-asserted-by":"publisher","first-page":"1999","DOI":"10.1056\/NEJMoa021967","volume":"347","author":"MJ van de Vijver","year":"2002","unstructured":"van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, et al.: A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 2002, 347(25):1999\u20132009. 10.1056\/NEJMoa021967","journal-title":"N Engl J Med"},{"issue":"6","key":"4843_CR7","doi-asserted-by":"publisher","first-page":"489","DOI":"10.1038\/nrd1750","volume":"4","author":"N Kaplowitz","year":"2005","unstructured":"Kaplowitz N: Idiosyncratic drug hepatotoxicity. Nat Rev Drug Discov 2005, 4(6):489\u2013499. 10.1038\/nrd1750","journal-title":"Nat Rev Drug Discov"},{"issue":"6871","key":"4843_CR8","doi-asserted-by":"publisher","first-page":"530","DOI":"10.1038\/415530a","volume":"415","author":"LJ van't Veer","year":"2002","unstructured":"van't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, et al.: Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002, 415(6871):530\u2013536. 10.1038\/415530a","journal-title":"Nature"},{"issue":"2","key":"4843_CR9","doi-asserted-by":"publisher","first-page":"147","DOI":"10.1093\/jnci\/djk018","volume":"99","author":"A Dupuy","year":"2007","unstructured":"Dupuy A, Simon RM: Critical review of published microarray studies for cancer outcome and guidelines on statistical analysis and reporting. J Natl Cancer Inst 2007, 99(2):147\u2013157. 10.1093\/jnci\/djk018","journal-title":"J Natl Cancer Inst"},{"issue":"9458","key":"4843_CR10","doi-asserted-by":"publisher","first-page":"488","DOI":"10.1016\/S0140-6736(05)17866-0","volume":"365","author":"S Michiels","year":"2005","unstructured":"Michiels S, Koscielny S, Hill C: Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet 2005, 365(9458):488\u2013492. 10.1016\/S0140-6736(05)17866-0","journal-title":"Lancet"},{"key":"4843_CR11","volume-title":"International Joint Conference on Artificial Intelligence; Montreal IJCAI","author":"R Kohavi","year":"1995","unstructured":"Kohavi R: A study of cross-validation and bootstrap for accuracy estimation and model selection. International Joint Conference on Artificial Intelligence; Montreal IJCAI 1995. Unpaged Unpaged"},{"issue":"5","key":"4843_CR12","doi-asserted-by":"publisher","first-page":"587","DOI":"10.1586\/14737159.3.5.587","volume":"3","author":"R Simon","year":"2003","unstructured":"Simon R: Using DNA microarrays for diagnostic and prognostic prediction. Expert Rev Mol Diagn 2003, 3(5):587\u2013595. 10.1586\/14737159.3.5.587","journal-title":"Expert Rev Mol Diagn"},{"issue":"1","key":"4843_CR13","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1093\/jnci\/95.1.14","volume":"95","author":"R Simon","year":"2003","unstructured":"Simon R, Radmacher MD, Dobbin K, McShane LM: Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. J Natl Cancer Inst 2003, 95(1):14\u201318. 10.1093\/jnci\/95.1.14","journal-title":"J Natl Cancer Inst"},{"issue":"3","key":"4843_CR14","doi-asserted-by":"publisher","first-page":"374","DOI":"10.1093\/bioinformatics\/btg419","volume":"20","author":"UM Braga-Neto","year":"2004","unstructured":"Braga-Neto UM, Dougherty ER: Is cross-validation valid for small-sample microarray classification? Bioinformatics 2004, 20(3):374\u2013380. 10.1093\/bioinformatics\/btg419","journal-title":"Bioinformatics"},{"key":"4843_CR15","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1186\/1471-2105-7-91","volume":"7","author":"S Varma","year":"2006","unstructured":"Varma S, Simon R: Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics 2006, 7: 91. 10.1186\/1471-2105-7-91","journal-title":"BMC Bioinformatics"},{"issue":"10","key":"4843_CR16","doi-asserted-by":"publisher","first-page":"1507","DOI":"10.1038\/sj.onc.1209920","volume":"26","author":"A Naderi","year":"2007","unstructured":"Naderi A, Teschendorff AE, Barbosa-Morais NL, Pinder SE, Green AR, Powe DG, Robertson JF, Aparicio S, Ellis IO, Brenton JD, et al.: A gene-expression signature to predict survival in breast cancer across independent data sets. Oncogene 2007, 26(10):1507\u20131516. 10.1038\/sj.onc.1209920","journal-title":"Oncogene"},{"issue":"18","key":"4843_CR17","doi-asserted-by":"publisher","first-page":"10393","DOI":"10.1073\/pnas.1732912100","volume":"100","author":"C Sotiriou","year":"2003","unstructured":"Sotiriou C, Neo SY, McShane LM, Korn EL, Long PM, Jazaeri A, Martiat P, Fox SB, Harris AL, Liu ET: Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc Natl Acad Sci U S A 2003, 100(18):10393\u201310398. 10.1073\/pnas.1732912100","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"9460","key":"4843_CR18","doi-asserted-by":"publisher","first-page":"671","DOI":"10.1016\/S0140-6736(05)70933-8","volume":"365","author":"Y Wang","year":"2005","unstructured":"Wang Y, Klijn JG, Zhang Y, Sieuwerts AM, Look MP, Yang F, Talantov D, Timmermans M, Meijer-van Gelder ME, Yu J, et al.: Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 2005, 365(9460):671\u2013679.","journal-title":"Lancet"},{"issue":"2","key":"4843_CR19","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1093\/bioinformatics\/bth469","volume":"21","author":"L Ein-Dor","year":"2005","unstructured":"Ein-Dor L, Kela I, Getz G, Givol D, Domany E: Outcome signature genes in breast cancer: is there a unique set? Bioinformatics 2005, 21(2):171\u2013178. 10.1093\/bioinformatics\/bth469","journal-title":"Bioinformatics"},{"key":"4843_CR20","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1016\/0169-2070(89)90012-5","volume":"5","author":"R Clemen","year":"1989","unstructured":"Clemen R: Combining forecasts: A review and annotated bibliography. Journal of Forecasting 1989, 5: 559\u2013583. 10.1016\/0169-2070(89)90012-5","journal-title":"Journal of Forecasting"},{"issue":"5","key":"4843_CR21","doi-asserted-by":"publisher","first-page":"1794","DOI":"10.1021\/ci049923u","volume":"44","author":"P Gramatica","year":"2004","unstructured":"Gramatica P, Pilutti P, Papa E: Validated QSAR prediction of OH tropospheric degradation of VOCs: splitting into training-test sets and consensus modeling. J Chem Inf Comput Sci 2004, 44(5):1794\u20131802. 10.1021\/ci049923u","journal-title":"J Chem Inf Comput Sci"},{"issue":"3 Suppl","key":"4843_CR22","first-page":"S75","volume":"2","author":"AC Tan","year":"2003","unstructured":"Tan AC, Gilbert D: Ensemble machine learning on gene expression data for cancer classification. Appl Bioinformatics 2003, 2(3 Suppl):S75\u201383.","journal-title":"Appl Bioinformatics"},{"issue":"1","key":"4843_CR23","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1016\/j.compbiolchem.2007.01.001","volume":"31","author":"Z Su","year":"2007","unstructured":"Su Z, Hong H, Perkins R, Shao X, Cai W, Tong W: Consensus analysis of multiple classifiers using non-repetitive variables: diagnostic application to microarray gene expression data. Comput Biol Chem 2007, 31(1):48\u201356. 10.1016\/j.compbiolchem.2007.01.001","journal-title":"Comput Biol Chem"},{"issue":"8","key":"4843_CR24","doi-asserted-by":"publisher","first-page":"827","DOI":"10.1038\/nbt.1665","volume":"28","author":"L Shi","year":"2010","unstructured":"Shi L, Campbell G, Jones WD, Campagne F, Wen Z, Walker SJ, Su Z, Chu TM, Goodsaid FM, Pusztai L, et al.: The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. Nat Biotechnol 2010, 28(8):827\u2013838. 10.1038\/nbt.1665","journal-title":"Nat Biotechnol"},{"issue":"1","key":"4843_CR25","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1093\/toxsci\/kfm023","volume":"97","author":"RS Thomas","year":"2007","unstructured":"Thomas RS, Pluta L, Yang L, Halsey TA: Application of genomic biomarkers to predict increased lung tumor incidence in 2-year rodent cancer bioassays. Toxicol Sci 2007, 97(1):55\u201364. 10.1093\/toxsci\/kfm023","journal-title":"Toxicol Sci"},{"issue":"1","key":"4843_CR26","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1093\/toxsci\/kfm156","volume":"99","author":"MR Fielden","year":"2007","unstructured":"Fielden MR, Brennan R, Gollub J: A gene expression biomarker provides early prediction and mechanistic assessment of hepatic tumor induction by nongenotoxic chemicals. Toxicol Sci 2007, 99(1):90\u2013100. 10.1093\/toxsci\/kfm156","journal-title":"Toxicol Sci"},{"issue":"6","key":"4843_CR27","doi-asserted-by":"publisher","first-page":"R100","DOI":"10.1186\/gb-2008-9-6-r100","volume":"9","author":"EK Lobenhofer","year":"2008","unstructured":"Lobenhofer EK, Auman JT, Blackshear PE, Boorman GA, Bushel PR, Cunningham ML, Fostel JM, Gerrish K, Heinloth AN, Irwin RD, et al.: Gene expression response in target organ and whole blood varies as a function of target organ injury phenotype. Genome Biol 2008, 9(6):R100. 10.1186\/gb-2008-9-6-r100","journal-title":"Genome Biol"},{"issue":"26","key":"4843_CR28","doi-asserted-by":"publisher","first-page":"4236","DOI":"10.1200\/JCO.2006.05.6861","volume":"24","author":"KR Hess","year":"2006","unstructured":"Hess KR, Anderson K, Symmans WF, Valero V, Ibrahim N, Mejia JA, Booser D, Theriault RL, Buzdar AU, Dempsey PJ, et al.: Pharmacogenomic predictor of sensitivity to preoperative chemotherapy with paclitaxel and fluorouracil, doxorubicin, and cyclophosphamide in breast cancer. J Clin Oncol 2006, 24(26):4236\u20134244. 10.1200\/JCO.2006.05.6861","journal-title":"J Clin Oncol"},{"issue":"6","key":"4843_CR29","doi-asserted-by":"publisher","first-page":"2020","DOI":"10.1182\/blood-2005-11-013458","volume":"108","author":"F Zhan","year":"2006","unstructured":"Zhan F, Huang Y, Colla S, Stewart JP, Hanamura I, Gupta S, Epstein J, Yaccoby S, Sawyer J, Burington B, et al.: The molecular classification of multiple myeloma. Blood 2006, 108(6):2020\u20132028. 10.1182\/blood-2005-11-013458","journal-title":"Blood"},{"issue":"6","key":"4843_CR30","doi-asserted-by":"publisher","first-page":"2276","DOI":"10.1182\/blood-2006-07-038430","volume":"109","author":"JD Shaughnessy Jr.","year":"2007","unstructured":"Shaughnessy JD Jr., Zhan F, Burington BE, Huang Y, Colla S, Hanamura I, Stewart JP, Kordsmeier B, Randolph C, Williams DR, et al.: A validated gene expression model of high-risk multiple myeloma is defined by deregulated expression of genes mapping to chromosome 1. Blood 2007, 109(6):2276\u20132284. 10.1182\/blood-2006-07-038430","journal-title":"Blood"},{"issue":"31","key":"4843_CR31","doi-asserted-by":"publisher","first-page":"5070","DOI":"10.1200\/JCO.2006.06.1879","volume":"24","author":"A Oberthuer","year":"2006","unstructured":"Oberthuer A, Berthold F, Warnat P, Hero B, Kahlert Y, Spitz R, Ernestus K, Konig R, Haas S, Eils R, et al.: Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification. J Clin Oncol 2006, 24(31):5070\u20135078. 10.1200\/JCO.2006.06.1879","journal-title":"J Clin Oncol"},{"issue":"9","key":"4843_CR32","doi-asserted-by":"publisher","first-page":"5116","DOI":"10.1073\/pnas.091062498","volume":"98","author":"VG Tusher","year":"2001","unstructured":"Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 2001, 98(9):5116\u20135121. 10.1073\/pnas.091062498","journal-title":"Proc Natl Acad Sci U S A"},{"key":"4843_CR33","volume-title":"R Foundation for Statistical computing","author":"RDC Team","year":"2010","unstructured":"Team RDC: R: A language and environment for statistical computing. R Foundation for Statistical computing Vienna, Austria ISBN 3\u2013900051\u201307\u20130; 2010. [http:\/\/www.R-project.org]"},{"key":"4843_CR34","doi-asserted-by":"publisher","first-page":"335","DOI":"10.1007\/3-540-28397-8_36","volume-title":"Data Analysis and Decision Support","author":"C Weihs","year":"2005","unstructured":"Weihs C, Ligges U, Luebke K, Rabbe N: klaR analyzing German business cycle. In Data Analysis and Decision Support. Edited by: Baier, D, Decker, R and Schmitd-Thieme, L. Springer-Verlag, Berlin; 2005:335\u2013343."},{"issue":"1-2","key":"4843_CR35","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s10462-009-9124-7","volume":"33","author":"L Rokach","year":"2010","unstructured":"Rokach L: Ensemble-based classifiers. The Artificial Intelligence Review 2010, 33(1\u20132):1\u201333. 10.1007\/s10462-009-9124-7","journal-title":"The Artificial Intelligence Review"},{"issue":"6","key":"4843_CR36","doi-asserted-by":"publisher","first-page":"755","DOI":"10.1016\/j.jmgm.2006.06.005","volume":"25","author":"P Gramatica","year":"2007","unstructured":"Gramatica P, Giani E, Papa E: Statistical external validation and consensus modeling: a QSPR case study for Koc prediction. J Mol Graph Model 2007, 25(6):755\u2013766. 10.1016\/j.jmgm.2006.06.005","journal-title":"J Mol Graph Model"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-12-S10-S3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T16:56:25Z","timestamp":1630515385000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-12-S10-S3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,10,18]]},"references-count":36,"journal-issue":{"issue":"S10","published-print":{"date-parts":[[2011,12]]}},"alternative-id":["4843"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-12-s10-s3","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,10,18]]},"assertion":[{"value":"18 October 2011","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S3"}}