{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T20:24:28Z","timestamp":1774124668040,"version":"3.50.1"},"reference-count":33,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>A recent publication described a supervised classification method for microarray data: Between Group Analysis (BGA). This method, which is based on multivariate ordination of groups, proved to be very efficient for both classification of samples into pre-defined groups and disease class prediction of new unknown samples. Classification and prediction with BGA are classically performed using the whole set of genes and no variable selection is required. We hypothesize that an optimized selection of highly discriminating genes might improve the prediction power of BGA.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We propose an optimized between-group classification (OBC) which uses a jackknife-based gene selection procedure. OBC emphasizes classification accuracy rather than feature selection. OBC is a backward optimization procedure that maximizes the percentage of between group inertia by removing the least influential genes one by one from the analysis. This selects a subset of highly discriminative genes that optimizes disease class prediction. 
We applied OBC to four datasets and compared it to other classification methods.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>OBC considerably improved the classification and predictive accuracy of BGA, when assessed using independent data sets and leave-one-out cross-validation.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Availability<\/jats:title>\n            <jats:p>The R code is freely available [see Additional file 1] as well as supplementary information [see Additional file 2].<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-6-239","type":"journal-article","created":{"date-parts":[[2005,9,30]],"date-time":"2005-09-30T18:14:11Z","timestamp":1128104051000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data"],"prefix":"10.1186","volume":"6","author":[{"given":"Florent","family":"Baty","sequence":"first","affiliation":[]},{"given":"Michel P","family":"Bihl","sequence":"additional","affiliation":[]},{"given":"Guy","family":"Perri\u00e8re","sequence":"additional","affiliation":[]},{"given":"Aed\u00edn C","family":"Culhane","sequence":"additional","affiliation":[]},{"given":"Martin H","family":"Brutsche","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2005,9,28]]},"reference":[{"key":"564_CR1","volume-title":"Genome Information Systems and Technology","author":"L Li","year":"2001","unstructured":"Li L, Pedersen LG, Darden TA, Weinberg CR: Class prediction and discovery based on gene expression data. 
Genome Information Systems and Technology 2001."},{"issue":"12","key":"564_CR2","doi-asserted-by":"publisher","first-page":"R83","DOI":"10.1186\/gb-2003-4-12-r83","volume":"4","author":"KY Yeung","year":"2003","unstructured":"Yeung KY, Bumgarner RE: Multiclass classification of microarray data with repeated measurements: application to cancer. Genome Biol 2003, 4(12):R83. 10.1186\/gb-2003-4-12-r83","journal-title":"Genome Biol"},{"issue":"2\u20133","key":"564_CR3","doi-asserted-by":"publisher","first-page":"215","DOI":"10.1089\/1066527041410445","volume":"11","author":"W Li","year":"2004","unstructured":"Li W, Sun F, Grosse I: Extreme value distribution based gene selection criteria for discriminant microarray data analysis using logistic regression. J Comput Biol 2004, 11(2\u20133):215\u2013226. 10.1089\/1066527041410445","journal-title":"J Comput Biol"},{"key":"564_CR4","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1093\/nar\/gki144","volume":"33","author":"Y Tan","year":"2005","unstructured":"Tan Y, Shi L, Tong W, Wang C: Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data. Nucleic Acids Res 2005, 33: 56\u201365. 10.1093\/nar\/gki144","journal-title":"Nucleic Acids Res"},{"issue":"6","key":"564_CR5","doi-asserted-by":"publisher","first-page":"1264","DOI":"10.2144\/00296bc02","volume":"29","author":"M Xiong","year":"2000","unstructured":"Xiong M, Jin L, Li W, Boerwinkle E: Computational methods for gene expression-based tumor classification. Biotechniques 2000, 29(6):1264\u20138. 1270","journal-title":"Biotechniques"},{"issue":"12","key":"564_CR6","doi-asserted-by":"publisher","first-page":"1131","DOI":"10.1093\/bioinformatics\/17.12.1131","volume":"17","author":"L Li","year":"2001","unstructured":"Li L, Weinberg C, Darden T, Pedersen L: Gene selection for sample classification based on gene expression data: study of sensitivity to choice of parameters of the GA\/KNN method. 
Bioinformatics 2001, 17(12):1131\u201342. 10.1093\/bioinformatics\/17.12.1131","journal-title":"Bioinformatics"},{"issue":"3","key":"564_CR7","doi-asserted-by":"publisher","first-page":"503","DOI":"10.1101\/gr.104003","volume":"13","author":"J Lyons-Weiler","year":"2003","unstructured":"Lyons-Weiler J, Patel S, Bhattacharya S: A classification-based machine learning approach for the analysis of genome-wide expression data. Genome Res 2003, 13(3):503\u2013512. 10.1101\/gr.104003","journal-title":"Genome Res"},{"key":"564_CR8","doi-asserted-by":"publisher","first-page":"262","DOI":"10.1073\/pnas.97.1.262","volume":"97","author":"M Brown","year":"2000","unstructured":"Brown M, Grundy W, Lin D, Cristianini N, Sugnet C, Furey T, Ares M, Haussler D: Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci USA 2000, 97: 262\u20137. 10.1073\/pnas.97.1.262","journal-title":"Proc Natl Acad Sci USA"},{"issue":"10","key":"564_CR9","doi-asserted-by":"publisher","first-page":"906","DOI":"10.1093\/bioinformatics\/16.10.906","volume":"16","author":"T Furey","year":"2000","unstructured":"Furey T, Cristianini N, Duffy N, Bednarski D, Schummer M, Haussler D: Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics 2000, 16(10):906\u201314. 10.1093\/bioinformatics\/16.10.906","journal-title":"Bioinformatics"},{"issue":"10","key":"564_CR10","doi-asserted-by":"publisher","first-page":"6567","DOI":"10.1073\/pnas.082099299","volume":"99","author":"R Tibshirani","year":"2002","unstructured":"Tibshirani R, Hastie T, Narasimhan B, Chu G: Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA 2002, 99(10):6567\u201372. 
[http:\/\/dx.doi.org\/10.1073\/pnas.082099299] 10.1073\/pnas.082099299","journal-title":"Proc Natl Acad Sci USA"},{"issue":"457","key":"564_CR11","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1198\/016214502753479248","volume":"97","author":"S Dudoit","year":"2002","unstructured":"Dudoit S, Fridlyand J, Speed TP: Comparison of discrimination methods for the classification of tumors using gene expression data. J Amer Stat Assoc 2002, 97(457):77\u201387. 10.1198\/016214502753479248","journal-title":"J Amer Stat Assoc"},{"key":"564_CR12","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1007\/978-1-4615-0873-1_11","volume-title":"Methods of microarray data analysis","author":"W Li","year":"2002","unstructured":"Li W, Yang Y: How many genes are needed for a discriminant microarray data analysis? In Methods of microarray data analysis. Edited by: Lin S, KF Johnson E. Kluwer Academic; 2002:137\u2013150."},{"issue":"2","key":"564_CR13","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1093\/bioinformatics\/bth469","volume":"21","author":"L Ein-Dor","year":"2005","unstructured":"Ein-Dor L, Kela I, Getz G, Givol D, Domany E: Outcome signature genes in breast cancer: is there a unique set? Bioinformatics 2005, 21(2):171\u2013178. 10.1093\/bioinformatics\/bth469","journal-title":"Bioinformatics"},{"key":"564_CR14","first-page":"403","volume":"8","author":"S Doledec","year":"1987","unstructured":"Doledec S, Chessel D: Rhytmes saisonniers et composantes stationnelles en milieu aquatique. I \u2013 Description d'un plan d'observation complet par projection de variables. 
Acta Oecologica Oecologia Generalis 1987, 8: 403\u2013426.","journal-title":"Acta Oecologica Oecologia Generalis"},{"issue":"12","key":"564_CR15","doi-asserted-by":"publisher","first-page":"1600","DOI":"10.1093\/bioinformatics\/18.12.1600","volume":"18","author":"AC Culhane","year":"2002","unstructured":"Culhane AC, Perri\u00e8re G, Considine EC, Cotter TG, Higgins DG: Between-group analysis of microarray data. Bioinformatics 2002, 18(12):1600\u20131608. 10.1093\/bioinformatics\/18.12.1600","journal-title":"Bioinformatics"},{"issue":"7","key":"564_CR16","doi-asserted-by":"publisher","first-page":"4168","DOI":"10.1073\/pnas.0230559100","volume":"100","author":"H Zhang","year":"2003","unstructured":"Zhang H, Yu CY, Singer B: Cell and tumor classification using gene expression data: construction of forests. Proc Natl Acad Sci USA 2003, 100(7):4168\u20134172. 10.1073\/pnas.0230559100","journal-title":"Proc Natl Acad Sci USA"},{"issue":"5","key":"564_CR17","doi-asserted-by":"publisher","first-page":"644","DOI":"10.1093\/bioinformatics\/btg462","volume":"20","author":"AV Antonov","year":"2004","unstructured":"Antonov AV, Tetko IV, Mader MT, Budczies J, Mewes HW: Optimization models for cancer classification: extracting gene interaction information from microarray expression data. Bioinformatics 2004, 20(5):644\u2013652. 10.1093\/bioinformatics\/btg462","journal-title":"Bioinformatics"},{"key":"564_CR18","first-page":"10","volume":"21","author":"RM Rutherford","year":"2004","unstructured":"Rutherford RM, Staedtler F, Kehren J, Chibout SD, Joos L, Tamm M, Gilmartin JJ, Brutsche MH: Functional genomics and prognosis in sarcoidosis\u2013the critical role of antigen presentation. 
Sarcoidosis Vasc Diffuse Lung Dis 2004, 21: 10\u201318.","journal-title":"Sarcoidosis Vasc Diffuse Lung Dis"},{"key":"564_CR19","unstructured":"Gene Expression Omnibus[http:\/\/www.ncbi.nlm.nih.gov\/geo\/]"},{"issue":"6","key":"564_CR20","doi-asserted-by":"publisher","first-page":"673","DOI":"10.1038\/89044","volume":"7","author":"J Khan","year":"2001","unstructured":"Khan J, Wei JS, Ringner M, Saal LH, Ladanyi M, Westermann F, Berthold F, Schwab M, Antonescu CR, Peterson C, Meltzer PS: Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med 2001, 7(6):673\u2013679. 10.1038\/89044","journal-title":"Nat Med"},{"key":"564_CR21","unstructured":"Small round blue cell tumours dataset[http:\/\/research.nhgri.nih.gov\/microarray\/Supplement]"},{"issue":"12","key":"564_CR22","doi-asserted-by":"publisher","first-page":"6745","DOI":"10.1073\/pnas.96.12.6745","volume":"96","author":"U Alon","year":"1999","unstructured":"Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ: Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA 1999, 96(12):6745\u20136750. 10.1073\/pnas.96.12.6745","journal-title":"Proc Natl Acad Sci USA"},{"key":"564_CR23","unstructured":"Colon cancer dataset[http:\/\/www.bioconductor.org\/packages\/data\/experiment\/stable\/src\/contrib\/html\/colonCA.html]"},{"issue":"5439","key":"564_CR24","doi-asserted-by":"publisher","first-page":"531","DOI":"10.1126\/science.286.5439.531","volume":"286","author":"TR Golub","year":"1999","unstructured":"Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999, 286(5439):531\u2013537. 
10.1126\/science.286.5439.531","journal-title":"Science"},{"key":"564_CR25","unstructured":"Leukemia dataset[http:\/\/www.bioconductor.org\/packages\/data\/experiment\/stable\/src\/contrib\/html\/golubEsets.html]"},{"key":"564_CR26","volume-title":"R: A language and environment for statistical computing","author":"R Development Core Team","year":"2004","unstructured":"R Development Core Team: R: A language and environment for statistical computing.R Foundation for Statistical Computing, Vienna, Austria; 2004. [http:\/\/www.R-project.org]"},{"issue":"10","key":"564_CR27","doi-asserted-by":"publisher","first-page":"R80","DOI":"10.1186\/gb-2004-5-10-r80","volume":"5","author":"RC Gentleman","year":"2004","unstructured":"Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JY, Zhang J: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004, 5(10):R80. 10.1186\/gb-2004-5-10-r80","journal-title":"Genome Biol"},{"key":"564_CR28","first-page":"5","volume":"4","author":"D Chessel","year":"2004","unstructured":"Chessel D, Dufour AB, Thioulouse J: The ade4 package \u2013 I: One-table methods. R News 2004, 4: 5\u201310. [http:\/\/cran.R-project.org\/doc\/Rnews\/]","journal-title":"R News"},{"issue":"11","key":"564_CR29","doi-asserted-by":"publisher","first-page":"2789","DOI":"10.1093\/bioinformatics\/bti394","volume":"21","author":"AC Culhane","year":"2005","unstructured":"Culhane AC, Thioulouse J, Perri\u00e8re G, Higgins DG: MADE4: an R package for multivariate analysis of gene expression data. Bioinformatics 2005, 21(11):2789\u20132790. 
10.1093\/bioinformatics\/bti394","journal-title":"Bioinformatics"},{"issue":"Suppl 1","key":"564_CR30","doi-asserted-by":"publisher","first-page":"S96","DOI":"10.1093\/bioinformatics\/18.suppl_1.S96","volume":"18","author":"W Huber","year":"2002","unstructured":"Huber W, von Heydebreck A, Sultmann H, Poustka A, Vingron M: Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 2002, 18(Suppl 1):S96\u2013104.","journal-title":"Bioinformatics"},{"issue":"19","key":"564_CR31","doi-asserted-by":"publisher","first-page":"10781","DOI":"10.1073\/pnas.181597298","volume":"98","author":"K Fellenberg","year":"2001","unstructured":"Fellenberg K, Hauser NC, Brors B, Neutzner A, Hoheisel JD, Vingron M: Correspondence analysis applied to microarray data. Proc Natl Acad Sci USA 2001, 98(19):10781\u201310786. 10.1073\/pnas.181597298","journal-title":"Proc Natl Acad Sci USA"},{"issue":"4","key":"564_CR32","doi-asserted-by":"publisher","first-page":"1131","DOI":"10.1111\/j.0006-341X.2003.00130.x","volume":"59","author":"L Wouters","year":"2003","unstructured":"Wouters L, Gohlmann HW, Bijnens L, Kass SU, Molenberghs G, Lewi PJ: Graphical exploration of gene expression data: a comparative study of three multivariate methods. Biometrics 2003, 59(4):1131\u20131139. 10.1111\/j.0006-341X.2003.00130.x","journal-title":"Biometrics"},{"issue":"2","key":"564_CR33","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1016\/S0169-2607(02)00011-1","volume":"70","author":"G Perri\u00e8re","year":"2003","unstructured":"Perri\u00e8re G, Thioulouse J: Use of correspondence discriminant analysis to predict the subcellular location of bacterial proteins. Comput Methods Programs Biomed 2003, 70(2):99\u2013105. 
10.1016\/S0169-2607(02)00011-1","journal-title":"Comput Methods Programs Biomed"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-6-239.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T21:39:27Z","timestamp":1706823567000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-6-239"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,9,28]]},"references-count":33,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2005,12]]}},"alternative-id":["564"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-6-239","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,9,28]]},"assertion":[{"value":"10 June 2005","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 September 2005","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 September 2005","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"239"}}