{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T20:14:27Z","timestamp":1760645667554},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Reproducibility analyses of biologically relevant microarray studies have mostly focused on overlap of detected biomarkers or correlation of differential expression evidences across studies. For clinical utility, direct inter-study prediction (i.e. to establish a prediction model in one study and apply to another) for disease diagnosis or prognosis prediction is more important. Normalization plays a key role for such a task. Traditionally, sample-wise normalization has been a standard for inter-array and inter-study normalization. For gene-wise normalization, it has been implemented for intra-study or inter-study predictions in a few papers while its rationale, strategy and effect remain unexplored.<\/jats:p>\n               <jats:p>Results: In this article, we investigate the effect of gene-wise normalization in microarray inter-study prediction. Gene-specific intensity discrepancies across studies are commonly found even after proper sample-wise normalization. We explore the rationale and necessity of gene-wise normalization. We also show that the ratio of sample sizes in normal versus diseased groups can greatly affect the performance of gene-wise normalization and an analytical method is developed to adjust for the imbalanced ratio effect. Both simulation results and applications to three lung cancer and two prostate cancer data sets, considering both binary classification and survival risk predictions, showed significant and robust improvement of the new adjustment. 
A calibration scheme is developed to apply the ratio-adjusted gene-wise normalization for prospective clinical trials. The number of calibration samples needed is estimated from existing studies and suggested for future applications. The result has important implication to the translational research of microarray as a practical disease diagnosis and prognosis prediction tool.<\/jats:p>\n               <jats:p>Contact: \u00a0ctseng@pitt.edu<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/www.biostat.pitt.edu\/bioinfo\/<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp292","type":"journal-article","created":{"date-parts":[[2009,5,5]],"date-time":"2009-05-05T01:06:55Z","timestamp":1241485615000},"page":"1655-1661","source":"Crossref","is-referenced-by-count":16,"title":["Ratio adjustment and calibration scheme for gene-wise normalization to enhance microarray inter-study prediction"],"prefix":"10.1093","volume":"25","author":[{"given":"Chunrong","family":"Cheng","sequence":"first","affiliation":[{"name":"1 Department of Biostatistics, 2Department of Computational Biology, 3Department of Pathology and 4Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, USA"}]},{"given":"Kui","family":"Shen","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics, 2Department of Computational Biology, 3Department of Pathology and 4Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, USA"}]},{"given":"Chi","family":"Song","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics, 2Department of Computational Biology, 3Department of Pathology and 4Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, USA"}]},{"given":"Jianhua","family":"Luo","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics, 2Department of Computational 
Biology, 3Department of Pathology and 4Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, USA"}]},{"given":"George C.","family":"Tseng","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics, 2Department of Computational Biology, 3Department of Pathology and 4Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, USA"}]}],"member":"286","published-online":{"date-parts":[[2009,5,4]]},"reference":[{"key":"2023013112135835200_B1","doi-asserted-by":"crossref","first-page":"10101","DOI":"10.1073\/pnas.97.18.10101","article-title":"Singular value decomposition for genome-wide expression data processing and modeling","volume":"97","author":"Alter","year":"2000","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013112135835200_B2","doi-asserted-by":"crossref","first-page":"E108","DOI":"10.1371\/journal.pbio.0020108","article-title":"Semi-supervised methods to predict patient survival from gene expression data","volume":"2","author":"Bair","year":"2004","journal-title":"PLoS Biol."},{"key":"2023013112135835200_B3","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1038\/nm733","article-title":"Gene-expression profiles predict survival of patients with lung adenocarcinoma","volume":"8","author":"Beer","year":"2002","journal-title":"Nat. 
Med."},{"key":"2023013112135835200_B4","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1093\/bioinformatics\/btg385","article-title":"Adjustment of systematic microarray data biases","volume":"20","author":"Benito","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013112135835200_B5","doi-asserted-by":"crossref","first-page":"13790","DOI":"10.1073\/pnas.191502998","article-title":"Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses","volume":"98","author":"Bhattacharjee","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013112135835200_B6","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1016\/S0002-9440(10)63090-8","article-title":"Multi-platform, multi-site, microarray-based human tumor classification","volume":"164","author":"Bloom","year":"2004","journal-title":"Am. J. Pathol."},{"issue":"Suppl. 1","key":"2023013112135835200_B7","doi-asserted-by":"crossref","first-page":"S5","DOI":"10.1186\/1471-2105-8-S1-S5","article-title":"Cross platform microarray analysis for robust identification of differentially expressed genes","volume":"8","author":"Bosotti","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023013112135835200_B8","doi-asserted-by":"crossref","first-page":"R27","DOI":"10.1186\/gb-2003-4-4-r27","article-title":"MatchMiner: a tool for batch navigation among gene and gene product identifiers","volume":"4","author":"Bussey","year":"2003","journal-title":"Genome Biol."},{"key":"2023013112135835200_B9","doi-asserted-by":"crossref","DOI":"10.2202\/1544-6115.1204","article-title":"Combining results of microarray experiments: a rank aggregation approach","volume":"5","author":"DeConde","year":"2006","journal-title":"Stat. Appl. Genet. Mol. 
Biol."},{"key":"2023013112135835200_B10","doi-asserted-by":"crossref","first-page":"822","DOI":"10.1038\/35090585","article-title":"Delineation of prognostic biomarkers in prostate cancer","volume":"412","author":"Dhanasekaran","year":"2001","journal-title":"Nature"},{"key":"2023013112135835200_B11","doi-asserted-by":"crossref","first-page":"13784","DOI":"10.1073\/pnas.241500798","article-title":"Diversity of gene expression in adenocarcinoma of the lung","volume":"98","author":"Garber","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013112135835200_B12","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1126\/science.286.5439.531","article-title":"Molecular classification of cancer: class discovery and class prediction by gene expression monitoring","volume":"286","author":"Golub","year":"1999","journal-title":"Science"},{"key":"2023013112135835200_B13","doi-asserted-by":"crossref","first-page":"2543","DOI":"10.1001\/jama.1982.03320430047030","article-title":"Evaluating the yield of medical tests","volume":"247","author":"Harrel","year":"1982","journal-title":"JAMA"},{"key":"2023013112135835200_B14","doi-asserted-by":"crossref","first-page":"e15","DOI":"10.1093\/nar\/gng015","article-title":"Summaries of Affymetrix GeneChip probe level data","volume":"31","author":"Irizarry","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023013112135835200_B15","doi-asserted-by":"crossref","first-page":"789","DOI":"10.1093\/bioinformatics\/btk046","article-title":"Comparison of Affymetrix GeneChip expression measures","volume":"22","author":"Irizarry","year":"2006","journal-title":"Bioinformatics"},{"key":"2023013112135835200_B16","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1186\/1471-2105-5-81","article-title":"Joint analysis of two microarray gene-expression data sets to select lung adenocarcinoma marker genes","volume":"5","author":"Jiang","year":"2004","journal-title":"BMC 
Bioinformatics"},{"key":"2023013112135835200_B17","doi-asserted-by":"crossref","first-page":"832","DOI":"10.1038\/nbt1217","article-title":"A sequence-oriented comparison of gene expression measurements across different hybridization-based technologies","volume":"24","author":"Kuo","year":"2006","journal-title":"Nat. Biotechnol."},{"key":"2023013112135835200_B18","doi-asserted-by":"crossref","first-page":"570","DOI":"10.1016\/j.jbi.2007.11.005","article-title":"Cross-generation and cross-laboratory predictions of Affymetrix microarrays by rank-based methods","volume":"41","author":"Liu","year":"2008","journal-title":"J. Biomed. Inform."},{"key":"2023013112135835200_B19","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1186\/1471-2164-5-71","article-title":"Inter-platform comparability of microarrays in acute lymphoblastic leukemia","volume":"5","author":"Mitchell","year":"2004","journal-title":"BMC Genomics"},{"key":"2023013112135835200_B20","doi-asserted-by":"crossref","first-page":"2922","DOI":"10.1158\/1078-0432.CCR-03-0490","article-title":"A cross-study comparison of gene expression studies for the molecular classification of lung cancer","volume":"10","author":"Parmigiani","year":"2004","journal-title":"Clin. Cancer. 
Res."},{"key":"2023013112135835200_B21","doi-asserted-by":"crossref","first-page":"1154","DOI":"10.1093\/bioinformatics\/btn083","article-title":"Merging two gene-expression studies via cross-platform normalization","volume":"24","author":"Shabalin","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013112135835200_B22","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1186\/1471-2164-5-94","article-title":"Prognostic meta-signature of breast cancer developed by two-stage mixture modeling of microarray data","volume":"5","author":"Shen","year":"2004","journal-title":"BMC Genomics"},{"key":"2023013112135835200_B23","doi-asserted-by":"crossref","first-page":"1151","DOI":"10.1038\/nbt1239","article-title":"The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements","volume":"24","author":"Shi","year":"2006","journal-title":"Nat. Biotechnol."},{"key":"2023013112135835200_B24","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nm0102-68","article-title":"Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning","volume":"8","author":"Shipp","year":"2002","journal-title":"Nat. Med."},{"key":"2023013112135835200_B25","doi-asserted-by":"crossref","first-page":"5676","DOI":"10.1093\/nar\/gkg763","article-title":"Evaluation of gene expression measurements from commercial microarray platforms","volume":"31","author":"Tan","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023013112135835200_B26","article-title":"A statistical framework to infer functional gene associations from multiple biologically interrelated microarray experiments","author":"Teng","year":"2008","journal-title":"J. Am. Stat. 
Assoc."},{"key":"2023013112135835200_B27","doi-asserted-by":"crossref","first-page":"2549","DOI":"10.1093\/nar\/29.12.2549","article-title":"Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects","volume":"29","author":"Tseng","year":"2001","journal-title":"Nucleic Acids Res."},{"key":"2023013112135835200_B28","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1038\/415530a","article-title":"Gene expression profiling predicts clinical outcome of breast cancer","volume":"415","author":"van't Veer","year":"2002","journal-title":"Nature"},{"key":"2023013112135835200_B29","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1186\/1471-2105-6-265","article-title":"Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes","volume":"6","author":"Warnat","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023013112135835200_B30","first-page":"5974","article-title":"Analysis of gene expression identifies candidate markers and pharmacological targets in prostate cancer","volume":"61","author":"Welsh","year":"2001","journal-title":"Cancer Res."},{"key":"2023013112135835200_B31","doi-asserted-by":"crossref","first-page":"3905","DOI":"10.1093\/bioinformatics\/bti647","article-title":"Robust prostate cancer marker genes emerge from direct integration of inter-study microarray data","volume":"21","author":"Xu","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013112135835200_B32","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1186\/1471-2105-9-125","article-title":"Merging microarray data from separate breast cancer studies provides a robust prognostic test","volume":"9","author":"Xu","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023013112135835200_B33","doi-asserted-by":"crossref","first-page":"e15","DOI":"10.1093\/nar\/30.4.e15","article-title":"Normalization for cDNA microarray data: a robust 
composite method addressing single and multiple slide systematic variation","volume":"30","author":"Yang","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"2023013112135835200_B34","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1002\/em.20290","article-title":"Review of the literature examining the correlation among DNA microarray technologies","volume":"48","author":"Yauk","year":"2007","journal-title":"Environ. Mol. Mutagen."},{"key":"2023013112135835200_B35","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1002\/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3","article-title":"Index for rating diagnostic tests","volume":"3","author":"Youden","year":"1950","journal-title":"Cancer"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/13\/1655\/48996525\/bioinformatics_25_13_1655.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/13\/1655\/48996525\/bioinformatics_25_13_1655.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T21:44:00Z","timestamp":1675201440000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/13\/1655\/196784"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,5,4]]},"references-count":35,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2009,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp292","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,7,1]]},"published":{"date-parts":[[2009,5,4]]}}}