{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T14:28:41Z","timestamp":1768832921704,"version":"3.49.0"},"reference-count":18,"publisher":"Oxford University Press (OUP)","issue":"23","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2005,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Missing values are problematic for the analysis of microarray data. Imputation methods have been compared in terms of the similarity between imputed and true values in simulation experiments and not of their influence on the final analysis. The focus has been on missing at random, while entries are missing also not at random.<\/jats:p>\n               <jats:p>Results: We investigate the influence of imputation on the detection of differentially expressed genes from cDNA microarray data. We apply ANOVA for microarrays and SAM and look to the differentially expressed genes that are lost because of imputation. We show that this new measure provides useful information that the traditional root mean squared error cannot capture. We also show that the type of missingness matters: imputing 5% missing not at random has the same effect as imputing 10\u201330% missing at random. We propose a new method for imputation (LinImp), fitting a simple linear model for each channel separately, and compare it with the widely used KNNimpute method. For 10% missing at random, KNNimpute leads to twice as many lost differentially expressed genes as LinImp.<\/jats:p>\n               <jats:p>Availability: The R package for LinImp is available at<\/jats:p>\n               <jats:p>Contact: \u00a0idasch@math.uio.no<\/jats:p>\n               <jats:p>Supplementary information: \u00a0<\/jats:p>","DOI":"10.1093\/bioinformatics\/bti708","type":"journal-article","created":{"date-parts":[[2005,10,11]],"date-time":"2005-10-11T00:23:46Z","timestamp":1128990226000},"page":"4272-4279","source":"Crossref","is-referenced-by-count":51,"title":["The influence of missing value imputation on detection of differentially expressed genes from microarray data"],"prefix":"10.1093","volume":"21","author":[{"given":"Ida","family":"Scheel","sequence":"first","affiliation":[]},{"given":"Magne","family":"Aldrin","sequence":"additional","affiliation":[]},{"given":"Ingrid K.","family":"Glad","sequence":"additional","affiliation":[]},{"given":"Ragnhild","family":"S\u00f8rum","sequence":"additional","affiliation":[]},{"given":"Heidi","family":"Lyng","sequence":"additional","affiliation":[]},{"given":"Arnoldo","family":"Frigessi","sequence":"additional","affiliation":[]}],"member":"286","published-online":{"date-parts":[[2005,10,10]]},"reference":[{"key":"2023061010032586400_b1","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1186\/1471-2105-5-114","article-title":"Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering","volume":"5","author":"de Brevern","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023061010032586400_b2","doi-asserted-by":"crossref","first-page":"e34","DOI":"10.1093\/nar\/gnh026","article-title":"LSimpute: accurate estimation of missing values in microarray data with least squares methods","volume":"32","author":"B\u00f8","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023061010032586400_b3","doi-asserted-by":"crossref","first-page":"10","DOI":"10.2202\/1544-6115.1120","article-title":"Prediction of missing values in microarray and use of mixed models to evaluate the predictors","volume":"4","author":"Feten","year":"2005","journal-title":"Stat. Appl. Genet. Mol. Biol."},{"key":"2023061010032586400_b4","doi-asserted-by":"crossref","first-page":"2987","DOI":"10.1091\/mbc.12.10.2987","article-title":"Genomic expression responses to DNA-damaging agents and the regulatory role of the yeast ATR homolog Mec1p","volume":"12","author":"Gasch","year":"2001","journal-title":"Mol. Biol. Cell"},{"key":"2023061010032586400_b5","doi-asserted-by":"crossref","first-page":"819","DOI":"10.1089\/10665270050514954","article-title":"Analysis of variance for gene expression microarray data","volume":"7","author":"Kerr","year":"2000","journal-title":"J. Comput. Biol."},{"key":"2023061010032586400_b6","first-page":"203","article-title":"Statistical analysis of a gene expression microarray experiment with replication","volume":"12","author":"Kerr","year":"2002","journal-title":"Stat. Sinica"},{"key":"2023061010032586400_b7","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1093\/bioinformatics\/bth499","article-title":"Missing value estimation for DNA microarray gene expression data: local least squares imputation","volume":"21","author":"Kim","year":"2005","journal-title":"Bioinformatics"},{"key":"2023061010032586400_b8","volume-title":"Analysis of Microarray Gene Expression Data","author":"Lee","year":"2004"},{"key":"2023061010032586400_b9","doi-asserted-by":"crossref","first-page":"347","DOI":"10.6339\/JDS.2004.02(4).170","article-title":"Evaluation of missing value estimation for microarray data","volume":"2","author":"Nguyen","year":"2004","journal-title":"J. Data Sci."},{"key":"2023061010032586400_b10","doi-asserted-by":"crossref","first-page":"2088","DOI":"10.1093\/bioinformatics\/btg287","article-title":"A Bayesian missing value estimation method for gene expression profile data","volume":"19","author":"Oba","year":"2003","journal-title":"Bioinformatics"},{"key":"2023061010032586400_b11","doi-asserted-by":"crossref","first-page":"917","DOI":"10.1093\/bioinformatics\/bth007","article-title":"Gaussian mixture clustering and imputation of microarray data","volume":"20","author":"Ouyang","year":"2004","journal-title":"Bioinformatics"},{"key":"2023061010032586400_b12","first-page":"275","volume-title":"Data Preparation for Data Mining","author":"Pyle","year":"1999"},{"key":"2023061010032586400_b13","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1214\/ss\/1056397486","article-title":"Statistical challenges in functional genomics","volume":"18","author":"Sebastiani","year":"2003","journal-title":"Stat. Sci."},{"key":"2023061010032586400_b14","doi-asserted-by":"crossref","first-page":"6567","DOI":"10.1073\/pnas.082099299","article-title":"Diagnosis of multiple cancer types by shrunken centroids of gene expression","volume":"99","author":"Tibshirani","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023061010032586400_b15","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1093\/bioinformatics\/17.6.520","article-title":"Missing value estimation methods for cDNA microarrays","volume":"17","author":"Troyanskaya","year":"2001","journal-title":"Bioinformatics"},{"key":"2023061010032586400_b16","doi-asserted-by":"crossref","first-page":"5116","DOI":"10.1073\/pnas.091062498","article-title":"Significance analysis of microarrays applied to the ionizing radiation response","volume":"98","author":"Tusher","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023061010032586400_b17","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1002\/0470011084","volume-title":"Statistics for Microarrays: Design, Analysis and Inference","author":"Wit","year":"2004"},{"key":"2023061010032586400_b18","doi-asserted-by":"crossref","first-page":"2302","DOI":"10.1093\/bioinformatics\/btg323","article-title":"Missing-value estimation using linear and non-linear regression with Bayesian gene selection","volume":"19","author":"Zhou","year":"2003","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/23\/4272\/50567301\/bioinformatics_21_23_4272.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/21\/23\/4272\/50567301\/bioinformatics_21_23_4272.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,10]],"date-time":"2023-06-10T10:03:47Z","timestamp":1686391427000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/21\/23\/4272\/195443"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,10,10]]},"references-count":18,"journal-issue":{"issue":"23","published-print":{"date-parts":[[2005,12,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bti708","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2005,12]]},"published":{"date-parts":[[2005,10,10]]}}}