{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T23:00:59Z","timestamp":1773270059096,"version":"3.50.1"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"23","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2006,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Microarrays datasets frequently contain a large number of missing values (MVs), which need to be estimated and replaced for subsequent data mining. The focus of the paper is to study the effects of different MV treatments for cDNA microarray data on disease classification analysis.<\/jats:p>\n               <jats:p>Results: By analyzing five datasets, we demonstrate that among three kinds of classifiers evaluated in this study, support vector machine (SVM) classifiers are robust to varied MV imputation methods [e.g. replacing MVs by zero, K nearest-neighbor (KNN) imputation algorithm, local least square imputation and Bayesian principal component analysis], while the classification and regression tree classifiers are sensitive in terms of classification accuracy. The KNN classifiers built on differentially expressed genes (DEGs) are robust to the varied MV treatments, but the performances of the KNN classifiers based on all measured genes can be significantly deteriorated when imputing MVs for genes with larger missing rate (MR) (e.g. MR &gt; 5%). Generally, while replacing MVs by zero performs relatively poor, the other imputation algorithms have little difference in affecting classification performances of the SVM or KNN classifiers. We further demonstrate the power and feasibility of our recently proposed functional expression profile (FEP) approach as means to handle microarray data with MVs. 
The FEPs, which are derived from the functional modules that are enriched with sets of DEGs and thus can be consistently identified under varied MV treatments, achieve precise disease classification with better biological interpretation. We conclude that the choice of MV treatments should be determined in context of the later approaches used for disease classification. The suggested exclusion criterion of ignoring the genes with larger MR (e.g. &gt;5%), while justifiable for some classifiers such as KNN classifiers, might not be considered as a general rule for all classifiers.<\/jats:p>\n               <jats:p>Contact: guoz@ems.hrbmu.edu.cn; yangbf@ems.hrbmu.edu.cn<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btl339","type":"journal-article","created":{"date-parts":[[2006,6,30]],"date-time":"2006-06-30T14:27:07Z","timestamp":1151677627000},"page":"2883-2889","source":"Crossref","is-referenced-by-count":30,"title":["Effects of replacing the unreliable cDNA microarray measurements on the disease classification based on gene expression profiles and functional modules"],"prefix":"10.1093","volume":"22","author":[{"given":"Dong","family":"Wang","sequence":"first","affiliation":[{"name":"Department of Bioinformatics and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China"}]},{"given":"Yingli","family":"Lv","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China"}]},{"given":"Zheng","family":"Guo","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, 
Harbin 150086, China"},{"name":"Department of Pharmacology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China"}]},{"given":"Xia","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China"}]},{"given":"Yanhui","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China"}]},{"given":"Jing","family":"Zhu","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China"}]},{"given":"Da","family":"Yang","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China"}]},{"given":"Jianzhen","family":"Xu","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China"}]},{"given":"Chenguang","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China"}]},{"given":"Shaoqi","family":"Rao","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, 
Harbin 150086, China"},{"name":"Department of Molecular Cardiology, The Cleveland Clinic Foundation, 9500 Euclid Avenue, Cleveland, Ohio 44195, USA"},{"name":"Department of Cardiovascular Medicine, The Cleveland Clinic Foundation, 9500 Euclid Avenue, Cleveland, Ohio 44195, USA"}]},{"given":"Baofeng","family":"Yang","sequence":"additional","affiliation":[{"name":"Department of Pharmacology and Bio-pharmaceutical Key Laboratory of Heilongjiang Province and State, Harbin Medical University, Harbin 150086, China"}]}],"member":"286","published-online":{"date-parts":[[2006,6,29]]},"reference":[{"key":"2023012409222232700_b1","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1038\/35000501","article-title":"Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling","volume":"403","author":"Alizadeh","year":"2000","journal-title":"Nature"},{"key":"2023012409222232700_b2","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The Gene Ontology Consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. 
Genet."},{"key":"2023012409222232700_b3","doi-asserted-by":"crossref","first-page":"55","DOI":"10.2174\/157489306775330615","article-title":"Gene expression profile classification: a review","volume":"1","author":"Asyali","year":"2006","journal-title":"Current Bioinform."},{"key":"2023012409222232700_b4","doi-asserted-by":"crossref","first-page":"682","DOI":"10.1093\/bioinformatics\/btg468","article-title":"Degrees of differential gene expression: detecting biologically significant expression differences and estimating their magnitudes","volume":"20","author":"Bickel","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012409222232700_b5","doi-asserted-by":"crossref","first-page":"RESEARCH0017","DOI":"10.1186\/gb-2002-3-4-research0017","article-title":"New feature subset selection procedures for classification of expression profiles","volume":"3","author":"Bo","year":"2002","journal-title":"Genome Biol."},{"key":"2023012409222232700_b6","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1093\/bioinformatics\/btg399","article-title":"Is cross-validation better than resubstitution for ranking genes?","volume":"20","author":"Braga-Neto","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012409222232700_b7","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1186\/1471-2105-5-34","article-title":"Iterative Group Analysis (iGA): a simple tool to enhance sensitivity and facilitate interpretation of microarray experiments","volume":"5","author":"Breitling","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023012409222232700_b8","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1002\/cncr.20727","article-title":"Prostate carcinoma incidence in relation to prediagnostic circulating levels of insulin-like growth factor I, insulin-like growth factor binding protein 3, and 
insulin","volume":"103","author":"Chen","year":"2005","journal-title":"Cancer"},{"key":"2023012409222232700_b9","doi-asserted-by":"crossref","first-page":"3208","DOI":"10.1091\/mbc.e02-12-0833","article-title":"Variation in gene expression patterns in human gastric cancers","volume":"14","author":"Chen","year":"2003","journal-title":"Mol. Biol. Cell."},{"key":"2023012409222232700_b10","doi-asserted-by":"crossref","first-page":"1198","DOI":"10.1038\/modpathol.3800167","article-title":"Novel endothelial cell markers in hepatocellular carcinoma","volume":"17","author":"Chen","year":"2004","journal-title":"Mod. Pathol."},{"key":"2023012409222232700_b11","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1186\/1471-2105-5-114","article-title":"Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering","volume":"5","author":"de Brevern","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023012409222232700_b12","first-page":"98","article-title":"Global functional profiling of gene expression","volume":"81","author":"Draghici","year":"2003","journal-title":"Genomics"},{"key":"2023012409222232700_b13","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1198\/016214502753479248","article-title":"Comparison of discrimination methods for the classification of tumors using gene expression data","volume":"97","author":"Dudoit","year":"2002","journal-title":"J. Am. Stat. 
Assoc."},{"key":"2023012409222232700_b14","doi-asserted-by":"crossref","first-page":"906","DOI":"10.1093\/bioinformatics\/16.10.906","article-title":"Support vector machine classification and validation of cancer tissue samples using microarray expression data","volume":"16","author":"Furey","year":"2000","journal-title":"Bioinformatics"},{"key":"2023012409222232700_b15","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1186\/1471-2105-6-58","article-title":"Towards precise classification of cancers based on robust gene functional expression profiles","volume":"6","author":"Guo","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023012409222232700_b16","doi-asserted-by":"crossref","first-page":"C47","DOI":"10.1038\/35011540","article-title":"From molecular to modular cell biology","volume":"402","author":"Hartwell","year":"1999","journal-title":"Nature"},{"key":"2023012409222232700_b17","doi-asserted-by":"crossref","first-page":"R70","DOI":"10.1186\/gb-2003-4-10-r70","article-title":"Identifying biological themes within lists of genes with EASE","volume":"4","author":"Hosack","year":"2003","journal-title":"Genome Biol."},{"key":"2023012409222232700_b18","doi-asserted-by":"crossref","first-page":"4155","DOI":"10.1093\/bioinformatics\/bti638","article-title":"DNA microarray data imputation and significance analysis of differential expression","volume":"21","author":"Jornsten","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012409222232700_b19","doi-asserted-by":"crossref","first-page":"849","DOI":"10.1053\/hupa.2001.26463","article-title":"Co-downregulation of cell adhesion proteins alpha- and beta-catenins, p120CTN, E-cadherin, and CD44 in prostatic adenocarcinomas","volume":"32","author":"Kallakury","year":"2001","journal-title":"Hum. 
Pathol."},{"key":"2023012409222232700_b20","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1093\/bioinformatics\/bth499","article-title":"Missing value estimation for DNA microarray gene expression data: local least squares imputation","volume":"21","author":"Kim","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012409222232700_b21","doi-asserted-by":"crossref","first-page":"709","DOI":"10.1677\/erc.1.00535","article-title":"The role of fibroblast growth factors and their receptors in prostate cancer","volume":"11","author":"Kwabi-Addo","year":"2004","journal-title":"Endocr. Relat. Cancer"},{"key":"2023012409222232700_b22","doi-asserted-by":"crossref","first-page":"811","DOI":"10.1073\/pnas.0304146101","article-title":"Gene expression profiling identifies clinically relevant subtypes of prostate cancer","volume":"101","author":"Lapointe","year":"2004","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409222232700_b23","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1016\/j.ygeno.2004.09.007","article-title":"A robust hybrid between genetic algorithm and support vector machine for extracting an optimal feature gene subset","volume":"85","author":"Li","year":"2005","journal-title":"Genomics"},{"key":"2023012409222232700_b24","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1016\/j.canlet.2004.01.022","article-title":"Gene expression based classification of gastric carcinoma","volume":"210","author":"Norsett","year":"2004","journal-title":"Cancer Lett."},{"key":"2023012409222232700_b25","doi-asserted-by":"crossref","first-page":"2088","DOI":"10.1093\/bioinformatics\/btg287","article-title":"A Bayesian missing value estimation method for gene expression profile data","volume":"19","author":"Oba","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012409222232700_b26","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1186\/1471-2105-5-124","article-title":"Handling multiple testing while interpreting 
microarrays with the Gene Ontology Database","volume":"5","author":"Osier","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023012409222232700_b27","doi-asserted-by":"crossref","first-page":"8961","DOI":"10.1073\/pnas.0502674102","article-title":"Effects of threshold choice on biological conclusions reached during analysis of gene expression by DNA microarrays","volume":"102","author":"Pan","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409222232700_b28","doi-asserted-by":"crossref","first-page":"4272","DOI":"10.1093\/bioinformatics\/bti708","article-title":"The influence of missing value imputation on detection of differentially expressed genes from microarray data","volume":"21","author":"Scheel","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012409222232700_b29","doi-asserted-by":"crossref","first-page":"1090","DOI":"10.1038\/ng1434","article-title":"A module map showing conditional activity of expression modules in cancer","volume":"36","author":"Segal","year":"2004","journal-title":"Nat. Genet."},{"key":"2023012409222232700_b30","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1093\/jnci\/95.1.14","article-title":"Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification","volume":"95","author":"Simon","year":"2003","journal-title":"J. Natl Cancer Inst."},{"key":"2023012409222232700_b31","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1093\/bioinformatics\/17.6.520","article-title":"Missing value estimation methods for DNA microarrays","volume":"17","author":"Troyanskaya","year":"2001","journal-title":"Bioinformatics"},{"key":"2023012409222232700_b32","doi-asserted-by":"crossref","first-page":"5116","DOI":"10.1073\/pnas.091062498","article-title":"Significance analysis of microarrays applied to the ionizing radiation response","volume":"98","author":"Tusher","year":"2001","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012409222232700_b33","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1186\/1471-2105-6-265","article-title":"Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes","volume":"6","author":"Warnat","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023012409222232700_b34","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1002\/jcp.1041640319","article-title":"Interaction of CD44 variant isoforms with hyaluronic acid and the cytoskeleton in human prostate cancer cells","volume":"164","author":"Welsh","year":"1995","journal-title":"J. Cell. Physiol."},{"key":"2023012409222232700_b35","doi-asserted-by":"crossref","first-page":"4168","DOI":"10.1073\/pnas.0230559100","article-title":"Cell and tumor classification using gene expression data: construction of forests","volume":"100","author":"Zhang","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409222232700_b36","doi-asserted-by":"crossref","first-page":"2523","DOI":"10.1091\/mbc.e03-11-0786","article-title":"Different gene expression patterns in invasive lobular and ductal carcinomas of the breast","volume":"15","author":"Zhao","year":"2004","journal-title":"Mol. Biol. 
Cell"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/23\/2883\/48842746\/bioinformatics_22_23_2883.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/23\/2883\/48842746\/bioinformatics_22_23_2883.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T10:04:00Z","timestamp":1674554640000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/22\/23\/2883\/277966"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,6,29]]},"references-count":36,"journal-issue":{"issue":"23","published-print":{"date-parts":[[2006,12,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btl339","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2006,12,1]]},"published":{"date-parts":[[2006,6,29]]}}}