{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:03Z","timestamp":1772138043794,"version":"3.50.1"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"16","license":[{"start":{"date-parts":[[2021,2,20]],"date-time":"2021-02-20T00:00:00Z","timestamp":1613779200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"European Union\u2019s Horizon 2020 research and innovation programme"},{"name":"Marie Sk\u0142odowska-Curie","award":["765158"],"award-info":[{"award-number":["765158"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,8,25]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Cross-(multi)platform normalization of gene-expression microarray data remains an unresolved issue. Despite the existence of several algorithms, they are either constrained by the need to normalize all samples of all platforms together, compromising scalability and reuse, by adherence to the platforms of a specific provider, or simply by poor performance. In addition, many of the methods presented in the literature have not been specifically tested against multi-platform data and\/or other methods applicable in this context. Thus, we set out to develop a normalization algorithm appropriate for gene-expression studies based on multiple, potentially large microarray sets collected along multiple platforms and at different times, applicable in systematic studies aimed at extracting knowledge from the wealth of microarray data available in public repositories; for example, for the extraction of Real-World Data to complement data from Randomized Controlled Trials. Our main focus or criterion for performance was on the capacity of the algorithm to properly separate samples from different biological groups.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We present CuBlock, an algorithm addressing this objective, together with a strategy to validate cross-platform normalization methods. To validate the algorithm and benchmark it against existing methods, we used two distinct datasets, one specifically generated for testing and standardization purposes and one from an actual experimental study. Using these datasets, we benchmarked CuBlock against ComBat (Johnson et al., 2007), UPC (Piccolo et al., 2013), YuGene (L\u00ea Cao et al., 2014), DBNorm (Meng et al., 2017), Shambhala (Borisov et al., 2019) and a simple log2 transform as reference. We note that many other popular normalization methods are not applicable in this context. CuBlock was the only algorithm in this group that could always and clearly differentiate the underlying biological groups after mixing the data, from up to six different platforms in this study.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>CuBlock can be downloaded from https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/77882-cublock.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab105","type":"journal-article","created":{"date-parts":[[2021,2,16]],"date-time":"2021-02-16T16:32:16Z","timestamp":1613493136000},"page":"2365-2373","source":"Crossref","is-referenced-by-count":16,"title":["CuBlock: a cross-platform normalization method for gene-expression microarrays"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6138-0612","authenticated-orcid":false,"given":"Valentin","family":"Junet","sequence":"first","affiliation":[{"name":"Anaxomics Biotech SL , Barcelona 08008, Spain"},{"name":"Institute of Biotechnology and Biomedicine, Universitat Aut\u00f2noma de Barcelona , Barcelona 08193, Spain"}]},{"given":"Judith","family":"Farr\u00e9s","sequence":"additional","affiliation":[{"name":"Anaxomics Biotech SL , Barcelona 08008, Spain"}]},{"given":"Jos\u00e9 M","family":"Mas","sequence":"additional","affiliation":[{"name":"Anaxomics Biotech SL , Barcelona 08008, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9235-6730","authenticated-orcid":false,"given":"Xavier","family":"Daura","sequence":"additional","affiliation":[{"name":"Institute of Biotechnology and Biomedicine, Universitat Aut\u00f2noma de Barcelona , Barcelona 08193, Spain"},{"name":"Catalan Institution for Research and Advanced Studies (ICREA) , Barcelona 08010, Spain"}]}],"member":"286","published-online":{"date-parts":[[2021,2,20]]},"reference":[{"key":"2023051609130277100_btab105-B1","doi-asserted-by":"crossref","first-page":"e1912869","DOI":"10.1001\/jamanetworkopen.2019.12869","article-title":"Feasibility of using real-world data to replicate clinical trial evidence","volume":"2","author":"Bartlett","year":"2019","journal-title":"JAMA Netw. Open"},{"key":"2023051609130277100_btab105-B2","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1093\/bioinformatics\/btg385","article-title":"Adjustment of systematic microarray data biases","volume":"20","author":"Benito","year":"2004","journal-title":"Bioinformatics"},{"key":"2023051609130277100_btab105-B3","doi-asserted-by":"crossref","first-page":"1033","DOI":"10.1002\/pds.4297","article-title":"Good practices for real-world data studies of treatment and\/or comparative effectiveness: recommendations from the joint ISPOR-ISPE Special Task Force on real-world evidence in health care decision making","volume":"26","author":"Berger","year":"2017","journal-title":"Pharmacoepidemiol. Drug Saf"},{"key":"2023051609130277100_btab105-B4","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1186\/s12859-019-2641-8","article-title":"Shambhala: a platform-agnostic data harmonizer for gene expression data","volume":"20","author":"Borisov","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023051609130277100_btab105-B5","doi-asserted-by":"crossref","first-page":"e0177678","DOI":"10.1371\/journal.pone.0177678","article-title":"Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric","volume":"12","author":"Boughorbel","year":"2017","journal-title":"PLoS One"},{"key":"2023051609130277100_btab105-B6","doi-asserted-by":"crossref","first-page":"22.1.1","DOI":"10.1002\/0471142727.mb2201s101","article-title":"Overview of DNA microarrays: types, applications, and their future","volume":"101","author":"Bumgarner","year":"2013","journal-title":"Curr. Protoc. Mol. Biol"},{"key":"2023051609130277100_btab105-B7","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-vector networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. Learn"},{"key":"2023051609130277100_btab105-B8","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An introduction to ROC analysis","volume":"27","author":"Fawcett","year":"2006","journal-title":"Pattern Recogn. Lett"},{"key":"2023051609130277100_btab105-B9","year":"2018"},{"key":"2023051609130277100_btab105-B10","doi-asserted-by":"crossref","first-page":"1585","DOI":"10.1093\/bioinformatics\/18.12.1585","article-title":"Robust estimators for expression analysis","volume":"18","author":"Hubbell","year":"2002","journal-title":"Bioinformatics"},{"key":"2023051609130277100_btab105-B11","doi-asserted-by":"crossref","first-page":"e0194844","DOI":"10.1371\/journal.pone.0194844","article-title":"Integrative multi-platform meta-analysis of gene expression profiles in pancreatic ductal adenocarcinoma patients for identifying novel diagnostic biomarkers","volume":"13","author":"Irigoyen","year":"2018","journal-title":"PLoS One"},{"key":"2023051609130277100_btab105-B12","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1093\/biostatistics\/4.2.249","article-title":"Exploration, normalization, and summaries of high density oligonucleotide array probe level data","volume":"4","author":"Irizarry","year":"2003","journal-title":"Biostatistics"},{"key":"2023051609130277100_btab105-B13","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1093\/biostatistics\/kxj037","article-title":"Adjusting batch effects in microarray expression data using empirical Bayes methods","volume":"8","author":"Johnson","year":"2007","journal-title":"Biostatistics"},{"key":"2023051609130277100_btab105-B14","doi-asserted-by":"crossref","first-page":"13057","DOI":"10.1073\/pnas.94.24.13057","article-title":"Yeast microarrays for genome wide parallel genetic and gene expression analysis","volume":"94","author":"Lashkari","year":"1997","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051609130277100_btab105-B15","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1016\/j.ygeno.2014.03.001","article-title":"YuGene: a simple approach to scale gene expression data derived from different platforms for integrated analyses","volume":"103","author":"L\u00ea Cao","year":"2014","journal-title":"Genomics"},{"key":"2023051609130277100_btab105-B16","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1109\/TIT.1982.1056489","article-title":"Least squares quantization in PCM","volume":"28","author":"Lloyd","year":"1982","journal-title":"IEEE Trans. Inf. Theory"},{"key":"2023051609130277100_btab105-B17","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Maaten","year":"2008","journal-title":"J. Mach. Learn. Res"},{"key":"2023051609130277100_btab105-B18","doi-asserted-by":"crossref","first-page":"813","DOI":"10.1158\/0008-5472.CAN-12-2633","article-title":"Polo-like Kinase 1: a potential therapeutic option in combination with conventional chemotherapy for the management of patients with triple-negative breast cancer","volume":"73","author":"Maire","year":"2013","journal-title":"Cancer Res"},{"key":"2023051609130277100_btab105-B19","doi-asserted-by":"crossref","first-page":"e63712","DOI":"10.1371\/journal.pone.0063712","article-title":"TTK\/hMPS1 is an attractive therapeutic target for triple-negative breast cancer","volume":"8","author":"Maire","year":"2013","journal-title":"PLoS One"},{"key":"2023051609130277100_btab105-B20","doi-asserted-by":"crossref","first-page":"1151","DOI":"10.1038\/nbt1239","article-title":"The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements","volume":"24","year":"2006","journal-title":"Nat. Biotechnol"},{"key":"2023051609130277100_btab105-B21","doi-asserted-by":"crossref","first-page":"e0122333","DOI":"10.1371\/journal.pone.0122333","article-title":"Transcriptome analysis of Wnt3a-treated triple-negative breast cancer cells","volume":"10","author":"Maubant","year":"2015","journal-title":"PLoS One"},{"key":"2023051609130277100_btab105-B22","doi-asserted-by":"crossref","first-page":"527","DOI":"10.1186\/s12859-017-1912-5","article-title":"DBNorm: normalizing high-density oligonucleotide microarray data based on distributions","volume":"18","author":"Meng","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"2023051609130277100_btab105-B23","doi-asserted-by":"crossref","first-page":"1344","DOI":"10.1126\/science.1158441","article-title":"The transcriptional landscape of the yeast genome defined by RNA sequencing","volume":"320","author":"Nagalakshmi","year":"2008","journal-title":"Science"},{"key":"2023051609130277100_btab105-B24","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1016\/j.ygeno.2012.08.003","article-title":"A single-sample microarray normalization method to facilitate personalized-medicine workflows","volume":"100","author":"Piccolo","year":"2012","journal-title":"Genomics"},{"key":"2023051609130277100_btab105-B25","doi-asserted-by":"crossref","first-page":"17778","DOI":"10.1073\/pnas.1305823110","article-title":"Multiplatform single-sample estimates of transcriptional activation","volume":"110","author":"Piccolo","year":"2013","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051609130277100_btab105-B26","doi-asserted-by":"crossref","first-page":"763","DOI":"10.1093\/hmg\/ddm012","article-title":"Success and failure in human spermatogenesis as revealed by teratozoospermic RNAs","volume":"16","author":"Platts","year":"2007","journal-title":"Hum. Mol. Genet"},{"key":"2023051609130277100_btab105-B27","doi-asserted-by":"crossref","first-page":"R95","DOI":"10.1186\/gb-2013-14-9-r95","article-title":"Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data","volume":"14","author":"Rapaport","year":"2013","journal-title":"Genome Biol"},{"key":"2023051609130277100_btab105-B28","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: a graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"Rousseeuw","year":"1987","journal-title":"J. Comput. Appl. Math"},{"key":"2023051609130277100_btab105-B29","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1186\/1471-2105-12-467","article-title":"Empirical comparison of cross-platform normalization methods for gene expression data","volume":"12","author":"Rudy","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023051609130277100_btab105-B30","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1126\/science.270.5235.467","article-title":"Quantitative monitoring of gene expression patterns with a complementary DNA microarray","volume":"270","author":"Schena","year":"1995","journal-title":"Science"},{"key":"2023051609130277100_btab105-B31","doi-asserted-by":"crossref","first-page":"903","DOI":"10.1038\/nbt.2957","article-title":"A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium","volume":"32","year":"2014","journal-title":"Nat. Biotechnol"},{"key":"2023051609130277100_btab105-B32","doi-asserted-by":"crossref","first-page":"1154","DOI":"10.1093\/bioinformatics\/btn083","article-title":"Merging two gene-expression studies via cross-platform normalization","volume":"24","author":"Shabalin","year":"2008","journal-title":"Bioinformatics"},{"key":"2023051609130277100_btab105-B33","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1038\/nrd.2017.25","article-title":"Accelerating development of scientific evidence for medical products within the existing US regulatory framework","volume":"16","author":"Sherman","year":"2017","journal-title":"Nat. Rev. Drug Discov"},{"key":"2023051609130277100_btab105-B34","doi-asserted-by":"crossref","first-page":"2616","DOI":"10.1214\/009053604000000823","article-title":"Approximately unbiased tests of regions using multistep-multiscale bootstrap resampling","volume":"32","author":"Shimodaira","year":"2004","journal-title":"Ann. Stat"},{"key":"2023051609130277100_btab105-B35","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1111\/1467-9868.00346","article-title":"A direct approach to false discovery rates","volume":"64","author":"Storey","year":"2002","journal-title":"J. R. Stat. Soc. B"},{"key":"2023051609130277100_btab105-B36","doi-asserted-by":"crossref","first-page":"1540","DOI":"10.1093\/bioinformatics\/btl117","article-title":"Pvclust: an R package for assessing the uncertainty in hierarchical clustering","volume":"22","author":"Suzuki","year":"2006","journal-title":"Bioinformatics"},{"key":"2023051609130277100_btab105-B37","first-page":"1","article-title":"Discrepancies between observational studies and randomized controlled trials","volume":"73","author":"Trotta","year":"2012","journal-title":"Focus Farmacovigilanza"},{"key":"2023051609130277100_btab105-B38","doi-asserted-by":"crossref","first-page":"389","DOI":"10.3390\/microarrays4030389","article-title":"Microarray meta-analysis and cross-platform normalization: integrative genomics for robust biomarker discovery","volume":"4","author":"Walsh","year":"2015","journal-title":"Microarrays"},{"key":"2023051609130277100_btab105-B39","doi-asserted-by":"crossref","first-page":"e15","DOI":"10.1093\/nar\/30.4.e15","article-title":"Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation","volume":"30","author":"Yang","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023051609130277100_btab105-B40","doi-asserted-by":"crossref","first-page":"2486","DOI":"10.1093\/bioinformatics\/btz974","article-title":"MatchMixeR: a cross-platform normalization method for gene expression data integration","volume":"36","author":"Zhang","year":"2020","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab105\/36416698\/btab105.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/16\/2365\/50339370\/btab105.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/16\/2365\/50339370\/btab105.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T05:16:31Z","timestamp":1684214191000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/16\/2365\/6145566"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,2,20]]},"references-count":40,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2021,8,25]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab105","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.10.29.360198","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,8,15]]},"published":{"date-parts":[[2021,2,20]]}}}