{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:22Z","timestamp":1772138062354,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2023,10,6]],"date-time":"2023-10-06T00:00:00Z","timestamp":1696550400000},"content-version":"vor","delay-in-days":5,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Centre for Molecular Medicine Norway","award":["NCMM, 187615"],"award-info":[{"award-number":["NCMM, 187615"]}]},{"DOI":"10.13039\/501100005416","name":"Research Council of Norway","doi-asserted-by":"publisher","award":["313932"],"award-info":[{"award-number":["313932"]}],"id":[{"id":"10.13039\/501100005416","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100008730","name":"Norwegian Cancer Society","doi-asserted-by":"publisher","award":["214871"],"award-info":[{"award-number":["214871"]}],"id":[{"id":"10.13039\/100008730","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,10,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Gene co-expression measurements are widely used in computational biology to identify coordinated expression patterns across a group of samples. Coordinated expression of genes may indicate that they are controlled by the same transcriptional regulatory program, or involved in common biological processes. Gene co-expression is generally estimated from RNA-Sequencing data, which are commonly normalized to remove technical variability. Here, we demonstrate that certain normalization methods, in particular quantile-based methods, can introduce false-positive associations between genes. These false-positive associations can consequently hamper downstream co-expression network analysis. Quantile-based normalization can, however, be extremely powerful. In particular, when preprocessing large-scale heterogeneous data, quantile-based normalization methods such as smooth quantile normalization can be applied to remove technical variability while maintaining global differences in expression for samples with different biological attributes.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We developed SNAIL (Smooth-quantile Normalization Adaptation for the Inference of co-expression Links), a normalization method based on smooth quantile normalization specifically designed for modeling of co-expression measurements. We show that SNAIL avoids formation of false-positive associations in co-expression as well as in downstream network analyses. Using SNAIL, one can avoid arbitrary gene filtering and retain associations to genes that only express in small subgroups of samples. This highlights the method\u2019s potential future impact on network modeling and other association-based approaches in large-scale heterogeneous data.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The implementation of the SNAIL algorithm and code to reproduce the analyses described in this work can be found in the GitHub repository https:\/\/github.com\/kuijjerlab\/PySNAIL.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad610","type":"journal-article","created":{"date-parts":[[2023,10,6]],"date-time":"2023-10-06T22:27:06Z","timestamp":1696631226000},"source":"Crossref","is-referenced-by-count":7,"title":["Adjustment of spurious correlations in co-expression measurements from RNA-Sequencing data"],"prefix":"10.1093","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3054-1409","authenticated-orcid":false,"given":"Ping-Han","family":"Hsieh","sequence":"first","affiliation":[{"name":"Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo , Oslo 0318, Norway"},{"name":"Department of Informatics, University of Oslo , Oslo 0316, Norway"}]},{"given":"Camila Miranda","family":"Lopes-Ramos","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Harvard T.H. Chan School of Public Health , Boston, MA 02115, United States"},{"name":"Department of Medicine, Harvard Medical School, Boston, MA 02115 , USA"},{"name":"Channing Division of Network Medicine, Brigham and Women's Hospital , Boston, MA 02115, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1317-7422","authenticated-orcid":false,"given":"Manuela","family":"Zucknick","sequence":"additional","affiliation":[{"name":"Oslo Centre for Biostatistics and Epidemiology, Institute of Basic Medical Sciences, University of Oslo , Oslo 0317, Norway"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4959-1409","authenticated-orcid":false,"given":"Geir Kjetil","family":"Sandve","sequence":"additional","affiliation":[{"name":"Department of Informatics, University of Oslo , Oslo 0316, Norway"}]},{"given":"Kimberly","family":"Glass","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Harvard T.H. Chan School of Public Health , Boston, MA 02115, United States"},{"name":"Channing Division of Network Medicine, Brigham and Women's Hospital , Boston, MA 02115, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6280-3130","authenticated-orcid":false,"given":"Marieke Lydia","family":"Kuijjer","sequence":"additional","affiliation":[{"name":"Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo , Oslo 0318, Norway"},{"name":"Department of Pathology, Leiden University Medical Center , Leiden 2300RC, The Netherlands"},{"name":"Leiden Center of Computational Oncology, Leiden University Medical Center ,Leiden 2300RC, The Netherlands"}]}],"member":"286","published-online":{"date-parts":[[2023,10,6]]},"reference":[{"key":"2023102510312207200_btad610-B1","first-page":"R106","article-title":"Differential expression analysis for sequence count data","volume":"11","author":"Anders","year":"2010","journal-title":"Nat Prec"},{"key":"2023102510312207200_btad610-B2","doi-asserted-by":"crossref","first-page":"5274","DOI":"10.1038\/s41467-019-13345-5","article-title":"Personalised analytics for rare disease diagnostics","volume":"10","author":"Anderson","year":"2019","journal-title":"Nat Commun"},{"key":"2023102510312207200_btad610-B3","doi-asserted-by":"crossref","first-page":"648","DOI":"10.1126\/science.1262110","article-title":"The genotype-tissue expression (GTEx) pilot analysis: multitissue gene regulation in humans","volume":"348","author":"Ardlie","year":"2015","journal-title":"Science"},{"key":"2023102510312207200_btad610-B4","article-title":"Encodexplorer: a compilation of encode metadata","volume":"1","author":"Beauparlant","year":"2015","journal-title":"R Package Version"},{"key":"2023102510312207200_btad610-B5","doi-asserted-by":"crossref","first-page":"1177","DOI":"10.1016\/j.cell.2017.05.038","article-title":"An expanded view of complex traits: from polygenic to omnigenic","volume":"169","author":"Boyle","year":"2017","journal-title":"Cell"},{"key":"2023102510312207200_btad610-B6","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1038\/nbt.3838","article-title":"Reproducible RNA-seq analysis using recount2","volume":"35","author":"Collado-Torres","year":"2017","journal-title":"Nat Biotechnol"},{"key":"2023102510312207200_btad610-B7","doi-asserted-by":"crossref","first-page":"1184","DOI":"10.1038\/nprot.2009.97","article-title":"Mapping identifiers for the integration of genomic datasets with the R\/bioconductor package biomart","volume":"4","author":"Durinck","year":"2009","journal-title":"Nat Protoc"},{"key":"2023102510312207200_btad610-B8","doi-asserted-by":"crossref","first-page":"776","DOI":"10.1093\/bib\/bbx008","article-title":"Selecting between-sample RNA-seq normalization methods from the perspective of their assumptions","volume":"19","author":"Evans","year":"2018","journal-title":"Brief Bioinform"},{"key":"2023102510312207200_btad610-B9","doi-asserted-by":"crossref","first-page":"e64832","DOI":"10.1371\/journal.pone.0064832","article-title":"Passing messages between biological networks to refine predicted interactions","volume":"8","author":"Glass","year":"2013","journal-title":"PLoS One"},{"key":"2023102510312207200_btad610-B10","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1186\/1471-2164-12-23","article-title":"Genome-wide patterns of promoter sharing and co-expression in bovine skeletal muscle","volume":"12","author":"Gu","year":"2011","journal-title":"BMC Genomics"},{"key":"2023102510312207200_btad610-B11","doi-asserted-by":"crossref","first-page":"1663","DOI":"10.1261\/rna.048025.114","article-title":"Integrated network analysis reveals distinct regulatory roles of transcription factors and microRNAs","volume":"22","author":"Guo","year":"2016","journal-title":"RNA"},{"key":"2023102510312207200_btad610-B12","author":"Hagberg","year":"2008"},{"key":"2023102510312207200_btad610-B13","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1186\/1752-0509-6-145","article-title":"TIGRESS: trustful inference of gene regulation using stability selection","volume":"6","author":"Haury","year":"2012","journal-title":"BMC Syst Biol"},{"key":"2023102510312207200_btad610-B14","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1186\/s12915-020-00846-9","article-title":"LSTrAP-crowd: prediction of novel components of bacterial ribosomes with crowd-sourced analysis of RNA sequencing data","volume":"18","author":"Hew","year":"2020","journal-title":"BMC Biol"},{"key":"2023102510312207200_btad610-B15","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1093\/biostatistics\/kxx028","article-title":"Smooth quantile normalization","volume":"19","author":"Hicks","year":"2018","journal-title":"Biostatistics"},{"key":"2023102510312207200_btad610-B16","doi-asserted-by":"crossref","first-page":"e12776","DOI":"10.1371\/journal.pone.0012776","article-title":"Inferring regulatory networks from expression data using tree-based methods","volume":"5","author":"Irrthum","year":"2010","journal-title":"PLoS One"},{"key":"2023102510312207200_btad610-B17","doi-asserted-by":"crossref","first-page":"D457","DOI":"10.1093\/nar\/gkv1070","article-title":"KEGG as a reference resource for gene and protein annotation","volume":"44","author":"Kanehisa","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023102510312207200_btad610-B18","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1145\/345966.345982","article-title":"Hubs, authorities, and communities","volume":"31","author":"Kleinberg","year":"1999","journal-title":"ACM Comput Surv"},{"key":"2023102510312207200_btad610-B19","doi-asserted-by":"crossref","first-page":"1003","DOI":"10.1186\/s12885-019-6235-7","article-title":"lionessR: single sample network inference in R","volume":"19","author":"Kuijjer","year":"2019","journal-title":"BMC Cancer"},{"key":"2023102510312207200_btad610-B20","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1016\/j.isci.2019.03.021","article-title":"Estimating sample-specific regulatory networks","volume":"14","author":"Kuijjer","year":"2019","journal-title":"Iscience"},{"key":"2023102510312207200_btad610-B21","doi-asserted-by":"crossref","first-page":"4765","DOI":"10.1093\/bioinformatics\/btaa571","article-title":"PUMA: PANDA using microrna associations","volume":"36","author":"Kuijjer","year":"2020","journal-title":"Bioinformatics"},{"key":"2023102510312207200_btad610-B22","doi-asserted-by":"crossref","first-page":"2233","DOI":"10.1093\/bioinformatics\/btw216","article-title":"ARACNe-AP: gene network reverse engineering through adaptive partitioning inference of mutual information","volume":"32","author":"Lachmann","year":"2016","journal-title":"Bioinformatics"},{"key":"2023102510312207200_btad610-B23","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1186\/1471-2105-9-559","article-title":"WGCNA: an R package for weighted correlation network analysis","volume":"9","author":"Langfelder","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023102510312207200_btad610-B24","doi-asserted-by":"crossref","first-page":"5538","DOI":"10.1158\/0008-5472.CAN-18-0454","article-title":"Gene regulatory network analysis identifies sex-linked differences in Colon cancer drug metabolism","volume":"78","author":"Lopes-Ramos","year":"2018","journal-title":"Cancer Res"},{"key":"2023102510312207200_btad610-B25","doi-asserted-by":"crossref","first-page":"107795","DOI":"10.1016\/j.celrep.2020.107795","article-title":"Sex differences in gene expression and regulatory networks across 29 human tissues","volume":"31","author":"Lopes-Ramos","year":"2020","journal-title":"Cell Rep"},{"key":"2023102510312207200_btad610-B26","doi-asserted-by":"crossref","first-page":"5401","DOI":"10.1158\/0008-5472.CAN-21-0730","article-title":"Regulatory network of PD1 signaling is associated with prognosis in glioblastoma multiforme","volume":"81","author":"Lopes-Ramos","year":"2021","journal-title":"Cancer Res"},{"key":"2023102510312207200_btad610-B27","doi-asserted-by":"crossref","first-page":"2473","DOI":"10.1093\/bioinformatics\/btp462","article-title":"Relationship between gene co-expression and sharing of transcription factor binding sites in Drosophila melanogaster","volume":"25","author":"Marco","year":"2009","journal-title":"Bioinformatics"},{"key":"2023102510312207200_btad610-B28","doi-asserted-by":"crossref","first-page":"79879","DOI":"10.1155\/2007\/79879","article-title":"Information-theoretic inference of large transcriptional regulatory networks","volume":"2007","author":"Meyer","year":"2007","journal-title":"EURASIP J Bioinform Syst Biol"},{"key":"2023102510312207200_btad610-B29","doi-asserted-by":"crossref","first-page":"33","DOI":"10.12688\/f1000research.29032.1","article-title":"Sustainable data analysis with snakemake","volume":"10","author":"M\u00f6lder","year":"2021","journal-title":"F1000Res"},{"key":"2023102510312207200_btad610-B30","doi-asserted-by":"crossref","first-page":"3066","DOI":"10.1093\/bioinformatics\/btv305","article-title":"Coregnet: reconstruction and integrated analysis of co-regulatory networks","volume":"31","author":"Nicolle","year":"2015","journal-title":"Bioinformatics"},{"key":"2023102510312207200_btad610-B31","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1186\/s12859-017-1847-x","article-title":"Tissue-aware RNA-seq processing and normalization for heterogeneous and sparse data","volume":"18","author":"Paulson","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"2023102510312207200_btad610-B32","doi-asserted-by":"crossref","first-page":"i197","DOI":"10.1093\/bioinformatics\/btv268","article-title":"Integrative random Forest for gene regulatory network inference","volume":"31","author":"Petralia","year":"2015","journal-title":"Bioinformatics"},{"key":"2023102510312207200_btad610-B33","doi-asserted-by":"crossref","first-page":"e1004220","DOI":"10.1371\/journal.pcbi.1004220","article-title":"Sharing and specificity of co-expression networks across 35 human tissues","volume":"11","author":"Pierson","year":"2015","journal-title":"PLoS Comput Biol"},{"key":"2023102510312207200_btad610-B34","doi-asserted-by":"crossref","first-page":"e87","DOI":"10.1093\/nar\/gkv300","article-title":"cMonkey2: automated, systematic, integrated detection of co-regulated gene modules for any organism","volume":"43","author":"Reiss","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023102510312207200_btad610-B35","doi-asserted-by":"crossref","first-page":"e47","DOI":"10.1093\/nar\/gkv007","article-title":"limma powers differential expression analyses for RNA-sequencing and microarray studies","volume":"43","author":"Ritchie","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023102510312207200_btad610-B36","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2010-11-3-r25","article-title":"A scaling normalization method for differential expression analysis of rna-seq data","volume":"11","author":"Robinson","year":"2010","journal-title":"Genome Biol"},{"key":"2023102510312207200_btad610-B37","doi-asserted-by":"crossref","first-page":"1843","DOI":"10.1101\/gr.216721.116","article-title":"Co-expression networks reveal the tissue-specific regulation of transcription and splicing","volume":"27","author":"Saha","year":"2017","journal-title":"Genome Res"},{"key":"2023102510312207200_btad610-B38","doi-asserted-by":"crossref","first-page":"e1489","DOI":"10.1002\/wsbm.1489","article-title":"Molecular networks in network medicine: development and applications","volume":"12","author":"Silverman","year":"2020","journal-title":"Wiley Interdiscip Rev Syst Biol Med"},{"key":"2023102510312207200_btad610-B39","doi-asserted-by":"crossref","first-page":"1077","DOI":"10.1016\/j.celrep.2017.10.001","article-title":"Understanding tissue-specific gene regulation","volume":"21","author":"Sonawane","year":"2017","journal-title":"Cell Rep"},{"key":"2023102510312207200_btad610-B40","doi-asserted-by":"crossref","first-page":"418","DOI":"10.1186\/gb-2012-13-8-418","article-title":"An encyclopedia of mouse dna elements (mouse encode)","volume":"13","author":"Stamatoyannopoulos","year":"2012","journal-title":"Genome Biol"},{"key":"2023102510312207200_btad610-B41","doi-asserted-by":"crossref","first-page":"D607","DOI":"10.1093\/nar\/gky1131","article-title":"String v11: protein\u2013protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets","volume":"47","author":"Szklarczyk","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023102510312207200_btad610-B42","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1101\/gr.227124.117","article-title":"Mapping transcription factor occupancy using minimal numbers of cells in vitro and in vivo","volume":"28","author":"Tosti","year":"2018","journal-title":"Genome Res"},{"key":"2023102510312207200_btad610-B43","doi-asserted-by":"crossref","first-page":"1113","DOI":"10.1038\/ng.2764","article-title":"The cancer genome atlas pan-cancer analysis project","volume":"45","author":"Weinstein","year":"2013","journal-title":"Nat Genet"},{"key":"2023102510312207200_btad610-B44","doi-asserted-by":"crossref","first-page":"2217","DOI":"10.1038\/s41467-021-22448-x","article-title":"A compendium and comparative epigenomics analysis of cis-regulatory elements in the pig genome","volume":"12","author":"Zhao","year":"2021","journal-title":"Nat Commun"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad610\/51925350\/btad610.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/10\/btad610\/52516291\/btad610.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/10\/btad610\/52516291\/btad610.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,25]],"date-time":"2023-10-25T06:32:10Z","timestamp":1698215530000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad610\/7295542"}},"subtitle":[],"editor":[{"given":"Valentina","family":"Boeva","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,10,1]]},"references-count":44,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2023,10,3]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad610","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.03.25.436972","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,10,1]]},"published":{"date-parts":[[2023,10,1]]},"article-number":"btad610"}}