{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T04:08:37Z","timestamp":1772770117658,"version":"3.50.1"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2020,10,18]],"date-time":"2020-10-18T00:00:00Z","timestamp":1602979200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Spanish Ministry of Economy and Competitiveness","award":["TEC2014-60337-R"],"award-info":[{"award-number":["TEC2014-60337-R"]}]},{"name":"Spanish Ministry of Economy and Competitiveness","award":["DPI2017-89827-R"],"award-info":[{"award-number":["DPI2017-89827-R"]}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01GM104400"],"award-info":[{"award-number":["R01GM104400"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Networking Biomedical Research Centre"},{"name":"Bioengineering, Biomaterials and Nanomedicine"},{"name":"Instituto de Investigaci\u00f3n Carlos III"},{"name":"Share4Rare","award":["780262"],"award-info":[{"award-number":["780262"]}]},{"name":"B2SLab","award":["SGR 952"],"award-info":[{"award-number":["SGR 952"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,5,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Network diffusion and label propagation are fundamental tools in computational biology, with applications like gene\u2013disease association, protein function prediction and module discovery. More recently, several publications have introduced a permutation analysis after the propagation process, due to concerns that network topology can bias diffusion scores. This opens the question of the statistical properties and the presence of bias of such diffusion processes in each of its applications. In this work, we characterized some common null models behind the permutation analysis and the statistical properties of the diffusion scores. We benchmarked seven diffusion scores on three case studies: synthetic signals on a yeast interactome, simulated differential gene expression on a protein\u2013protein interaction network and prospective gene set prediction on another interaction network. For clarity, all the datasets were based on binary labels, but we also present theoretical results for quantitative labels.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Diffusion scores starting from binary labels were affected by the label codification and exhibited a problem-dependent topological bias that could be removed by the statistical normalization. Parametric and non-parametric normalization addressed both points by being codification-independent and by equalizing the bias. We identified and quantified two sources of bias\u2014mean value and variance\u2014that yielded performance differences when normalizing the scores. We provided closed formulae for both and showed how the null covariance is related to the spectral properties of the graph. Despite none of the proposed scores systematically outperformed the others, normalization was preferred when the sought positive labels were not aligned with the bias. We conclude that the decision on bias removal should be problem and data-driven, i.e. based on a quantitative analysis of the bias and its relation to the positive entities.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability<\/jats:title>\n                    <jats:p>The code is publicly available at https:\/\/github.com\/b2slab\/diffuBench and the data underlying this article are available at https:\/\/github.com\/b2slab\/retroData<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa896","type":"journal-article","created":{"date-parts":[[2020,10,7]],"date-time":"2020-10-07T08:02:40Z","timestamp":1602057760000},"page":"845-852","source":"Crossref","is-referenced-by-count":4,"title":["The effect of statistical normalization on network propagation scores"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6426-8204","authenticated-orcid":false,"given":"Sergio","family":"Picart-Armada","sequence":"first","affiliation":[{"name":"B2SLab, Departament d\u2019Enginyeria de Sistemes, Autom\u00e0tica i Inform\u00e0tica Industrial, Universitat Polit\u00e8cnica de Catalunya, CIBER-BBN , Barcelona, 08028, Spain"},{"name":"Esplugues de Llobregat, Institut de Recerca Pedi\u00e0trica Hospital Sant Joan de D\u00e9u , Barcelona, 08950, Spain"}]},{"given":"Wesley K","family":"Thompson","sequence":"additional","affiliation":[{"name":"Mental Health Center Sct. Hans , 4000 Roskilde, Denmark"},{"name":"Department of Family Medicine and Public Health, University of California , San Diego, La Jolla, CA, USA"}]},{"given":"Alfonso","family":"Buil","sequence":"additional","affiliation":[{"name":"Mental Health Center Sct. Hans , 4000 Roskilde, Denmark"}]},{"given":"Alexandre","family":"Perera-Lluna","sequence":"additional","affiliation":[{"name":"B2SLab, Departament d\u2019Enginyeria de Sistemes, Autom\u00e0tica i Inform\u00e0tica Industrial, Universitat Polit\u00e8cnica de Catalunya, CIBER-BBN , Barcelona, 08028, Spain"},{"name":"Esplugues de Llobregat, Institut de Recerca Pedi\u00e0trica Hospital Sant Joan de D\u00e9u , Barcelona, 08950, Spain"}]}],"member":"286","published-online":{"date-parts":[[2020,10,18]]},"reference":[{"key":"2023051705204011100_btaa896-B8976749","year":"1999","journal-title":"The PageRank Citation Ranking: Bringing Order to the Web.\u00a0Stanford InfoLab, Stanford"},{"key":"2023051705204011100_btaa896-B1","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1016\/j.cell.2005.04.020","article-title":"Systems biology: its practice and challenges","volume":"121","author":"Aderem","year":"2005","journal-title":"Cell"},{"key":"2023051705204011100_btaa896-B2","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nrg2918","article-title":"Network medicine: a network-based approach to human disease","volume":"12","author":"Barab\u00e1si","year":"2011","journal-title":"Nat. Rev. Genet"},{"key":"2023051705204011100_btaa896-B3","doi-asserted-by":"crossref","first-page":"34841","DOI":"10.1038\/srep34841","article-title":"Network diffusion-based analysis of high-throughput data for the detection of differentially enriched modules","volume":"6","author":"Bersanelli","year":"2016","journal-title":"Sci. Rep"},{"key":"2023051705204011100_btaa896-B4","doi-asserted-by":"crossref","first-page":"4","DOI":"10.3389\/fgene.2019.00004","article-title":"Comparative analysis of normalization methods for network propagation","volume":"10","author":"Biran","year":"2019","journal-title":"Front. Genet"},{"key":"2023051705204011100_btaa896-B5","doi-asserted-by":"crossref","first-page":"i219","DOI":"10.1093\/bioinformatics\/btu263","article-title":"New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence","volume":"30","author":"Cao","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051705204011100_btaa896-B6","doi-asserted-by":"crossref","first-page":"D369","DOI":"10.1093\/nar\/gkw1102","article-title":"The biogrid interaction database: 2017 update","volume":"45","author":"Chatr-Aryamontri","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023051705204011100_btaa896-B7","doi-asserted-by":"crossref","first-page":"2771","DOI":"10.1182\/blood-2003-09-3243","article-title":"Gene expression profile of adult t-cell acute lymphocytic leukemia identifies distinct subsets of patients with different response to therapy and survival","volume":"103","author":"Chiaretti","year":"2004","journal-title":"Blood"},{"key":"2023051705204011100_btaa896-B8","doi-asserted-by":"crossref","first-page":"551","DOI":"10.1038\/nrg.2017.38","article-title":"Network propagation: a universal amplifier of genetic associations","volume":"18","author":"Cowen","year":"2017","journal-title":"Nat. Rev. Genet"},{"key":"2023051705204011100_btaa896-B9","author":"Csardi","year":"2015"},{"key":"2023051705204011100_btaa896-B10","doi-asserted-by":"crossref","first-page":"e73074","DOI":"10.1371\/journal.pone.0073074","article-title":"Network and data integration for biomarker signature discovery via network smoothed T-statistics","volume":"8","author":"Cun","year":"2013","journal-title":"PLoS One"},{"key":"2023051705204011100_btaa896-B11","doi-asserted-by":"publisher","author":"Dittrich","year":"2010","DOI":"10.18129\/B9.bioc.DLBCL"},{"key":"2023051705204011100_btaa896-B12","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1186\/1756-0381-4-19","article-title":"DADA: degree-aware algorithms for network-based disease gene prioritization","volume":"4","author":"Erten","year":"2011","journal-title":"BioData Mining"},{"key":"2023051705204011100_btaa896-B13","doi-asserted-by":"crossref","first-page":"855","DOI":"10.1145\/2939672.2939754","volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Grover","year":"2016"},{"key":"2023051705204011100_btaa896-B14","article-title":"A large-scale benchmark of gene prioritization methods","volume":"7, 46598","author":"Guala","year":"2017","journal-title":"Sci. Rep"},{"key":"2023051705204011100_btaa896-B15","doi-asserted-by":"crossref","first-page":"e1007403","DOI":"10.1371\/journal.pcbi.1007403","article-title":"Benchmarking network algorithms for contextualizing genes of interest","volume":"15","author":"Hill","year":"2019","journal-title":"PLoS Comput. Biol"},{"key":"2023051705204011100_btaa896-B16","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1145\/3308558.3313483","volume-title":"The World Wide Web Conference","author":"Ibrahim","year":"2019"},{"key":"2023051705204011100_btaa896-B17","doi-asserted-by":"crossref","first-page":"1829","DOI":"10.1093\/bioinformatics\/btx029","article-title":"Aptrank: an adaptive pagerank model for protein function prediction on bi-relational graphs","volume":"33","author":"Jiang","year":"2017","journal-title":"Bioinformatics"},{"key":"2023051705204011100_btaa896-B18","doi-asserted-by":"crossref","first-page":"D353","DOI":"10.1093\/nar\/gkw1092","article-title":"Kegg: new perspectives on genomes, pathways, diseases and drugs","volume":"45","author":"Kanehisa","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023051705204011100_btaa896-B19","doi-asserted-by":"crossref","first-page":"1109","DOI":"10.1101\/gr.118992.110","article-title":"Prioritizing candidate disease genes by network-based boosting of genome-wide association data","volume":"21","author":"Lee","year":"2011","journal-title":"Genome Res"},{"key":"2023051705204011100_btaa896-B20","doi-asserted-by":"publisher","author":"Li","year":"2009","DOI":"10.18129\/B9.bioc.ALL"},{"key":"2023051705204011100_btaa896-B21","doi-asserted-by":"crossref","first-page":"1645","DOI":"10.1021\/acs.jcim.8b00663","article-title":"Evaluation of cross-validation strategies in sequence-based binding prediction using deep learning","volume":"59","author":"Lopez-del Rio","year":"2019","journal-title":"J. Chem. Inform. Model"},{"key":"2023051705204011100_btaa896-B22","doi-asserted-by":"crossref","first-page":"D411","DOI":"10.1093\/nar\/gkj141","article-title":"Human protein reference database 2006 update","volume":"34","author":"Mishra","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023051705204011100_btaa896-B23","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1038\/nrg3552","article-title":"Integrative approaches for finding modular structure in biological networks","volume":"14","author":"Mitra","year":"2013","journal-title":"Nat. Rev. Genet"},{"key":"2023051705204011100_btaa896-B24","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/gb-2008-9-s1-s4","article-title":"GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function","volume":"9","author":"Mostafavi","year":"2008","journal-title":"Genome Biol"},{"key":"2023051705204011100_btaa896-B25","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1038\/35001165","article-title":"Guilt-by-association goes global","volume":"403","author":"Oliver","year":"2000","journal-title":"Nature"},{"key":"2023051705204011100_btaa896-B26","doi-asserted-by":"crossref","first-page":"2757","DOI":"10.1093\/bioinformatics\/btt471","article-title":"Discovering causal pathways linking genomic events to transcriptional states using Tied Diffusion Through Interacting Events (TieDIE)","volume":"29","author":"Paull","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051705204011100_btaa896-B27","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1093\/bioinformatics\/btx632","article-title":"diffuStats: an R package to compute diffusion-based scores on biological networks","volume":"34","author":"Picart-Armada","year":"2017","journal-title":"Bioinformatics"},{"key":"2023051705204011100_btaa896-B28","doi-asserted-by":"crossref","first-page":"e0189012","DOI":"10.1371\/journal.pone.0189012","article-title":"Null diffusion-based enrichment for metabolomics data","volume":"12","author":"Picart-Armada","year":"2017","journal-title":"PLoS One"},{"key":"2023051705204011100_btaa896-B29","doi-asserted-by":"crossref","first-page":"e1007276","DOI":"10.1371\/journal.pcbi.1007276","article-title":"Benchmarking network propagation methods for disease gene identification","volume":"15","author":"Picart-Armada","year":"2019","journal-title":"PLoS Comput. Biol"},{"key":"2023051705204011100_btaa896-B30","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1093\/bioinformatics\/bti069","article-title":"Inferring pathways from gene lists using a literature-derived network of biological relationships","volume":"21","author":"Rajagopalan","year":"2005","journal-title":"Bioinformatics"},{"key":"2023051705204011100_btaa896-B31","doi-asserted-by":"crossref","first-page":"1937","DOI":"10.1056\/NEJMoa012914","article-title":"The use of molecular profiling to predict survival after chemotherapy for diffuse large-b-cell lymphoma","volume":"346","author":"Rosenwald","year":"2002","journal-title":"N. Engl. J. Med"},{"key":"2023051705204011100_btaa896-B32","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1093\/bioinformatics\/btw570","article-title":"Precrec: fast and accurate precision\u2013recall and ROC curve calculations in R","volume":"33","author":"Saito","year":"2017","journal-title":"Bioinformatics"},{"key":"2023051705204011100_btaa896-B33","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1038\/msb4100129","article-title":"Network-based prediction of protein function","volume":"3","author":"Sharan","year":"2007","journal-title":"Mol. Syst. Biol"},{"key":"2023051705204011100_btaa896-B34","first-page":"144","author":"Smola","year":"2003"},{"key":"2023051705204011100_btaa896-B35","doi-asserted-by":"crossref","first-page":"919","DOI":"10.1093\/bib\/bbz042","article-title":"Graph convolutional networks for computational drug development and discovery","volume":"21","author":"Sun","year":"2020","journal-title":"Brief. Bioinformatics"},{"key":"2023051705204011100_btaa896-B36","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1016\/j.artmed.2014.03.003","article-title":"An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods","volume":"61","author":"Valentini","year":"2014","journal-title":"Artif. Intell. Med"},{"key":"2023051705204011100_btaa896-B37","doi-asserted-by":"crossref","first-page":"506","DOI":"10.1007\/978-3-642-12683-3_33","article-title":"Algorithms for detecting significantly mutated pathways in cancer","author":"Vandin","year":"2010","journal-title":"Lect. Notes Comput. Sci"},{"key":"2023051705204011100_btaa896-B38","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1038\/nature750","article-title":"Comparative assessment of large-scale data sets of protein\u2013protein interactions","volume":"417","author":"Von Mering","year":"2002","journal-title":"Nature"},{"key":"2023051705204011100_btaa896-B39","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2700381","article-title":"Graph-based label propagation in digital media: a review","volume":"47","author":"Zoidi","year":"2015","journal-title":"ACM Comput. Surveys"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa896\/33976081\/btaa896.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/6\/845\/50356756\/btaa896.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/6\/845\/50356756\/btaa896.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T01:22:41Z","timestamp":1684286561000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/6\/845\/5929688"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,10,18]]},"references-count":40,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2021,5,5]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa896","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.01.20.911842","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,3,15]]},"published":{"date-parts":[[2020,10,18]]}}}