{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T12:16:40Z","timestamp":1774527400047,"version":"3.50.1"},"reference-count":63,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T00:00:00Z","timestamp":1675296000000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"AMDROMA \u2018Algorithmic and Mechanism Design Research in Online Markets\u2019"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,2,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Gene\u2013disease associations are fundamental for understanding disease etiology and developing effective interventions and treatments. Identifying genes not yet associated with a disease due to a lack of studies is a challenging task in which prioritization based on prior knowledge is an important element. The computational search for new candidate disease genes may be eased by positive-unlabeled learning, the machine learning (ML) setting in which only a subset of instances are labeled as positive while the rest of the dataset is unlabeled. In this work, we propose a set of effective network-based features to be used in a novel Markov diffusion-based multi-class labeling strategy for putative disease gene discovery.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>The performances of the new labeling algorithm and the effectiveness of the proposed features have been tested on 10 different disease datasets using three ML algorithms. The new features have been compared against classical topological and functional\/ontological features and a set of network- and biological-derived features already used in gene discovery tasks. The predictive power of the integrated methodology in searching for new disease genes has been found to be competitive against state-of-the-art algorithms.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The source code of NIAPU can be accessed at https:\/\/github.com\/AndMastro\/NIAPU. The source data used in this study are available online on the respective websites.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac848","type":"journal-article","created":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T09:42:51Z","timestamp":1675330971000},"source":"Crossref","is-referenced-by-count":11,"title":["NIAPU: network-informed adaptive positive-unlabeled learning for disease gene identification"],"prefix":"10.1093","volume":"39","author":[{"given":"Paola","family":"Stolfi","sequence":"first","affiliation":[{"name":"Institute for Applied Computing (IAC) \u2018Mauro Picone\u2019, National Research Council of Italy (CNR) , Rome 00185, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3456-9428","authenticated-orcid":false,"given":"Andrea","family":"Mastropietro","sequence":"additional","affiliation":[{"name":"Department of Computer, Control and Management Engineering (DIAG) \u2018Antonio Ruberti\u2019, Sapienza University of Rome, Rome 00185, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Giuseppe","family":"Pasculli","sequence":"additional","affiliation":[{"name":"Department of Computer, Control and Management Engineering (DIAG) \u2018Antonio Ruberti\u2019, Sapienza University of Rome, Rome 00185, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3635-7664","authenticated-orcid":false,"given":"Paolo","family":"Tieri","sequence":"additional","affiliation":[{"name":"Institute for Applied Computing (IAC) \u2018Mauro Picone\u2019, National Research Council of Italy (CNR) , Rome 00185, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Davide","family":"Vergni","sequence":"additional","affiliation":[{"name":"Institute for Applied Computing (IAC) \u2018Mauro Picone\u2019, National Research Council of Italy (CNR) , Rome 00185, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2023,2,2]]},"reference":[{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"026103","DOI":"10.1103\/PhysRevE.73.026103","article-title":"Ring structures and mean first passage time in networks","volume":"73","author":"Baronchelli","year":"2006","journal-title":"Phys. Rev. E Stat. Nonlin. Soft Matter Phys"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1007\/s10994-020-05877-5","article-title":"Learning from positive and unlabeled data: a survey","volume":"109","author":"Bekker","year":"2020","journal-title":"Mach. Learn"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"253128","DOI":"10.1155\/2014\/253128","article-title":"A knowledge-driven approach to extract disease-related biomarkers from the literature","volume":"2014","author":"Bravo","year":"2014","journal-title":"Biomed. Res. Int"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-015-0472-9","article-title":"Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research","volume":"16","author":"Bravo","year":"2015","journal-title":"BMC Bioinformatics"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1186\/1471-2105-9-207","article-title":"Extraction of semantic biomedical relations from text using conditional random fields","volume":"9","author":"Bundschus","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023021615070613300_","first-page":"1845","author":"Bundschus","year":"2010"},{"key":"2023021615070613300_","first-page":"61","author":"Can","year":"2005"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e1005598","DOI":"10.1371\/journal.pcbi.1005598","article-title":"Network propagation in the cytoscape cyberinfrastructure","volume":"13","author":"Carlin","year":"2017","journal-title":"PLoS Comput. Biol"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-14-S18-S1","article-title":"Enrichr: interactive and collaborative html5 gene list enrichment analysis tool","volume":"14","author":"Chen","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1186\/1471-2105-10-73","article-title":"Disease candidate gene identification and prioritization using protein interaction networks","volume":"10","author":"Chen","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e6875","DOI":"10.1371\/journal.pone.0006875","article-title":"Apoptotic engulfment pathway and schizophrenia","volume":"4","author":"Chen","year":"2009","journal-title":"PLoS ONE"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1016\/j.neucom.2014.10.081","article-title":"A robust ensemble approach to learn from positive and unlabeled data using SVM base models","volume":"160","author":"Claesen","year":"2015","journal-title":"Neurocomputing"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"7167","DOI":"10.1038\/s41598-018-25408-6","article-title":"An initial melanoma diagnosis may increase the subsequent risk of prostate cancer: results from the New South Wales cancer registry","volume":"8","author":"Cole-Clark","year":"2018","journal-title":"Sci. Rep"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-vector networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. Learn"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1176\/ajp.149.4.443","article-title":"Depression and Parkinson\u2019s disease: a review","volume":"149","author":"Cummings","year":"1992","journal-title":"Am. J. Psychiatry"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1016\/j.tig.2021.09.005","article-title":"Every gene can (and possibly will) be associated with cancer","volume":"38","author":"De Magalh\u00e3es","year":"2022","journal-title":"Trends Genet"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"429","DOI":"10.1002\/wsbm.1177","article-title":"Recent approaches to the prioritization of candidate disease genes","volume":"4","author":"Doncheva","year":"2012","journal-title":"Wiley Interdiscip. Rev. Syst. Biol. Med"},{"key":"2023021615070613300_","first-page":"155","article-title":"Support vector regression machines","volume":"9","author":"Drucker","year":"1997","journal-title":"Adv. Neural Inform. Process. Syst"},{"key":"2023021615070613300_","first-page":"213","author":"Elkan","year":"2008"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"1575","DOI":"10.1093\/nar\/30.7.1575","article-title":"An efficient algorithm for large-scale detection of protein families","volume":"30","author":"Enright","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1111\/adb.12589","article-title":"Sex hormones in alcohol consumption: a systematic review of evidence","volume":"24","author":"Erol","year":"2019","journal-title":"Addict. Biol"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"8485","DOI":"10.1038\/s41598-018-26468-4","article-title":"The role of glycosyltransferase enzyme GCNT3 in colon and ovarian cancer prognosis and chemoresistance","volume":"8","author":"Fern\u00e1ndez","year":"2018","journal-title":"Sci. Rep"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e1004120","DOI":"10.1371\/journal.pcbi.1004120","article-title":"A DIseAse MOdule detection (DIAMOnD) algorithm derived from a systematic analysis of connectivity patterns of disease proteins in the human interactome","volume":"11","author":"Ghiassian","year":"2015","journal-title":"PLoS Comput. Biol"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e43557","DOI":"10.1371\/journal.pone.0043557","article-title":"Exploiting protein\u2013protein interaction networks for genome-wide disease\u2013gene prioritization","volume":"7","author":"Guney","year":"2012","journal-title":"PLoS ONE"},{"key":"2023021615070613300_","author":"Hastie","year":"2001"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"2909","DOI":"10.3934\/mbe.2021147","article-title":"Network diffusion with centrality measures to identify disease-related genes","volume":"18","author":"Janyasupab","year":"2021","journal-title":"Math. Biosci. Eng"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1016\/j.physa.2018.05.128","article-title":"A biased least squares support vector machine based on Mahalanobis distance for Pu learning","volume":"509","author":"Ke","year":"2018","journal-title":"Phys. A Statist. Mech. Appl"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"949","DOI":"10.1016\/j.ajhg.2008.02.013","article-title":"Walking the interactome for prioritization of candidate disease genes","volume":"82","author":"K\u00f6hler","year":"2008","journal-title":"Am. J. Hum. Genet"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"W90","DOI":"10.1093\/nar\/gkw377","article-title":"Enrichr: a comprehensive gene set enrichment analysis web server 2016 update","volume":"44","author":"Kuleshov","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e1007306","DOI":"10.1371\/journal.pgen.1007306","article-title":"One for all and all for one: improving replication of genetic studies through network diffusion","volume":"14","author":"Lancour","year":"2018","journal-title":"PLoS Genet"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"1950","DOI":"10.3390\/antiox10121950","article-title":"Clinical diagnosis and treatment of Leigh syndrome based on surf1: genotype and phenotype","volume":"10","author":"Lee","year":"2021","journal-title":"Antioxidants"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"1219","DOI":"10.1093\/bioinformatics\/btq108","article-title":"Genome-wide inferring gene\u2013phenotype relationship by walking on the heterogeneous network","volume":"26","author":"Li","year":"2010","journal-title":"Bioinformatics"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-11-S1-S20","article-title":"Integration of multiple data sources to prioritize candidate genes using discounted rating system","volume":"11","author":"Li","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023021615070613300_","first-page":"179","author":"Liu","year":"2003"},{"key":"2023021615070613300_","first-page":"e10448","article-title":"Manganese, a likely cause of \u2018Parkinson\u2019s in cirrhosis\u2019, a unique clinical entity of acquired hepatocerebral degeneration","volume":"12","author":"Mehkari","year":"2020","journal-title":"Cureus"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1016\/j.patrec.2013.06.010","article-title":"A bagging SVM to learn from positive and unlabeled examples","volume":"37","author":"Mordelet","year":"2014","journal-title":"Pattern Recogn. Lett"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"i302","DOI":"10.1093\/bioinformatics\/bti1054","article-title":"Whole-proteome prediction of protein function via graph-theoretic analysis of interaction maps","volume":"21","author":"Nabieva","year":"2005","journal-title":"Bioinformatics"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"460","DOI":"10.1186\/1471-2105-11-460","article-title":"Candidate gene prioritization by network analysis of differential expression using machine learning approaches","volume":"11","author":"Nitsch","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"578","DOI":"10.12688\/f1000research.10788.1","article-title":"Recent advances in predicting gene\u2013disease associations","volume":"6","author":"Opap","year":"2017","journal-title":"F1000Research"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"1713","DOI":"10.3390\/genes12111713","article-title":"Moses: a new approach to integrate interactome topology and functional features for disease gene prediction","volume":"12","author":"Petti","year":"2021","journal-title":"Genes"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e1007276","DOI":"10.1371\/journal.pcbi.1007276","article-title":"Benchmarking network propagation methods for disease gene identification","volume":"15","author":"Picart-Armada","year":"2019","journal-title":"PLoS Comput. Biol"},{"key":"2023021615070613300_","article-title":"DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants","author":"Pi\u00f1ero","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023021615070613300_","first-page":"D845","article-title":"The DisGeNET knowledge platform for disease genomics: 2019 update","volume":"48","author":"Pi\u00f1ero","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"678","DOI":"10.1111\/j.1742-4658.2012.08471.x","article-title":"Computational approaches to disease\u2013gene prediction: rationale, classification and successes","volume":"279","author":"Piro","year":"2012","journal-title":"FEBS J"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e1111","DOI":"10.1038\/tp.2017.83","article-title":"Perturbations in the apoptotic pathway and mitochondrial network dynamics in peripheral blood mononuclear cells from bipolar disorder patients","volume":"7","author":"Scaini","year":"2017","journal-title":"Transl. Psychiatry"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"1694","DOI":"10.3390\/biomedicines10071694","article-title":"Network proximity-based drug repurposing strategy for early and late stages of primary biliary cholangitis","volume":"10","author":"Shahini","year":"2022","journal-title":"Biomedicines"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"S106","DOI":"10.1016\/S1590-8658(22)00356-5","article-title":"Network proximity-based drug repurposing strategy for primary biliary cirrhosis","volume":"54","author":"Shahini","year":"2022","journal-title":"Dig. Liver Dis"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1038\/mp.2010.52","article-title":"Altered expression of genes involved in inflammation and apoptosis in frontal cortex in major depression","volume":"16","author":"Shelton","year":"2011","journal-title":"Mol. Psychiatry"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e1489","DOI":"10.1002\/wsbm.1489","article-title":"Molecular networks in network medicine: development and applications","volume":"12","author":"Silverman","year":"2020","journal-title":"Wiley Interdiscip. Rev. Syst. Biol. Med"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"D535","DOI":"10.1093\/nar\/gkj109","article-title":"BioGRID: a general repository for interaction datasets","volume":"34","author":"Stark","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"61","DOI":"10.7150\/ijbs.7.61","article-title":"Prediction of human disease-related gene clusters by clustering analysis","volume":"7","author":"Sun","year":"2011","journal-title":"Int. J. Biol. Sci"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"805","DOI":"10.1016\/B978-0-12-809633-8.20290-2","volume-title":"Encyclopedia of Bioinformatics and Computational Biology","author":"Tieri","year":"2019"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"497","DOI":"10.1093\/bioinformatics\/bty637","article-title":"Random walk with restart on multiplex and heterogeneous biological networks","volume":"35","author":"Valdeolivas","year":"2019","journal-title":"Bioinformatics"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e2011069","DOI":"10.4084\/mjhid.2011.069","article-title":"Incidence of acute myeloid leukemia after breast cancer","volume":"3","author":"Valentini","year":"2011","journal-title":"Mediterr. J. Hematol. Infect. Dis"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"1274","DOI":"10.1093\/bioinformatics\/btm087","article-title":"A new method to measure the semantic similarity of GO terms","volume":"23","author":"Wang","year":"2007","journal-title":"Bioinformatics"},{"key":"2023021615070613300_","first-page":"266","author":"White","year":"2003"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e90","DOI":"10.1002\/cpz1.90","article-title":"Gene set knowledge discovery with Enrichr","volume":"1","author":"Xie","year":"2021","journal-title":"Curr. Protoc"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"2800","DOI":"10.1093\/bioinformatics\/btl467","article-title":"Discovering disease-genes by topological features in human protein\u2013protein interaction network","volume":"22","author":"Xu","year":"2006","journal-title":"Bioinformatics"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12889-015-1355-8","article-title":"Association between alcohol consumption and the risk of ovarian cancer: a meta-analysis of prospective observational studies","volume":"15","author":"Yan-Hong","year":"2015","journal-title":"BMC Public Health"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"2640","DOI":"10.1093\/bioinformatics\/bts504","article-title":"Positive-unlabeled learning for disease gene identification","volume":"28","author":"Yang","year":"2012","journal-title":"Bioinformatics"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"e97079","DOI":"10.1371\/journal.pone.0097079","article-title":"Ensemble positive unlabeled learning for disease gene identification","volume":"9","author":"Yang","year":"2014","journal-title":"PLoS ONE"},{"key":"2023021615070613300_","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1002\/mrdd.20163","article-title":"Alzheimer\u2019s disease in down syndrome: neurobiology and risk","volume":"13","author":"Zigman","year":"2007","journal-title":"Ment. Retard. Dev. Disabil. Res. Rev"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac848\/49047340\/btac848.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/2\/btac848\/49228305\/btac848.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/2\/btac848\/49228305\/btac848.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,23]],"date-time":"2023-03-23T18:20:14Z","timestamp":1679595614000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btac848\/7023926"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2023,2,1]]},"references-count":63,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,2,3]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac848","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,2,1]]},"published":{"date-parts":[[2023,2,1]]},"article-number":"btac848"}}