{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T01:55:38Z","timestamp":1775872538275,"version":"3.50.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","license":[{"start":{"date-parts":[[2022,3,28]],"date-time":"2022-03-28T00:00:00Z","timestamp":1648425600000},"content-version":"vor","delay-in-days":86,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,3,28]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The scientific knowledge about which genes are involved in which diseases grows rapidly, which makes it difficult to keep up with new publications and genetics datasets. The DISEASES database aims to provide a comprehensive overview by systematically integrating and assigning confidence scores to evidence for disease\u2013gene associations from curated databases, genome-wide association studies (GWAS) and automatic text mining of the biomedical literature. Here, we present a major update to this resource, which greatly increases the number of associations from all these sources. This is especially true for the text-mined associations, which have increased by at least 9-fold at all confidence cutoffs. We show that this dramatic increase is primarily due to adding full-text articles to the text corpus, secondarily due to improvements to both the disease and gene dictionaries used for named entity recognition, and only to a very small extent due to the growth in number of PubMed abstracts. DISEASES now also makes use of a new GWAS database, Target Illumination by GWAS Analytics, which considerably increased the number of GWAS-derived disease\u2013gene associations. DISEASES itself is also integrated into several other databases and resources, including GeneCards\/MalaCards, Pharos\/Target Central Resource Database and the Cytoscape stringApp. All data in DISEASES are updated on a weekly basis and is available via a web interface at https:\/\/diseases.jensenlab.org, from where it can also be downloaded under open licenses.<\/jats:p>\n                  <jats:p>Database URL: https:\/\/diseases.jensenlab.org<\/jats:p>","DOI":"10.1093\/database\/baac019","type":"journal-article","created":{"date-parts":[[2022,3,11]],"date-time":"2022-03-11T15:10:03Z","timestamp":1647011403000},"source":"Crossref","is-referenced-by-count":120,"title":["Diseases 2.0: a weekly updated database of disease\u2013gene associations from text mining and data integration"],"prefix":"10.1093","volume":"2022","author":[{"given":"Dhouha","family":"Grissa","sequence":"first","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen 2200, Denmark"}]},{"given":"Alexander","family":"Junge","sequence":"additional","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen 2200, Denmark"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6195-6976","authenticated-orcid":false,"given":"Tudor I","family":"Oprea","sequence":"additional","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen 2200, Denmark"},{"name":"Department of Internal Medicine, Division of Translational Informatics, University of New Mexico Health Sciences Center, Albuquerque, NM, USA"}]},{"given":"Lars Juhl","family":"Jensen","sequence":"additional","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen 2200, Denmark"}]}],"member":"286","published-online":{"date-parts":[[2022,3,24]]},"reference":[{"key":"2022032914104401800_R1","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1016\/j.ymeth.2014.11.020","article-title":"DISEASES: text mining and data integration of disease-gene associations","volume":"74","author":"Pletscher-Frankild","year":"2015","journal-title":"Methods"},{"key":"2022032914104401800_R2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pcbi.1005962","article-title":"A comprehensive and quantitative comparison of text-mining in 15 million full-text articles versus their corresponding abstracts","volume":"14","author":"Westergaard","year":"2018","journal-title":"PLoS Comput. Biol."},{"key":"2022032914104401800_R3","doi-asserted-by":"publisher","first-page":"3533","DOI":"10.1093\/bioinformatics\/btz070","article-title":"PMC text mining subset in BioC: about three million full-text articles and growing","volume":"35","author":"Comeau","year":"2019","journal-title":"Bioinformatics"},{"key":"2022032914104401800_R4","doi-asserted-by":"publisher","DOI":"10.3389\/fphar.2020.602030","article-title":"A novel text-mining approach for retrieving pharmacogenomics associations from the literature","volume":"11","author":"Pandi","year":"2020","journal-title":"Front. Pharmacol."},{"key":"2022032914104401800_R5","doi-asserted-by":"publisher","DOI":"10.3389\/fmicb.2015.01386","article-title":"Literature mining and ontology based analysis of host-Brucella gene\u2013gene interaction network","volume":"6","author":"Karadeniz","year":"2015","journal-title":"Front. Microbiol."},{"key":"2022032914104401800_R6","doi-asserted-by":"publisher","DOI":"10.2196\/28247","article-title":"A novel metric to quantify the effect of pathway enrichment evaluation with respect to biomedical text-mined terms: development and feasibility study","volume":"9","author":"Qin","year":"2021","journal-title":"JMIR Med. Inform."},{"key":"2022032914104401800_R7","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1007\/978-981-10-1503-8_7","article-title":"Text mining for precision medicine: bringing structure to EHRs and biomedical literature to understand genes and health","volume":"939","author":"Simmons","year":"2016","journal-title":"Adv. Exp. Med. Biol."},{"key":"2022032914104401800_R8","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-018-2048-y","article-title":"The research on gene-disease association based on text-mining of PubMed","volume":"19","author":"Zhou","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2022032914104401800_R9","first-page":"pp. 135","volume-title":"Mining Biological Networks from Full-Text Articles","author":"Czarnecki","year":"2014"},{"key":"2022032914104401800_R10","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1038\/ng0501-21","article-title":"A literature network of human genes for high-throughput analysis of gene expression","volume":"28","author":"Jenssen","year":"2001","journal-title":"Nat. Genet."},{"key":"2022032914104401800_R11","doi-asserted-by":"publisher","first-page":"2559","DOI":"10.1093\/bioinformatics\/btn469","article-title":"FACTA: a text search engine for finding associated biomedical concepts","volume":"24","author":"Tsuruoka","year":"2008","journal-title":"Bioinformatics"},{"key":"2022032914104401800_R12","first-page":"D158","article-title":"UniProt: the universal protein knowledgebase","volume":"46","author":"The UniProt Consortium","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R13","doi-asserted-by":"publisher","first-page":"D1038","DOI":"10.1093\/nar\/gky1151","article-title":"OMIM.org: leveraging knowledge across phenotype\u2013gene relationships","volume":"47","author":"Amberger","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R14","first-page":"274","article-title":"Genetics home reference: helping patients understand the role of genetics in health and disease","volume":"9","author":"Fomous","year":"2006","journal-title":"Community Genet."},{"key":"2022032914104401800_R15","doi-asserted-by":"publisher","DOI":"10.1002\/0471142905.hg1011s57","article-title":"The Catalogue of Somatic Mutations in Cancer (COSMIC)","author":"Forbes","year":"2008","journal-title":"Curr. Protoc. Hum. Genet"},{"key":"2022032914104401800_R16","doi-asserted-by":"publisher","first-page":"555","DOI":"10.1038\/s41568-020-0290-x","article-title":"A compendium of mutational cancer driver genes","volume":"20","author":"Mart\u00ednez-Jim\u00e9nez","year":"2020","journal-title":"Nat. Rev. Cancer"},{"key":"2022032914104401800_R17","doi-asserted-by":"publisher","DOI":"10.1093\/database\/baw100","article-title":"The Harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins","author":"Rouillard","year":"2016","journal-title":"Database"},{"key":"2022032914104401800_R18","first-page":"D933","article-title":"GWAS Central: a comprehensive resource for the discovery and comparison of genotype and phenotype data from genome-wide association studies","volume":"48","author":"Beck","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R19","doi-asserted-by":"publisher","first-page":"D869","DOI":"10.1093\/nar\/gkv1317","article-title":"GWASdb v2: an update database for human genetic variants identified by genome-wide association studies","volume":"44","author":"Li","year":"2015","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R20","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1038\/nrg2554","article-title":"Human genetic variation and its contribution to complex traits","volume":"10","author":"Frazer","year":"2009","journal-title":"Nat. Rev. Genet."},{"key":"2022032914104401800_R21","doi-asserted-by":"publisher","first-page":"D1036","DOI":"10.1093\/nar\/gkr899","article-title":"DistiLD database: diseases and traits in linkage disequilibrium blocks","volume":"40","author":"Pallej\u00e0","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R22","doi-asserted-by":"publisher","first-page":"3865","DOI":"10.1093\/bioinformatics\/btab427","article-title":"TIGA: target illumination GWAS analytics","volume":"37","author":"Yang","year":"2021","journal-title":"Bioinformatics"},{"key":"2022032914104401800_R23","doi-asserted-by":"publisher","first-page":"D877","DOI":"10.1093\/nar\/gkw1012","article-title":"MalaCards: an amalgamated human disease compendium with diverse clinical and genetic annotation and structured search","volume":"45","author":"Rappaport","year":"2016","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R24","first-page":"D845","article-title":"The DisGeNET knowledge platform for disease genomics: 2019 update","volume":"48","author":"Pi\u00f1ero","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R25","doi-asserted-by":"publisher","first-page":"D1334","DOI":"10.1093\/nar\/gkaa993","article-title":"TCRD and Pharos 2021: mining the human proteome for disease biology","volume":"49","author":"Sheils","year":"2020","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R26","doi-asserted-by":"publisher","first-page":"D1302","DOI":"10.1093\/nar\/gkaa1027","article-title":"Open Targets Platform: supporting systematic drug\u2013target identification and prioritisation","volume":"49","author":"Ochoa","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R27","doi-asserted-by":"publisher","first-page":"D607","DOI":"10.1093\/nar\/gky1131","article-title":"STRING v11: protein\u2013protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets","volume":"47","author":"Szklarczyk","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R28","doi-asserted-by":"publisher","first-page":"D955","DOI":"10.1093\/nar\/gky1032","article-title":"Human Disease Ontology 2018 update: classification, content and workflow expansion","volume":"47","author":"Schriml","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R29","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1080\/13506129.2019.1603143","article-title":"AmyCo: the amyloidoses collection","volume":"26","author":"Nastou","year":"2019","journal-title":"Amyloid"},{"key":"2022032914104401800_R30","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pbio.1002541","article-title":"Relative Citation Ratio (RCR): a new metric that uses citation rates to measure influence at the article level","volume":"14","author":"Hutchins","year":"2016","journal-title":"PLoS Biol."},{"key":"2022032914104401800_R31","article-title":"BioC and simplified use of the PMC open access dataset for biomedical text mining","author":"Do\u01e7an","year":"2014"},{"key":"2022032914104401800_R32","doi-asserted-by":"crossref","DOI":"10.1126\/science.abb4930","article-title":"A single \u2018paper mill\u2019 appears to have churned out 400 papers, sleuths find","author":"Chawla","year":"2020","journal-title":"Science"},{"key":"2022032914104401800_R33","author":"Joulin","year":"2017","journal-title":"Bag of Tricks for Efficient Text Classification"},{"key":"2022032914104401800_R34","doi-asserted-by":"publisher","first-page":"D48","DOI":"10.1093\/nar\/gks1236","article-title":"Ensembl 2013","volume":"41","author":"Flicek","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R35","doi-asserted-by":"publisher","first-page":"D545","DOI":"10.1093\/nar\/gks1066","article-title":"Genenames.org: the HGNC resources in 2013","volume":"41","author":"Gray","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R36","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0065390","article-title":"The SPECIES and ORGANISMS resources for fast and accurate identification of taxonomic names in text","volume":"8","author":"Pafilis","year":"2013","journal-title":"PLoS One"},{"key":"2022032914104401800_R37","doi-asserted-by":"publisher","first-page":"516","DOI":"10.1038\/d41586-021-00733-5","article-title":"The fight against fake-paper factories that churn out sham science","volume":"591","author":"Else","year":"2021","journal-title":"Nature"},{"key":"2022032914104401800_R38","doi-asserted-by":"crossref","first-page":"1.30.1","DOI":"10.1002\/cpbi.5","article-title":"The GeneCards suite: from gene data mining to disease genome sequence analyses","volume":"54","author":"Stelzer","year":"2016","journal-title":"Curr. Protoc. Bioinform."},{"key":"2022032914104401800_R39","doi-asserted-by":"publisher","first-page":"2601","DOI":"10.1093\/bioinformatics\/btx200","article-title":"TIN-X: target importance and novelty explorer","volume":"33","author":"Cannon","year":"2017","journal-title":"Bioinformatics"},{"key":"2022032914104401800_R40","doi-asserted-by":"publisher","first-page":"W571","DOI":"10.1093\/nar\/gkz393","article-title":"Geneshot: search engine for ranking genes from arbitrary text queries","volume":"47","author":"Lachmann","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"2022032914104401800_R41","doi-asserted-by":"publisher","DOI":"10.1093\/database\/baw100","article-title":"The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins","volume":"2016","author":"Rouillard","year":"2016","journal-title":"Database"},{"key":"2022032914104401800_R42","doi-asserted-by":"publisher","first-page":"623","DOI":"10.1021\/acs.jproteome.8b00702","article-title":"Cytoscape StringApp: network analysis and visualization of proteomics data","volume":"18","author":"Doncheva","year":"2019","journal-title":"J. Proteome Res."}],"container-title":["Database"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baac019\/43083556\/baac019.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baac019\/43083556\/baac019.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T10:12:35Z","timestamp":1648548755000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baac019\/6554833"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,1]]},"references-count":42,"URL":"https:\/\/doi.org\/10.1093\/database\/baac019","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.12.07.471296","asserted-by":"object"}]},"ISSN":["1758-0463"],"issn-type":[{"value":"1758-0463","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,1,1]]},"published":{"date-parts":[[2022,1,1]]},"article-number":"baac019"}}