{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:36:48Z","timestamp":1772138208946,"version":"3.50.1"},"reference-count":32,"publisher":"Oxford University Press (OUP)","license":[{"start":{"date-parts":[[2025,1,18]],"date-time":"2025-01-18T00:00:00Z","timestamp":1737158400000},"content-version":"vor","delay-in-days":17,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004191","name":"Novo Nordisk","doi-asserted-by":"publisher","award":["NFF17OC0027594 NNF14CC0001"],"award-info":[{"award-number":["NFF17OC0027594 NNF14CC0001"]}],"id":[{"id":"10.13039\/501100004191","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Marie Sklodowska-Curie","award":["101023676"],"award-info":[{"award-number":["101023676"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,1,13]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Lifestyle factors (LSFs) are increasingly recognized as instrumental in both the development and control of diseases. Despite their importance, there is a lack of methods to extract relations between LSFs and diseases from the literature, a step necessary to consolidate the currently available knowledge into a structured form. 
As simple co-occurrence-based relation extraction (RE) approaches are unable to distinguish between the different types of LSF-disease relations, context-aware models such as transformers are required to extract and classify these relations into specific relation types. However, no comprehensive LSF\u2013disease RE system existed, nor a corpus suitable for developing one. We present LSD600 (available at https:\/\/zenodo.org\/records\/13952449), the first corpus specifically designed for LSF\u2013disease RE, comprising 600 abstracts with 1900 relations of eight distinct types between 5027 diseases and 6930 LSF entities. We evaluated LSD600\u2019s quality by training a RoBERTa model on the corpus, achieving an F-score of 68.5% for the multilabel RE task on the held-out test set. We further validated LSD600 by using the trained model on the two Nutrition-Disease and FoodDisease datasets, where it achieved F-scores of 70.7% and 80.7%, respectively. Building on these performance results, LSD600 and the RE system trained on it can be valuable resources to fill the existing gap in this area and pave the way for downstream applications.<\/jats:p>\n                  <jats:p>Database URL: https:\/\/zenodo.org\/records\/13952449<\/jats:p>","DOI":"10.1093\/database\/baae129","type":"journal-article","created":{"date-parts":[[2024,12,9]],"date-time":"2024-12-09T15:32:31Z","timestamp":1733758351000},"source":"Crossref","is-referenced-by-count":0,"title":["LSD600: the first corpus of biomedical abstracts annotated with lifestyle\u2013disease relations"],"prefix":"10.1093","volume":"2025","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1933-2550","authenticated-orcid":false,"given":"Esmaeil","family":"Nourani","sequence":"first","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen , Blegdamsvej 3, Copenhagen 2200,","place":["Denmark"]},{"name":"Faculty of Information Technology and Computer 
Engineering, Azarbaijan Shahid Madani University , Tabriz,","place":["Iran"]}]},{"given":"Evangelia-Mantelena","family":"Makri","sequence":"additional","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen , Blegdamsvej 3, Copenhagen 2200,","place":["Denmark"]},{"name":"Department of Nutrition and Dietetics, Harokopio University , Athens 17676, Attiki,","place":["Greece"]}]},{"given":"Xiqing","family":"Mao","sequence":"additional","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen , Blegdamsvej 3, Copenhagen 2200,","place":["Denmark"]}]},{"given":"Sampo","family":"Pyysalo","sequence":"additional","affiliation":[{"name":"TurkuNLP group, Department of Computing, Faculty of Technology, University of Turku , Turku 20014,","place":["Finland"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0316-5866","authenticated-orcid":false,"given":"S\u00f8ren","family":"Brunak","sequence":"additional","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen , Blegdamsvej 3, Copenhagen 2200,","place":["Denmark"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3611-5726","authenticated-orcid":false,"given":"Katerina","family":"Nastou","sequence":"additional","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen , Blegdamsvej 3, Copenhagen 2200,","place":["Denmark"]}]},{"given":"Lars Juhl","family":"Jensen","sequence":"additional","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen , Blegdamsvej 3, Copenhagen 
2200,","place":["Denmark"]}]}],"member":"286","published-online":{"date-parts":[[2025,1,17]]},"reference":[{"key":"2025092510515332900_R1","doi-asserted-by":"publisher","first-page":"817","DOI":"10.1093\/ije\/dyac238","article-title":"Lifestyle, genetic risk and incidence of cancer: a prospective cohort study of 13 cancer types","volume":"52","author":"Byrne","year":"2023","journal-title":"Int J Epidemiol"},{"key":"2025092510515332900_R2","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1038\/nrneurol.2016.187","article-title":"Interactions between genetic, lifestyle and environmental risk factors for multiple sclerosis","volume":"13","author":"Olsson","year":"2017","journal-title":"Nat Rev Neurol"},{"key":"2025092510515332900_R3","doi-asserted-by":"publisher","DOI":"10.14309\/ajg.0000000000002180","article-title":"The contribution of genetic risk and lifestyle factors in the development of adult-onset inflammatory bowel disease: a prospective cohort study","volume":"118","author":"Sun","year":"2023","journal-title":"Official J Am Coll Gastroenterol | ACG"},{"key":"2025092510515332900_R4","doi-asserted-by":"publisher","DOI":"10.1007\/s11886-019-1177-x","article-title":"Contributions of interactions between lifestyle and genetics on coronary artery disease risk","volume":"21","author":"Said","year":"2019","journal-title":"Curr Cardiol Rep"},{"key":"2025092510515332900_R5","doi-asserted-by":"publisher","DOI":"10.1186\/s12916-017-0938-x","article-title":"Lifestyle precision medicine: the next generation in type 2 diabetes prevention?","volume":"15","author":"Mutie","year":"2017","journal-title":"BMC Med"},{"key":"2025092510515332900_R6","doi-asserted-by":"publisher","DOI":"10.3390\/nu16050581","article-title":"Precision nutrition unveiled: gene\u2013nutrient interactions, microbiota dynamics, and lifestyle factors in obesity 
management","volume":"16","author":"Mansour","year":"2024","journal-title":"Nutrients"},{"key":"2025092510515332900_R7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/JBHI.2023.3338356","article-title":"KG4NH: a comprehensive knowledge graph for question answering in dietary nutrition and human health","author":"Fu","journal-title":"IEEE J. Biomed. Health Inform"},{"key":"2025092510515332900_R8","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2023.104460","article-title":"GENA: A knowledge graph for nutrition and mental health","volume":"145","author":"Dang","year":"2023","journal-title":"J Biomed Inform"},{"key":"2025092510515332900_R9","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-023-34981-4","article-title":"From language models to large-scale food and biomedical knowledge graphs","volume":"13","author":"Cenikj","year":"2023","journal-title":"Sci Rep"},{"key":"2025092510515332900_R10","volume":"2022","author":"Grissa","year":"2022","journal-title":"Diseases 2.0: a weekly updated database of disease\u2013gene associations from text mining and data integration"},{"key":"2025092510515332900_R11","first-page":"636","article-title":"Electromagnetic field induced biological effects in humans","volume":"72","author":"Kaszuba-Zwoi\u0144ska","year":"2015","journal-title":"Przegl Lek"},{"key":"2025092510515332900_R12","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1016\/j.soncn.2016.05.005","article-title":"Ultraviolet radiation exposure and its impact on skin cancer risk","volume":"32","author":"Watson","year":"2016","journal-title":"Semin Oncol Nurs"},{"key":"2025092510515332900_R13","doi-asserted-by":"publisher","first-page":"473","DOI":"10.5604\/17322693.1101572","article-title":"Applications of electromagnetic radiation in medicine","volume":"68","author":"Mi\u0142owska","year":"2014","journal-title":"Postepy Hig Med Dosw"},{"key":"2025092510515332900_R14","article-title":"BERT: pre-training of deep bidirectional transformers 
for language understanding","author":"Devlin","year":"2018"},{"key":"2025092510515332900_R15","article-title":"RoBERTa: a robustly optimized BERT pretraining approach","author":"Liu","year":"2019"},{"key":"2025092510515332900_R16","article-title":"Clinical relation extraction using transformer-based models","author":"Yang","year":"2021"},{"key":"2025092510515332900_R17","doi-asserted-by":"publisher","DOI":"10.1093\/nargab\/lqab062","article-title":"RENET2: high-performance full-text gene\u2013disease relation extraction with iterative training data expansion","volume":"3","author":"Su","year":"2021","journal-title":"NAR Genomics Bioinform"},{"key":"2025092510515332900_R18","doi-asserted-by":"publisher","first-page":"408","DOI":"10.1093\/bioinformatics\/btq667","article-title":"Toward an automatic method for extracting cancer- and other disease-related point mutations from the biomedical literature","volume":"27","author":"Doughty","year":"2011","journal-title":"Bioinformatics"},{"key":"2025092510515332900_R19","volume-title":"BioCreative V CDR Task Corpus: a resource for chemical disease relation extraction","author":"Li","year":"2016"},{"key":"2025092510515332900_R20","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btae613","article-title":"Lifestyle factors in the biomedical literature: an ontology and comprehensive resources for named entity recognition","volume":"40","author":"Nourani","year":"2024","journal-title":"Bioinformatics"},{"key":"2025092510515332900_R21","doi-asserted-by":"publisher","first-page":"D955","DOI":"10.1093\/nar\/gky1032","article-title":"Human disease Ontology 2018 update: classification, content and workflow expansion","volume":"47","author":"Schriml","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2025092510515332900_R22","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1080\/13506129.2019.1603143","article-title":"AmyCo: the amyloidoses 
collection","volume":"26","author":"Nastou","year":"2019","journal-title":"Amyloid"},{"key":"2025092510515332900_R23","first-page":"1","article-title":"Overview of BioNLP\u201909 shared task on event extraction","author":"Kim","year":"2009"},{"key":"2025092510515332900_R24","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1093\/bib\/bbz130","article-title":"An extensive review of tools for manual annotation of documents","volume":"22","author":"Neves","year":"2021","journal-title":"Briefings Bioinform"},{"key":"2025092510515332900_R25","first-page":"102","article-title":"Brat: a Web-based Tool for NLP-Assisted Text Annotation","author":"Stenetorp","year":"2012"},{"key":"2025092510515332900_R26","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btae552","article-title":"STRING-ing together protein complexes: corpus and methods for extracting physical protein interactions from the biomedical literature","volume":"40","author":"Mehryary","year":"2024","journal-title":"Bioinformatics"},{"key":"2025092510515332900_R27","doi-asserted-by":"publisher","DOI":"10.1093\/database\/baae095","article-title":"RegulaTome: a corpus of typed, directed, and signed relations between biomedical entities in the scientific 
literature","volume":"2024","author":"Nastou","year":"2024","journal-title":"Database"},{"key":"2025092510515332900_R28","doi-asserted-by":"crossref","DOI":"10.3115\/1572340.1572343","article-title":"Extracting complex biological events with rich graph-based feature sets","author":"Bj\u00f6rne","year":"2009"},{"key":"2025092510515332900_R29","doi-asserted-by":"publisher","DOI":"10.1093\/database\/baad080","article-title":"Overview of drugprot task at biocreative VII: data and methods for large-scale text mining and knowledge graph generation of heterogenous chemical\u2013protein relations","volume":"2023","author":"Miranda-Escalada","year":"2023","journal-title":"Database"},{"key":"2025092510515332900_R30","volume":"2018","author":"Mehryary","year":"2018","journal-title":"Potent pairing: ensemble of long short-term memory networks and support vector machine for chemical-protein relation extraction"},{"key":"2025092510515332900_R31","first-page":"764","article-title":"DocRED: a large-scale document-level relation extraction dataset","author":"Yao","year":"2019"},{"key":"2025092510515332900_R32","doi-asserted-by":"publisher","DOI":"10.1093\/database\/baae039","article-title":"DUVEL: an active-learning annotated biomedical corpus for the recognition of oligogenic 
combinations","volume":"2024","author":"Nachtegael","year":"2024","journal-title":"Database"}],"container-title":["Database"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baae129\/61489733\/baae129.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baae129\/61489733\/baae129.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,25]],"date-time":"2025-09-25T14:52:00Z","timestamp":1758811920000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baae129\/7959822"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025]]},"references-count":32,"URL":"https:\/\/doi.org\/10.1093\/database\/baae129","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.08.30.24312862","asserted-by":"object"}]},"ISSN":["1758-0463"],"issn-type":[{"value":"1758-0463","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025]]},"published":{"date-parts":[[2025]]},"article-number":"baae129"}}