{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T12:10:32Z","timestamp":1775736632913,"version":"3.50.1"},"reference-count":46,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T00:00:00Z","timestamp":1648512000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T00:00:00Z","timestamp":1648512000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003185","name":"Fraunhofer-Gesellschaft","doi-asserted-by":"publisher","award":["MAVO Project"],"award-info":[{"award-number":["MAVO Project"]}],"id":[{"id":"10.13039\/501100003185","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Knowl Inf Syst"],"published-print":{"date-parts":[[2022,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Contextual information is widely considered for NLP and knowledge discovery in life sciences since it highly influences the exact meaning of natural language. The scientific challenge is not only to extract such context data, but also to store this data for further query and discovery approaches. Classical approaches use RDF triple stores, which have serious limitations. Here, we propose a multiple step knowledge graph approach using labeled property graphs based on polyglot persistence systems to utilize context data for context mining, graph queries, knowledge discovery and extraction. We introduce the graph-theoretic foundation for a general context concept within semantic networks and show a proof of concept based on biomedical literature and text mining. Our test system contains a knowledge graph derived from the entirety of PubMed and SCAIView data and is enriched with text mining data and domain-specific language data using Biological Expression Language. Here, context is a more general concept than annotations. This dense graph has more than 71M nodes and 850M relationships. We discuss the impact of this novel approach with 27 real-world use cases represented by graph queries. Storing and querying a giant knowledge graph as a labeled property graph is still a technological challenge. Here, we demonstrate how our data model is able to support the understanding and interpretation of biomedical data. We present several real-world use cases that utilize our massive, generated knowledge graph derived from PubMed data and enriched with additional contextual data. Finally, we show a working example in context of biologically relevant information using SCAIView.<\/jats:p>","DOI":"10.1007\/s10115-022-01668-7","type":"journal-article","created":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T15:34:03Z","timestamp":1648568043000},"page":"1239-1262","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Context mining and graph queries on giant biomedical knowledge graphs"],"prefix":"10.1007","volume":"64","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0245-7752","authenticated-orcid":false,"given":"Jens","family":"D\u00f6rpinghaus","sequence":"first","affiliation":[]},{"given":"Andreas","family":"Stefan","sequence":"additional","affiliation":[]},{"given":"Bruce","family":"Schultz","sequence":"additional","affiliation":[]},{"given":"Marc","family":"Jacobs","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,3,29]]},"reference":[{"key":"1668_CR1","doi-asserted-by":"crossref","unstructured":"Desai M, Mehta RG, Rana DP (2018) Issues and challenges in big graph modelling for smart city: an extensive survey. Int J Comput Intell IoT 1(1)","DOI":"10.1109\/CCAA.2018.8777454"},{"key":"1668_CR2","unstructured":"Dumontier M, Callahan A, Cruz-Toledo J, Ansell P, Emonet V, Belleau F, Droit A (2014) Bio2rdf release 3: a larger connected network of linked data for the life sciences. In: Proceedings of the 2014 international conference on posters and demonstrations track, vol 1272, pp 401\u2013404"},{"key":"1668_CR3","doi-asserted-by":"crossref","unstructured":"Callahan A, Cruz-Toledo J, Ansell P, Dumontier M (2013) Bio2rdf release 2: improved coverage, interoperability and provenance of life science linked data. In: Extended semantic web conference, pp 200\u2013212","DOI":"10.1007\/978-3-642-38288-8_14"},{"key":"1668_CR4","unstructured":"Li S, Xin L (2014) Research on integration and sharing of scientific data based on linked data-a case study of bio2rdf. Res Library Sci 21"},{"key":"1668_CR5","unstructured":"Natsiavas P, Koutkias V, Maglaveras N (2015) Exploring the capacity of open, linked data sources to assess adverse drug reaction signals. In: SWAT4LS, pp 224\u2013226"},{"key":"1668_CR6","doi-asserted-by":"crossref","unstructured":"Aggarwal CC, Zhai C (2012) An introduction to text mining. In: Mining text data. Springer, Berlin, pp 1\u201310","DOI":"10.1007\/978-1-4614-3223-4_1"},{"key":"1668_CR7","doi-asserted-by":"crossref","unstructured":"D\u00f6rpinghaus J, Stefan A (2019) Knowledge extraction and applications utilizing context data in knowledge graphs. In: 2019 Federated conference on computer science and information systems (FedCSIS). IEEE, pp 265\u2013272","DOI":"10.15439\/2019F3"},{"key":"1668_CR8","doi-asserted-by":"crossref","unstructured":"Hanisch D, Fundel K, Mevissen H-T, Zimmer R, Fluck J (2005) ProMiner: rule-based protein and gene entity recognition. BMC Bioinform 6 Suppl 1:14","DOI":"10.1186\/1471-2105-6-S1-S14"},{"key":"1668_CR9","unstructured":"Fluck J, Klenner A, Madan S, Ansari S, Bobic T, Hoeng J, Hofmann-Apitius M, Peitsch M (2013) Bel networks derived from qualitative translations of bionlp shared task annotations. In: Proceedings of the 2013 workshop on biomedical natural language processing, pp 80\u201388"},{"issue":"1","key":"1668_CR10","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1038\/75556","volume":"25","author":"M Ashburner","year":"2000","unstructured":"Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT et al (2000) Gene ontology: tool for the unification of biology. Nat Genet 25(1):25","journal-title":"Nat Genet"},{"key":"1668_CR11","doi-asserted-by":"crossref","unstructured":"Wishart DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, Sajed T, Johnson D, Li C, Sayeeda Z, et\u00a0al (2017) Drugbank 5.0: a major update to the drugbank database for 2018. Nucleic Acids Res 46(D1):1074\u20131082","DOI":"10.1093\/nar\/gkx1037"},{"key":"1668_CR12","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1016\/j.ecoenv.2018.10.060","volume":"168","author":"K Khan","year":"2019","unstructured":"Khan K, Benfenati E, Roy K (2019) Consensus qsar modeling of toxicity of pharmaceuticals to different aquatic organisms: ranking and prioritization of the drugbank database compounds. Ecotoxicol Environ Saf 168:287\u2013297","journal-title":"Ecotoxicol Environ Saf"},{"key":"1668_CR13","first-page":"1","volume":"26","author":"J Hey","year":"2004","unstructured":"Hey J (2004) The data, information, knowledge, wisdom chain: the metaphorical link. Intergovernmental Oceanographic Commiss 26:1\u201318","journal-title":"Intergovernmental Oceanographic Commiss"},{"issue":"1","key":"1668_CR14","doi-asserted-by":"publisher","first-page":"59","DOI":"10.3233\/HSM-1987-7108","volume":"7","author":"M Zeleny","year":"1987","unstructured":"Zeleny M (1987) Management support systems: towards integrated knowledge management. Hum Syst Manag 7(1):59\u201370","journal-title":"Hum Syst Manag"},{"issue":"1","key":"1668_CR15","first-page":"3","volume":"16","author":"RL Ackoff","year":"1989","unstructured":"Ackoff RL (1989) From data to wisdom. J Appl Syst Anal 16(1):3\u20139","journal-title":"J Appl Syst Anal"},{"issue":"2","key":"1668_CR16","doi-asserted-by":"publisher","first-page":"163","DOI":"10.1177\/0165551506070706","volume":"33","author":"J Rowley","year":"2007","unstructured":"Rowley J (2007) The wisdom hierarchy: representations of the DIKW hierarchy. J Inf Sci 33(2):163\u2013180","journal-title":"J Inf Sci"},{"key":"1668_CR17","unstructured":"D\u00f6rpinghaus J, Jacobs M (2019) Semantic knowledge graph embeddings for biomedical research: Data integration using linked open data. In: Posters and demo track of the 15th international conference on semantic systems. (Poster and Demo Track at SEMANTiCS 2019) (2451), 46\u201350"},{"key":"1668_CR18","doi-asserted-by":"crossref","unstructured":"D\u00f6rpinghaus J, Darms J, Jacobs M (2018) What was the question? A systematization of information retrieval and nlp problems. In: 2018 Federated conference on computer science and information systems (FedCSIS). IEEE","DOI":"10.15439\/2018F168"},{"key":"1668_CR19","unstructured":"D\u00f6rpinghaus J, Klein J, Darms J, Madan S, Jacobs M (2018) Scaiview: a semantic search engine for biomedical research utilizing a microservice architecture. In: Proceedings of the posters and demos track of the 14th international conference on semantic systems - SEMANTiCS2018"},{"key":"1668_CR20","unstructured":"Webber J, Eifrem E (2015) Graph databases"},{"key":"1668_CR21","first-page":"114","volume":"51","author":"FB Rogers","year":"1963","unstructured":"Rogers FB (1963) Medical subject headings. Bull Med Libr Assoc 51:114\u2013116","journal-title":"Bull Med Libr Assoc"},{"issue":"6","key":"1668_CR22","doi-asserted-by":"publisher","first-page":"1113","DOI":"10.3390\/ijerph15061113","volume":"15","author":"H Yang","year":"2018","unstructured":"Yang H, Lee H (2018) Research trend visualization by mesh terms from pubmed. Int J Environ Res Public Health 15(6):1113","journal-title":"Int J Environ Res Public Health"},{"key":"1668_CR23","unstructured":"Cyganiak R, Wood D, Lanthaler M (2014) RDF 1.1 concepts and abstract syntax. W3C recommendation, W3C (February 2014). http:\/\/www.w3.org\/TR\/2014\/REC-rdf11-concepts-20140225\/"},{"key":"1668_CR24","unstructured":"Patel-Schneider P, Rudolph S, Kr\u00f6tzsch M, Hitzler P, Parsia B (2012) OWL 2 web ontology language primer (second edition). Technical report, W3C (December 2012). http:\/\/www.w3.org\/TR\/2012\/REC-owl2-primer-20121211\/"},{"key":"1668_CR25","unstructured":"Summers E, Isaac A (2009) SKOS simple knowledge organization system primer. W3C note, W3C (August 2009). http:\/\/www.w3.org\/TR\/2009\/NOTE-skos-primer-20090818\/"},{"issue":"1","key":"1668_CR26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1002\/pra2.2015.14505201003","volume":"44","author":"M Zeng","year":"2007","unstructured":"Zeng M, Hlava M, Qin J, Hodge G, Bedford D (2007) Knowledge organization systems (kos) standards. Proc Assoc Inf Sci Technol 44(1):1\u20133","journal-title":"Proc Assoc Inf Sci Technol"},{"key":"1668_CR27","unstructured":"Guidelines for the construction (2005) format, and management of monolingual controlled vocabularies. Standard, National Information Standards Organization, Baltimore, Maryland, USA"},{"key":"1668_CR28","doi-asserted-by":"crossref","unstructured":"Zeng M (2008) Knowledge organization systems (kos) 35:160\u2013182","DOI":"10.5771\/0943-7444-2008-2-3-160"},{"issue":"2","key":"1668_CR29","doi-asserted-by":"publisher","first-page":"238","DOI":"10.1016\/j.jalz.2013.02.009","volume":"10","author":"A Malhotra","year":"2014","unstructured":"Malhotra A, Younesi E, G\u00fcndel M, M\u00fcller B, Heneka MT, Hofmann-Apitius M (2014) Ado: a disease ontology representing the domain knowledge specific to Alzheimer\u2019s disease. Alzheimer\u2019s Dementia 10(2):238\u2013246","journal-title":"Alzheimer\u2019s Dementia"},{"issue":"4","key":"1668_CR30","doi-asserted-by":"publisher","first-page":"1153","DOI":"10.3233\/JAD-161148","volume":"59","author":"A Iyappan","year":"2017","unstructured":"Iyappan A, Younesi E, Redolfi A, Vrooman H, Khanna S, Frisoni GB, Hofmann-Apitius M (2017) Neuroimaging feature terminology: a controlled terminology for the annotation of brain imaging features. J. Alzheimers Dis. 59(4):1153\u20131169","journal-title":"J. Alzheimers Dis."},{"key":"1668_CR31","doi-asserted-by":"publisher","unstructured":"Madan S, Fiosins M, Bonn S, Fluck J (2018). A semantic data integration methodology for translational neurodegenerative disease research. https:\/\/doi.org\/10.6084\/m9.figshare.7339244.v1","DOI":"10.6084\/m9.figshare.7339244.v1"},{"key":"1668_CR32","unstructured":"Vo\u00df J (2016) Classification of knowledge organization systems with wikidata. In: NKOS@ TPDL, pp 15\u201322"},{"key":"1668_CR33","unstructured":"Vrande\u010di\u0107 D (2018) Toward an abstract Wikipedia. In: Ortiz M, Schneider T (eds) 31st International workshop on description logics (DL). CEUR workshop proceedings, Aachen"},{"key":"1668_CR34","doi-asserted-by":"crossref","unstructured":"O\u00dfwald A, Sch\u00f6pfel J, Jacquemin B (2015) Continuing professional education in open access. a French-German survey. LIBER Quarterly. J Assoc Eur Res Libraries 26(2):43\u201366","DOI":"10.18352\/lq.10158"},{"issue":"1","key":"1668_CR35","doi-asserted-by":"publisher","first-page":"6193","DOI":"10.1038\/s41598-018-24571-0","volume":"8","author":"A Volanakis","year":"2018","unstructured":"Volanakis A, Krawczyk K (2018) Sciride finder: a citation-based paradigm in biomedical literature search. Sci Rep 8(1):6193","journal-title":"Sci Rep"},{"key":"1668_CR36","doi-asserted-by":"crossref","unstructured":"Madan S, Hodapp S, Senger P, Ansari S, Szostak J, Hoeng J, Peitsch M, Fluck J (2016) The BEL information extraction workflow (BELIEF): evaluation in the BioCreative V BEL and IAT track. Database 2016","DOI":"10.1093\/database\/baw136"},{"key":"1668_CR37","unstructured":"Madan S, Szostak J, D\u00f6rpinghaus J, Hoeng J, Fluck J (2017) Overview of BEL track: extraction of complex relationships and their conversion to BEL. In: Proceedings of the BioCreative VI workshop (2017)"},{"issue":"1","key":"1668_CR38","doi-asserted-by":"publisher","first-page":"50","DOI":"10.1145\/2206869.2206879","volume":"41","author":"PT Wood","year":"2012","unstructured":"Wood PT (2012) Query languages for graph databases. SIGMOD Rec 41(1):50\u201360. https:\/\/doi.org\/10.1145\/2206869.2206879","journal-title":"SIGMOD Rec"},{"issue":"5","key":"1668_CR39","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1145\/3104031","volume":"50","author":"R Angles","year":"2017","unstructured":"Angles R, Arenas M, Barcel\u00f3 P, Hogan A, Reutter J, Vrgo\u010d D (2017) Foundations of modern query languages for graph databases. ACM Comput Surv 50(5):68\u201316840. https:\/\/doi.org\/10.1145\/3104031","journal-title":"ACM Comput Surv"},{"issue":"1","key":"1668_CR40","doi-asserted-by":"publisher","first-page":"383","DOI":"10.1007\/s11192-018-2960-y","volume":"118","author":"J Kim","year":"2019","unstructured":"Kim J (2019) Correction to: Evaluating author name disambiguation for digital libraries: a case of dblp. Scientometrics 118(1):383\u2013383","journal-title":"Scientometrics"},{"key":"1668_CR41","unstructured":"Franzoni V, Lepri M, Milani A (2019) Topological and semantic graph-based author disambiguation on dblp data in neo4j. arXiv preprint arXiv:1901.08977"},{"issue":"1","key":"1668_CR42","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1075\/li.30.1.03nad","volume":"30","author":"D Nadeau","year":"2007","unstructured":"Nadeau D, Sekine S (2007) A survey of named entity recognition and classification. Lingvisticae Investigationes 30(1):3\u201326","journal-title":"Lingvisticae Investigationes"},{"key":"1668_CR43","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1016\/j.neucom.2018.10.055","volume":"329","author":"D Cai","year":"2019","unstructured":"Cai D, Wu G (2019) Content-aware attributed entity embedding for synonymous named entity discovery. Neurocomputing 329:237\u2013247","journal-title":"Neurocomputing"},{"key":"1668_CR44","doi-asserted-by":"crossref","unstructured":"Prajapati P, Sivakumar P (2019) Context dependency relation extraction using modified evolutionary algorithm based on web mining. In: Emerging technologies in data mining and information security. Springer, G\u00f6ttingen, pp 259\u2013267","DOI":"10.1007\/978-981-13-1498-8_23"},{"key":"1668_CR45","doi-asserted-by":"crossref","unstructured":"Cook SA (1971) The complexity of theorem-proving procedures. In: Proceedings of the third annual ACM symposium on theory of computing, pp 151\u2013158 (1971). ACM","DOI":"10.1145\/800157.805047"},{"key":"1668_CR46","doi-asserted-by":"crossref","unstructured":"Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten J-W, da Silva\u00a0Santos LB, Bourne PE, et al (2016) The fair guiding principles for scientific data management and stewardship. Sci Data 3","DOI":"10.1038\/sdata.2016.18"}],"container-title":["Knowledge and Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10115-022-01668-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10115-022-01668-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10115-022-01668-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,21]],"date-time":"2024-09-21T06:05:55Z","timestamp":1726898755000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10115-022-01668-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,29]]},"references-count":46,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,5]]}},"alternative-id":["1668"],"URL":"https:\/\/doi.org\/10.1007\/s10115-022-01668-7","relation":{},"ISSN":["0219-1377","0219-3116"],"issn-type":[{"value":"0219-1377","type":"print"},{"value":"0219-3116","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,29]]},"assertion":[{"value":"9 May 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 February 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 February 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 March 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 August 2022","order":5,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Update","order":6,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Missing Open Access funding information has been added in the Funding Note.","order":7,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"Not applicable. The knowledge graph is available on request; the fundamental data is available using SCAIView.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Availability of data and materials"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}},{"value":"This study was funded by Fraunhofer Society under the MAVO Project; Human Brain Pharmacome; EU\/EFPIA Innovative Medicines Initiative Joint Undertaking under AETIONOMY (115568 to D.D.F.); European Union\u2019s Seventh Framework Programme (FP7\/2007-2013); and EFPIA. Open Access funding enabled and organized by Projekt DEAL.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Funding"}}]}}