{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:36:49Z","timestamp":1772138209635,"version":"3.50.1"},"reference-count":57,"publisher":"Oxford University Press (OUP)","license":[{"start":{"date-parts":[[2019,11,7]],"date-time":"2019-11-07T00:00:00Z","timestamp":1573084800000},"content-version":"vor","delay-in-days":310,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Swiss National Research Programme 75 \u2018Big Data\u2019","award":["167149"],"award-info":[{"award-number":["167149"]}]},{"DOI":"10.13039\/501100001711","name":"Swiss National Science Foundation","doi-asserted-by":"publisher","award":["150654"],"award-info":[{"award-number":["150654"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Motivation: Data integration promises to be one of the main catalysts in enabling new insights to be drawn from the wealth of biological data available publicly. However, the heterogeneity of the different data sources, both at the syntactic and the semantic level, still poses significant challenges for achieving interoperability among biological databases.<\/jats:p>\n                  <jats:p>Results: We introduce an ontology-based federated approach for data integration. We applied this approach to three heterogeneous data stores that span different areas of biological knowledge: (i) Bgee, a gene expression relational database; (ii) Orthologous Matrix (OMA), a Hierarchical Data Format 5 orthology DS; and (iii) UniProtKB, a Resource Description Framework (RDF) store containing protein sequence and functional information. To enable federated queries across these sources, we first defined a new semantic model for gene expression called GenEx. We then show how the relational data in Bgee can be expressed as a virtual RDF graph, instantiating GenEx, through dedicated relational-to-RDF mappings. By applying these mappings, Bgee data are now accessible through a public SPARQL endpoint. Similarly, the materialized RDF data of OMA, expressed in terms of the Orthology ontology, is made available in a public SPARQL endpoint. We identified and formally described intersection points (i.e. virtual links) among the three data sources. These allow performing joint queries across the data stores. Finally, we lay the groundwork to enable nontechnical users to benefit from the integrated data, by providing a natural language template-based search interface.<\/jats:p>","DOI":"10.1093\/database\/baz106","type":"journal-article","created":{"date-parts":[[2019,8,17]],"date-time":"2019-08-17T15:12:11Z","timestamp":1566054731000},"source":"Crossref","is-referenced-by-count":29,"title":["Enabling semantic queries across federated bioinformatics databases"],"prefix":"10.1093","volume":"2019","author":[{"given":"Ana Claudia","family":"Sima","sequence":"first","affiliation":[{"name":"ZHAW Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur Switzerland"},{"name":"Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland"},{"name":"Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland"}]},{"given":"Tarcisio","family":"Mendes de Farias","sequence":"additional","affiliation":[{"name":"Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland"},{"name":"Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland"},{"name":"Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland"}]},{"given":"Erich","family":"Zbinden","sequence":"additional","affiliation":[{"name":"ZHAW Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland"}]},{"given":"Maria","family":"Anisimova","sequence":"additional","affiliation":[{"name":"ZHAW Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland"}]},{"given":"Manuel","family":"Gil","sequence":"additional","affiliation":[{"name":"ZHAW Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland"}]},{"given":"Heinz","family":"Stockinger","sequence":"additional","affiliation":[{"name":"SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland"}]},{"given":"Kurt","family":"Stockinger","sequence":"additional","affiliation":[{"name":"ZHAW Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur Switzerland"}]},{"given":"Marc","family":"Robinson-Rechavi","sequence":"additional","affiliation":[{"name":"SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland"},{"name":"Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland"}]},{"given":"Christophe","family":"Dessimoz","sequence":"additional","affiliation":[{"name":"Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland"},{"name":"Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland"},{"name":"Department of Genetics, Evolution, and Environment, University College London, Gower St, London WC1E 6BT, UK"},{"name":"Department of Computer Science, University College London, Gower St, London WC1E 6BT, UK"}]}],"member":"286","published-online":{"date-parts":[[2019,11,7]]},"reference":[{"key":"2019110708261371800_ref1","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1038\/nrg3868","article-title":"Methods of integrating data to uncover genotype-phenotype interactions","volume":"16","author":"Ritchie","year":"2015","journal-title":"Nat. Rev. Genet."},{"key":"2019110708261371800_ref2","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1038\/nrg.2018.4","article-title":"Integrative omics for health and disease","volume":"19","author":"Karczewski","year":"2018","journal-title":"Nat. Rev. Genet."},{"key":"2019110708261371800_ref3","doi-asserted-by":"crossref","first-page":"D712","DOI":"10.1093\/nar\/gkw1128","article-title":"The monarch initiative: an integrative data and analytic platform connecting phenotypes to genotypes across species","volume":"45","author":"Mungall","year":"2017","journal-title":"Nucleic Acids Res."},{"key":"2019110708261371800_ref4","doi-asserted-by":"crossref","first-page":"D1","DOI":"10.1093\/nar\/gkx1235","article-title":"The 2018 nucleic acids research database issue and the online molecular biology database collection","volume":"46","author":"Rigden","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"2019110708261371800_ref5","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1186\/s12911-018-0636-4","article-title":"An ontology-guided semantic data integration framework to support integrative data analysis of cancer survival","volume":"18","author":"Zhang","year":"2018","journal-title":"BMC Med. Inform. Decis. Mak."},{"key":"2019110708261371800_ref6","doi-asserted-by":"crossref","first-page":"1651","DOI":"10.1093\/bioinformatics\/btq231","article-title":"Semantic integration of data on transcriptional regulation","volume":"26","author":"Baitaluk","year":"2010","journal-title":"Bioinformatics"},{"key":"2019110708261371800_ref7","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1016\/j.cageo.2018.03.004","article-title":"Ontology-driven data integration and visualization for exploring regional geologic time and paleontological information","volume":"115","author":"Wang","year":"2018","journal-title":"Comput. Geosci."},{"key":"2019110708261371800_ref8","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1007\/978-3-319-21542-6_7","article-title":"FOWLA, a federated architecture for ontologies","volume-title":"Rule Technologies: Foundations, Tools, and Applications","author":"Farias","year":"2015"},{"key":"2019110708261371800_ref9","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0116656","article-title":"Ontology-based data integration between clinical and research systems","volume":"10","author":"Mate","year":"2015","journal-title":"PLoS One"},{"key":"2019110708261371800_ref10","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1609\/aimag.v36i1.2565","article-title":"Exploiting semantics for big data integration","volume":"36","author":"Knoblock","year":"2015","journal-title":"AI Magazine"},{"key":"2019110708261371800_ref11","volume-title":"Leveraging Logical Rules for Efficacious Representation of Large Orthology Datasets","author":"de Farias","year":"2017"},{"key":"2019110708261371800_ref12","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1093\/bioinformatics\/btw612","article-title":"Cmapper: gene-centric connectivity mapper for EBI-RDF platform","volume":"33","author":"Shoaib","year":"2017","journal-title":"Bioinformatics"},{"key":"2019110708261371800_ref13","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/S0169-023X(97)00056-6","article-title":"Knowledge engineering: principles and methods","volume":"25","author":"Studer","year":"1998","journal-title":"Data Knowl. Eng."},{"key":"2019110708261371800_ref14","doi-asserted-by":"crossref","first-page":"W541","DOI":"10.1093\/nar\/gkr469","article-title":"BioPortal: enhanced functionality via new web services from the national center for biomedical ontology to access and use ontologies in software applications","volume":"39","author":"Whetzel","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2019110708261371800_ref15","doi-asserted-by":"crossref","first-page":"1251","DOI":"10.1038\/nbt1346","article-title":"The OBO foundry: coordinated evolution of ontologies to support biomedical data integration","volume":"25","author":"Smith","year":"2007","journal-title":"Nat. Biotechnol."},{"key":"2019110708261371800_ref16","doi-asserted-by":"crossref","first-page":"2699","DOI":"10.1093\/nar\/gky092","article-title":"UniProt: the universal protein knowledgebase","volume":"46","author":"UniProt Consortium","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"2019110708261371800_ref17","doi-asserted-by":"crossref","first-page":"D456","DOI":"10.1093\/nar\/gks1146","article-title":"The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013","volume":"41","author":"Hastings","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2019110708261371800_ref18","doi-asserted-by":"crossref","first-page":"687","DOI":"10.1016\/j.jbi.2008.01.008","article-title":"State of the nation in data integration for bioinformatics","volume":"41","author":"Goble","year":"2008","journal-title":"J. Biomed. Inform."},{"key":"2019110708261371800_ref19","doi-asserted-by":"publisher","first-page":"41","DOI":"10.5772\/21654","article-title":"Data integration in bioinformatics: current efforts and challenges","volume-title":"Bioinformatics-Trends and Methodologies Mahmood A. Mahdavi","author":"Zhang","year":"2011"},{"key":"2019110708261371800_ref20","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1186\/s40709-015-0032-5","article-title":"Data integration in biological research: an overview","volume":"22","author":"Lapatas","year":"2015","journal-title":"J. Biol. Res. (Thessalon.)"},{"key":"2019110708261371800_ref21","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1186\/s12859-015-0559-3","article-title":"KaBOB: ontology-based semantic integration of biomedical databases","volume":"16","author":"Livingston","journal-title":"BMC Bioinformatics"},{"key":"2019110708261371800_ref22","doi-asserted-by":"crossref","first-page":"706","DOI":"10.1016\/j.jbi.2008.03.004","article-title":"Bio2rdf: towards a mashup to build bioinformatics knowledge systems","volume":"41","author":"Belleau","year":"2008","journal-title":"J. Biomed. Inform."},{"key":"2019110708261371800_ref23","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-04930-9","article-title":"Expanding the pathway and interaction knowledge in linked life data","volume-title":"Proceedings of International Semantic Web Challenge ISWC 2009 Chantilly","author":"Momtchev","year":"2009"},{"key":"2019110708261371800_ref24","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1186\/s13326-017-0118-0","article-title":"Biofed: federated query processing over life sciences linked open data","volume":"8","author":"Hasnain","year":"2017","journal-title":"J. Biomed. Semantics"},{"key":"2019110708261371800_ref25","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1186\/s13326-017-0151-z","article-title":"Pibas fedsparql: a web-based platform for integration and exploration of bioinformatics datasets","volume":"8","author":"Djokic-Petrovic","year":"2017","journal-title":"J. Biomed. Semantics"},{"key":"2019110708261371800_ref26","doi-asserted-by":"publisher","first-page":"795","DOI":"10.3233\/SW-180327","article-title":"SpecINT: a framework for data integration over cheminformatics and bioinformatics RDF repositories. Semantic Web Journal","author":"Arsi\u0107","year":"2019"},{"key":"2019110708261371800_ref27","doi-asserted-by":"crossref","first-page":"1875","DOI":"10.1093\/bioinformatics\/btv064","article-title":"SPARQL-enabled identifier conversion with identifiers.org","volume":"31","author":"Wimalaratne","year":"2015","journal-title":"Bioinformatics"},{"key":"2019110708261371800_ref28","doi-asserted-by":"crossref","first-page":"989","DOI":"10.1109\/ICDE.2018.00093","article-title":"Seeping semantics: linking datasets using word embeddings for data discovery","volume-title":"IEEE 34th International Conference on Data Engineering (ICDE) 2018,","author":"Fernandez","year":"2018"},{"key":"2019110708261371800_ref29","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1016\/j.websem.2017.12.005","article-title":"LargeRDFBench: a billion triples benchmark for SPARQL endpoint federation","volume":"48","author":"Saleem","year":"2018","journal-title":"Web Semant."},{"key":"2019110708261371800_ref30","doi-asserted-by":"crossref","first-page":"D477","DOI":"10.1093\/nar\/gkx1019","article-title":"The OMA orthology database in 2018: retrieving evolutionary relationships among all domains of life through richer web and programmatic interfaces","volume":"46","author":"Altenhoff","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"2019110708261371800_ref31","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1007\/978-3-540-69828-9_12","article-title":"Bgee: integrating and comparing heterogeneous transcriptome data among species","volume-title":"Data Integration in the Life Sciences","author":"Bastian","year":"2008"},{"key":"2019110708261371800_ref32","first-page":"778","article-title":"SPARQL 1.1 query language.","volume-title":"W3C Recommendation","author":"Harris","year":"2013"},{"key":"2019110708261371800_ref33","article-title":"Describing linked datasets with the VoID vocabulary","author":"Alexander","year":"2011"},{"key":"2019110708261371800_ref34","doi-asserted-by":"publisher","DOI":"10.1093\/database\/bax059","article-title":"BioSearch: a semantic search engine for Bio2RDF","volume":"2017","author":"Hu","year":"2017","journal-title":"Database (Oxford)"},{"key":"2019110708261371800_ref35","first-page":"129","article-title":"SMART: a web-based, ontology-driven, semantic web query answering application","volume":"295","author":"De Leon Battista","year":"2007","journal-title":"Semantic Web Challenge"},{"key":"2019110708261371800_ref36","doi-asserted-by":"crossref","first-page":"S7","DOI":"10.1186\/1471-2105-10-S10-S7","article-title":"GoWeb: a semantic search engine for the life science web","volume":"10","author":"Dietze","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2019110708261371800_ref37","article-title":"Practical linked data access via SPARQL: the case of wikidata","volume-title":"Proceeding WWW2018 Workshop on Linked Data on the Web (LDOW-18)","author":"Bielefeldt","year":"2018"},{"key":"2019110708261371800_ref38","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1145\/2166896.2166906","article-title":"Bioqueries: a social community sharing experiences while querying biological linked data","volume-title":"Proceedings of the 4th International Workshop on Semantic Web Applications and Tools for the Life Sciences, SWAT4LS\u201911","author":"Garc\u00eda-Godoy","year":"2012"},{"key":"2019110708261371800_ref39","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1186\/s12859-017-1531-1","article-title":"SPANG: a SPARQL client supporting generation and reuse of queries for distributed RDF databases","volume":"18","author":"Chiba","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"2019110708261371800_ref40","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0053786","article-title":"Inferring hierarchical orthologous groups from orthologous gene pairs","volume":"8","author":"Altenhoff","year":"2013","journal-title":"PLoS One"},{"key":"2019110708261371800_ref41","doi-asserted-by":"publisher","DOI":"10.12688\/f1000research.9973.2","article-title":"BgeeDB, an R package for retrieval of curated expression datasets and for gene list expression localization enrichment tests","volume":"5","author":"Komljenovic","year":"2018","journal-title":"F1000Res."},{"key":"2019110708261371800_ref42","first-page":"1","article-title":"The BigDAWG polystore system and architecture","author":"Gadepally","year":"2016","journal-title":"IEEE High Performance Extreme Computing Conference (HPEC)"},{"key":"2019110708261371800_ref43","volume-title":"A Metadata Approach to Resolving Semantic Conflicts","author":"Siegel","year":"1991"},{"key":"2019110708261371800_ref44","first-page":"21","article-title":"Automatic ontology matching using application semantics","volume":"26","author":"Gal","year":"2005","journal-title":"AI magazine"},{"key":"2019110708261371800_ref45","doi-asserted-by":"publisher","DOI":"10.1038\/npre.2009.3193.1","article-title":"Uniprot in RDF: tackling data integration and distributed annotation with the semantic web. Nature Precedings, 3rd Biocuration Conference, 2019","author":"Redaschi","year":"2009"},{"key":"2019110708261371800_ref46","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1186\/s13326-016-0077-x","article-title":"The orthology ontology: development and applications","volume":"7","author":"Tom\u00e1s Fern\u00e1ndez-Breis","year":"2016","journal-title":"J. Biomed. Semantics"},{"key":"2019110708261371800_ref47","first-page":"323","article-title":"Gearing up to handle the mosaic nature of life in the quest for orthologs","volume":"34.2","author":"Forslund","year":"2017","journal-title":"Bioinformatics"},{"key":"2019110708261371800_ref48","doi-asserted-by":"crossref","first-page":"D926","DOI":"10.1093\/nar\/gkt1270","article-title":"Expression atlas update\u2014a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments","volume":"42","author":"Petryszak","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2019110708261371800_ref49","doi-asserted-by":"crossref","first-page":"420747","DOI":"10.1155\/2008\/420747","article-title":"Genevestigator v3: a reference expression database for the meta-analysis of transcriptomes. Adv","volume":"2008","author":"Hruz","year":"2008","journal-title":"Bioinformatics"},{"key":"2019110708261371800_ref50","doi-asserted-by":"publisher","DOI":"10.1093\/database\/bay003","article-title":"TISSUES 2.0: an integrative web resource on mammalian tissue expression","volume":"2018","author":"Palasca","year":"2018","journal-title":"Database (Oxford)"},{"key":"2019110708261371800_ref51","doi-asserted-by":"publisher","first-page":"R46","DOI":"10.1186\/gb-2005-6-5-r46","article-title":"Relations in biomedical ontologies","volume-title":"Genome Biol.","author":"Smith","year":"2005"},{"key":"2019110708261371800_ref52","doi-asserted-by":"crossref","first-page":"1338","DOI":"10.1093\/bioinformatics\/btt765","article-title":"The EBI RDF platform: linked open data for the life sciences","volume":"30","author":"Jupp","year":"2014","journal-title":"Bioinformatics"},{"issue":"3","key":"2019110708261371800_ref53","doi-asserted-by":"crossref","first-page":"471","DOI":"10.3233\/SW-160217","article-title":"Ontop: Answering SPARQL queries over relational databases","volume":"8","author":"Calvanese","year":"2017","journal-title":"Semantic Web"},{"key":"2019110708261371800_ref54","doi-asserted-by":"publisher","first-page":"R5","DOI":"10.1186\/gb-2012-13-1-r5","article-title":"Uberon, an integrative multi-species anatomy ontology","volume":"13","author":"Mungall","year":"2012","journal-title":"Genome Biol."},{"key":"2019110708261371800_ref55","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-030-33246-4_38","article-title":"VoIDext: Vocabulary and patterns for enhancing interoperable datasets with virtual links","author":"de Farias","year":"2019"},{"key":"2019110708261371800_ref56","doi-asserted-by":"crossref","first-page":"405","DOI":"10.3233\/SW-150208","article-title":"Sparklis: an expressive query builder for sparql endpoints with guidance in natural language","volume":"8","author":"Ferr\u00e9","year":"2017","journal-title":"Semantic Web"},{"key":"2019110708261371800_ref57","doi-asserted-by":"crossref","first-page":"311","DOI":"10.3233\/SW-160236","article-title":"Access control and the resource description framework: a survey","volume":"8","author":"Kirrane","year":"2017","journal-title":"Semantic Web"}],"container-title":["Database"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baz106\/30457488\/baz106.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baz106\/30457488\/baz106.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,22]],"date-time":"2024-07-22T00:12:34Z","timestamp":1721607154000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baz106\/5614223"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,1,1]]},"references-count":57,"URL":"https:\/\/doi.org\/10.1093\/database\/baz106","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/686600","asserted-by":"object"}]},"ISSN":["1758-0463"],"issn-type":[{"value":"1758-0463","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019]]},"published":{"date-parts":[[2019,1,1]]},"article-number":"baz106"}}