{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,5]],"date-time":"2026-01-05T22:25:53Z","timestamp":1767651953834,"version":"3.37.3"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"S10","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2009,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The National Cancer Institute (NCI) is developing caGrid as a means for sharing cancer-related data and services. As more data sets become available on caGrid, we need effective ways of accessing and integrating this information. Although the data models exposed on caGrid are semantically well annotated, it is currently up to the caGrid client to infer relationships between the different models and their classes. In this paper, we present a Semantic Web-based data warehouse (Corvus) for creating relationships among caGrid models. This is accomplished through the transformation of semantically-annotated caBIG<jats:sup>\u00ae<\/jats:sup> Unified Modeling Language (UML) information models into Web Ontology Language (OWL) ontologies that preserve those semantics. We demonstrate the validity of the approach by Semantic Extraction, Transformation and Loading (SETL) of data from two caGrid data sources, caTissue and caArray, as well as alignment and query of those sources in Corvus. We argue that semantic integration is necessary for integration of data from distributed web services and that Corvus is a useful way of accomplishing this. 
Our approach is generalizable and of broad utility to researchers facing similar integration challenges.<\/jats:p>","DOI":"10.1186\/1471-2105-10-s10-s2","type":"journal-article","created":{"date-parts":[[2009,10,1]],"date-time":"2009-10-01T18:15:14Z","timestamp":1254420914000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":25,"title":["Semantic web data warehousing for caGrid"],"prefix":"10.1186","volume":"10","author":[{"given":"Jamie P","family":"McCusker","sequence":"first","affiliation":[]},{"given":"Joshua A","family":"Phillips","sequence":"additional","affiliation":[]},{"given":"Alejandra Gonz\u00e1lez","family":"Beltr\u00e1n","sequence":"additional","affiliation":[]},{"given":"Anthony","family":"Finkelstein","sequence":"additional","affiliation":[]},{"given":"Michael","family":"Krauthammer","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2009,10,1]]},"reference":[{"issue":"5723","key":"3370_CR1","doi-asserted-by":"publisher","first-page":"821","DOI":"10.1126\/science.1112120","volume":"308","author":"KH Buetow","year":"2005","unstructured":"Buetow KH: Cyberinfrastructure: Empowering a \"Third Way\" in Biomedical Research. Science 2005, 308(5723):821\u2013824. 10.1126\/science.1112120","journal-title":"Science"},{"issue":"15","key":"3370_CR2","doi-asserted-by":"publisher","first-page":"1910","DOI":"10.1093\/bioinformatics\/btl272","volume":"22","author":"J Saltz","year":"2006","unstructured":"Saltz J, Oster S, Hastings S, Langella S, Kurc T, Sanchez W, Kher M, Manisundaram A, Shanbhag K, Covitz P: caGrid: design and implementation of the core architecture of the cancer biomedical informatics grid. Bioinformatics 2006, 22(15):1910. 
10.1093\/bioinformatics\/btl272","journal-title":"Bioinformatics"},{"key":"3370_CR3","first-page":"573","volume-title":"AMIA Annual Symposium","author":"S Oster","year":"2007","unstructured":"Oster S, Langella S, Hastings S, Ervin D, Madduri R, Kurc T, Siebenlist F, Foster I, Shanbhag K, Covitz P: caGrid 1.0: A grid enterprise architecture for cancer research. AMIA Annual Symposium 2007, 573\u2013577."},{"key":"3370_CR4","first-page":"7","volume":"433","author":"SA Langella","year":"2007","unstructured":"Langella SA, Oster S, Hastings S, Siebenlist F, Phillips J, Ervin D, Permar J, Kurc T, Saltz J: The Cancer Biomedical Informatics Grid (caBIG) Security Infrastructure. AMIA Annu Symp Proc 2007, 433: 7.","journal-title":"AMIA Annu Symp Proc"},{"issue":"3","key":"3370_CR5","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1197\/jamia.M2662","volume":"15","author":"S Langella","year":"2008","unstructured":"Langella S, Hastings S, Oster S, Pan T, Sharma A, Permar J, Ervin D, Cambazoglu BB, Kurc T, Saltz J: Sharing data and analytical resources securely in a biomedical research grid environment. Journal of the American Medical Informatics Association 2008, 15(3):363\u2013373. 10.1197\/jamia.M2662","journal-title":"Journal of the American Medical Informatics Association"},{"issue":"2","key":"3370_CR6","doi-asserted-by":"publisher","first-page":"114","DOI":"10.1016\/j.jbi.2004.09.001","volume":"38","author":"FW Hartel","year":"2005","unstructured":"Hartel FW, de Coronado S, Dionne R, Fragoso G, Golbeck J: Modeling a description logic vocabulary for cancer research. Journal of Biomedical Informatics 2005, 38(2):114\u2013129. 
10.1016\/j.jbi.2004.09.001","journal-title":"Journal of Biomedical Informatics"},{"key":"3370_CR7","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1016\/j.jbi.2006.02.013","volume":"40","author":"N Sioutos","year":"2007","unstructured":"Sioutos N, Coronado S, Haber MW, Hartel FW, Shaiu WL, Wright LW: NCI Thesaurus: A semantic model integrating cancer-related clinical and molecular information. Journal of biomedical informatics 2007, 40: 30\u201343. 10.1016\/j.jbi.2006.02.013","journal-title":"Journal of biomedical informatics"},{"issue":"Pt 1","key":"3370_CR8","first-page":"33","volume":"11","author":"S de Coronado","year":"2004","unstructured":"de Coronado S, Haber MW, Sioutos N, Tuttle MS, Wright LW: NCI Thesaurus: using science-based terminology to integrate cancer research results. Stud Health Technol Inform. 2004, 11(Pt 1):33\u201337.","journal-title":"Stud Health Technol Inform"},{"key":"3370_CR9","volume-title":"Comparative and Functional Genomics","author":"G Fragoso","year":"2004","unstructured":"Fragoso G, de Coronado S, Haber M, Hartel F, Wright L: Overview and utilization of the NCI Thesaurus. Comparative and Functional Genomics 2004., 5(8):","edition":"5"},{"key":"3370_CR10","first-page":"1048","volume-title":"AMIA... Annual Symposium proceedings [electronic resource]","author":"DB Warzel","year":"2003","unstructured":"Warzel DB, Andonyadis C, McCurry B, Chilukuri R, Ishmukhamedov S, Covitz P: Common data element (CDE) management and deployment in clinical trials. In AMIA... Annual Symposium proceedings [electronic resource]. Volume 2003. American Medical Informatics Association; 2003:1048."},{"issue":"18","key":"3370_CR11","doi-asserted-by":"publisher","first-page":"2404","DOI":"10.1093\/bioinformatics\/btg335","volume":"19","author":"PA Covitz","year":"2003","unstructured":"Covitz PA, Hartel F, Schaefer C, Coronado SD, Fragoso G, Sahni H, Gustafson S, Buetow KH: caCORE: A common infrastructure for cancer informatics. 
Bioinformatics 2003, 19(18):2404\u20132412. 10.1093\/bioinformatics\/btg335","journal-title":"Bioinformatics"},{"key":"3370_CR12","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1016\/j.jbi.2007.03.009","volume":"41","author":"GA Komatsoulis","year":"2008","unstructured":"Komatsoulis GA, Warzel DB, Hartel FW, Shanbhag K, Chilukuri R, Fragoso G, Coronado S, Reeves DM, Hadfield JB, Ludet C: caCORE version 3: Implementation of a model driven, service-oriented architecture for semantic interoperability. Journal of biomedical informatics 2008, 41: 106\u2013123. 10.1016\/j.jbi.2007.03.009","journal-title":"Journal of biomedical informatics"},{"issue":"10","key":"3370_CR13","doi-asserted-by":"publisher","first-page":"551","DOI":"10.1016\/j.tig.2003.08.009","volume":"19","author":"H Ge","year":"2003","unstructured":"Ge H, Walhout AJM, Vidal M: Integrating 'omic' information: a bridge between genomics and systems biology. Trends in Genetics: TIG 2003, 19(10):551\u201360. PMID: 14550629 [http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/14550629] 10.1016\/j.tig.2003.08.009","journal-title":"Trends in Genetics: TIG"},{"key":"3370_CR14","first-page":"2004","volume":"10","author":"DL McGuinness","year":"2004","unstructured":"McGuinness DL, Harmelen FV: OWL web ontology language overview. W3C recommendation 2004, 10: 2004\u201303.","journal-title":"W3C recommendation"},{"issue":"3","key":"3370_CR15","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1300\/J111v34n03_04","volume":"34","author":"EJ Miller","year":"2001","unstructured":"Miller EJ: An introduction to the resource description framework. Journal of Library Administration 2001, 34(3):245\u2013255. 10.1300\/J111v34n03_04","journal-title":"Journal of Library Administration"},{"key":"3370_CR16","volume-title":"W3C recommendation","author":"G Klyne","year":"2004","unstructured":"Klyne G, Carroll JJ, McBride B: Resource description framework (RDF): Concepts and abstract syntax. 
W3C recommendation 2004., 10:"},{"key":"3370_CR17","volume-title":"Information Systems","author":"M Spies","year":"2009","unstructured":"Spies M: An ontology modelling perspective on business reporting. Information Systems 2009."},{"volume-title":"The Description Logic Handbook","year":"2003","key":"3370_CR18","unstructured":"Baader F, Calvanese D, McGuinness DL, Nardi D, Patel-Schneider PF, (Eds): The Description Logic Handbook. Cambridge University Press; 2003."},{"issue":"1\u20132","key":"3370_CR19","doi-asserted-by":"publisher","first-page":"70","DOI":"10.1016\/j.artint.2005.05.003","volume":"168","author":"D Berardi","year":"2005","unstructured":"Berardi D, Calvanese D, De Giacomo G: Reasoning on UML Class Diagrams. Artificial Intelligence 2005, 168(1\u20132):70\u2013118. 10.1016\/j.artint.2005.05.003","journal-title":"Artificial Intelligence"},{"issue":"2","key":"3370_CR20","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1007\/s10009-006-0002-1","volume":"9","author":"D Ga\u0161evi\u0107","year":"2007","unstructured":"Ga\u0161evi\u0107 D, Djuri\u0107 D, Deved\u017ei\u0107 V: MDA-based Automatic OWL Ontology Development. International Journal on Software Tools for Technology Transfer (STTT) 2007, 9(2):103\u2013117.","journal-title":"International Journal on Software Tools for Technology Transfer (STTT)"},{"key":"3370_CR21","volume-title":"Ontology Definition Metamodel \u2013 OMG Adopted Specification","author":"IBM","year":"2007","unstructured":"IBM: Ontology Definition Metamodel \u2013 OMG Adopted Specification.2007. 
[http:\/\/www.omg.org\/cgi-bin\/apps\/doc?ptc\/07\u201309\u201309.pdf] Accessed October 2008"},{"key":"3370_CR22","unstructured":"Knublauch H: UMLBackend: plug-in for Prot\u00e9g\u00e9.[http:\/\/protege.cim3.net\/cgi-bin\/wiki.pl?UMLBackend] Accessed April 2009"},{"key":"3370_CR23","first-page":"1619","volume-title":"Software and Systems Modeling","author":"J Evermann","year":"2008","unstructured":"Evermann J: A UML and OWL description of Bunge's upper-level ontology model. Software and Systems Modeling 2008, 1619\u20131366."},{"issue":"4","key":"3370_CR24","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1197\/jamia.M2732","volume":"15","author":"EP Shironoshita","year":"2008","unstructured":"Shironoshita EP, Jean-Mary YR, Bradley R, Kabuka MR: semCDI: Semantic Query Formulation for caBIG. Journal of the American Medical Informatics Association (JAMIA) 2008, 15(4):559\u2013568. 10.1197\/jamia.M2732","journal-title":"Journal of the American Medical Informatics Association (JAMIA)"},{"key":"3370_CR25","first-page":"108","volume-title":"Proceedings of the 5th International Workshop on Data Integration in the Life Sciences (DILS'08), of Lecture Notes in Bioinformatics","author":"EP Shironoshita","year":"2008","unstructured":"Shironoshita EP, Bradley RM, Jean-Mary YR, Taylor TJ, Ryan MT, Kabuka MR: Semantic Representation and Querying of caBIG Data Services. In Proceedings of the 5th International Workshop on Data Integration in the Life Sciences (DILS'08), of Lecture Notes in Bioinformatics. Volume 5109. Edited by: Bairoch A, Cohen-Boulakia S, Froidevaux C. Springer; 2008:108\u2013115."},{"issue":"2","key":"3370_CR26","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1002\/ddr.430340203","volume":"34","author":"MR Boyd","year":"1995","unstructured":"Boyd MR, Paull KD: Some practical considerations and applications of the National Cancer Institute in vitro anticancer drug discovery screen. Drug Development Research 1995, 34(2):91\u2013109. 
10.1002\/ddr.430340203","journal-title":"Drug Development Research"},{"key":"3370_CR27","unstructured":"caTissue Suite caGrid Service Endpoint[http:\/\/espresso.med.yale.edu:18080\/wsrf\/services\/cagrid\/CaTissueSuite]"},{"key":"3370_CR28","unstructured":"caArray \u2013 Experiment Details \u2013 E-GEOD-5949[http:\/\/espresso.med.yale.edu:38080\/caarray\/project\/shank-00006]"},{"key":"3370_CR29","unstructured":"SKY\/M-FISH\/CGH Database[http:\/\/www.ncbi.nlm.nih.gov\/sky\/skyweb.cgi?submitter=NCI60+cell+line+panelGenetics+Branch_I.R.Kirsch&form_type=display_cases]"},{"key":"3370_CR30","volume-title":"Comparison between cell lines from 9 different cancer tissue (NCI-60) (U95 platform)","author":"U Shankavaram","year":"2005","unstructured":"Shankavaram U, Weinstein J, Kahn A: Comparison between cell lines from 9 different cancer tissue (NCI-60) (U95 platform).2005. [http:\/\/www.ncbi.nlm.nih.gov\/geo\/query\/acc.cgi?acc=GSE5949]"},{"issue":"2","key":"3370_CR31","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1093\/bioinformatics\/btn617","volume":"25","author":"TF Rayner","year":"2009","unstructured":"Rayner TF, Rezwan FI, Lukk M, Bradley XZ, Farne A, Holloway E, Malone J, Williams E, Parkinson H: MAGETabulator, a suite of tools to support the microarray data format MAGE-TAB. Bioinformatics 2009, 25(2):279\u2013280. 10.1093\/bioinformatics\/btn617","journal-title":"Bioinformatics"},{"key":"3370_CR32","first-page":"185","volume-title":"Proceedings of the European Semantic Web Conference, of LNCS","author":"E Jim\u00e9nez-Ruiz","year":"2008","unstructured":"Jim\u00e9nez-Ruiz E, Grau BC, Sattler U, Schneider T, Llavori RB: Safe and Economic Re-Use of Ontologies: A Logic-Based Methodology and Tool Support.In Proceedings of the European Semantic Web Conference, of LNCS Edited by: Bechhofer S. 2008, 5021: 185\u2013199. 
[http:\/\/dx.doi.org\/10.1007\/978\u20133-540\u201368234\u20139_16]"},{"key":"3370_CR33","unstructured":"SQL n + 1 Selects Explained \u2013 Pramatr Blog[http:\/\/pramatr.com\/2009\/02\/05\/sql-n-1-selects-explained]"},{"key":"3370_CR34","unstructured":"CQL 2 \u2013 Data Services \u2013 cagrid.org[http:\/\/carid.org\/display\/dataservices\/CQL+2]"},{"issue":"5","key":"3370_CR35","doi-asserted-by":"publisher","first-page":"500","DOI":"10.1038\/ng0506-500","volume":"38","author":"M Reich","year":"2006","unstructured":"Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov J: GenePattern 2.0. Nature Genetics 2006, 38(5):500\u2013501. 10.1038\/ng0506-500","journal-title":"Nature Genetics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-10-S10-S2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,29]],"date-time":"2022-06-29T08:03:46Z","timestamp":1656489826000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-10-S10-S2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,10]]},"references-count":35,"journal-issue":{"issue":"S10","published-print":{"date-parts":[[2009,10]]}},"alternative-id":["3370"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-10-s10-s2","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2009,10]]},"assertion":[{"value":"1 October 2009","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S2"}}