{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T15:48:58Z","timestamp":1768319338870,"version":"3.49.0"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"S4","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2012,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>The ONCO-i2b2 platform is a bioinformatics tool designed to integrate clinical and research data and support translational research in oncology. It is implemented by the University of Pavia and the IRCCS Fondazione Maugeri hospital (FSM), and grounded on the software developed by the Informatics for Integrating Biology and the Bedside (i2b2) research center. I2b2 has delivered an open source suite based on a data warehouse, which is efficiently interrogated to find sets of interesting patients through a query tool interface.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Methods<\/jats:title>\n            <jats:p>Onco-i2b2 integrates data coming from multiple sources and allows the users to jointly query them. I2b2 data are then stored in a data warehouse, where facts are hierarchically structured as ontologies. 
Onco-i2b2 gathers data from the FSM pathology unit (PU) database and from the hospital biobank and merges them with the clinical information from the hospital information system.<\/jats:p>\n            <jats:p>Our main effort was to provide a robust integrated research environment, giving a particular emphasis to the integration process and facing different challenges, consecutively listed: biospecimen samples privacy and anonymization; synchronization of the biobank database with the i2b2 data warehouse through a series of Extract, Transform, Load (ETL) operations; development and integration of a Natural Language Processing (NLP) module, to retrieve coded information, such as SNOMED terms and malignant tumors (TNM) classifications, and clinical tests results from unstructured medical records. Furthermore, we have developed an internal SNOMED ontology rested on the NCBO BioPortal web services.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>Onco-i2b2 manages data of more than 6,500 patients with breast cancer diagnosis collected between 2001 and 2011 (over 390 of them have at least one biological sample in the cancer biobank), more than 47,000 visits and 96,000 observations over 960 medical concepts.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>Onco-i2b2 is a concrete example of how integrated Information and Communication Technology architecture can be implemented to support translational research. 
The next steps of our project will involve the extension of its capabilities by implementing new plug-in devoted to bioinformatics data analysis as well as a temporal query module.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-13-s4-s5","type":"journal-article","created":{"date-parts":[[2012,3,28]],"date-time":"2012-03-28T10:46:23Z","timestamp":1332931583000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":25,"title":["An ICT infrastructure to integrate clinical and molecular data in oncology research"],"prefix":"10.1186","volume":"13","author":[{"given":"Daniele","family":"Segagni","sequence":"first","affiliation":[]},{"given":"Valentina","family":"Tibollo","sequence":"additional","affiliation":[]},{"given":"Arianna","family":"Dagliati","sequence":"additional","affiliation":[]},{"given":"Alberto","family":"Zambelli","sequence":"additional","affiliation":[]},{"given":"Silvia G","family":"Priori","sequence":"additional","affiliation":[]},{"given":"Riccardo","family":"Bellazzi","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2012,3,28]]},"reference":[{"key":"5102_CR1","first-page":"548","volume-title":"AMIA Annu Symp Proc","author":"SN Murphy","year":"2007","unstructured":"Murphy SN, Mendis M, Hackett K, Kuttan R, Pan W, Phillips LC, Gainer V, Berkowicz D, Glaser JP, Kohane I, Chueh HC: Architecture of the open-source clinical research chart from Informatics for Integrating Biology and the Bedside. 
AMIA Annu Symp Proc 2007, 548\u2013552."},{"key":"5102_CR2","unstructured":"i2b2 web site[https:\/\/www.i2b2.org] (last accessed November 23rd 2011)"},{"key":"5102_CR3","unstructured":"National Center of Biomedical Computing web site[http:\/\/www.ncbcs.org] (last accessed November 23rd 2011)"},{"issue":"2","key":"5102_CR4","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1136\/jamia.2009.000893","volume":"17","author":"SN Murphy","year":"2010","unstructured":"Murphy SN, Weber G, Mendis M, Gainer V, Chueh HC, Churchill S, Kohane I: Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). J Am Med Inform Assoc 2010, 17(2):124\u2013130. 10.1136\/jamia.2009.000893","journal-title":"J Am Med Inform Assoc"},{"key":"5102_CR5","volume-title":"The Data Warehouse ETL Toolkit","author":"Ralph Kimball","year":"2008","unstructured":"Kimball Ralph, Ross Margy: The Data Warehouse ETL Toolkit. 2nd edition. Wiley; 2008.","edition":"2"},{"key":"5102_CR6","first-page":"502","volume":"169","author":"S Mate","year":"2011","unstructured":"Mate S, B\u00fcrkle T, K\u00f6pcke F, Breil B, Wullich B, Dugas M, Prokosch HU, Ganslandt T: Populating the i2b2 database with heterogeneous EMR data: a semantic network approach. Stud Health Technol Inform 2011, 169: 502\u2013506.","journal-title":"Stud Health Technol Inform"},{"key":"5102_CR7","doi-asserted-by":"publisher","first-page":"218","DOI":"10.1186\/1471-2105-12-218","volume":"12","author":"T Adamusiak","year":"2011","unstructured":"Adamusiak T, Burdett T, Kurbatova N, Joeri van der Velde K, Abeygunawardena N, Antonakaki D, Kapushesky M, Parkinson H, Swertz MA: OntoCAT--simple ontology search and integration in Java, R and REST\/JavaScript. BMC Bioinformatics 2011, 12: 218. 
10.1186\/1471-2105-12-218","journal-title":"BMC Bioinformatics"},{"key":"5102_CR8","volume-title":"J Am Med Inform Assoc","author":"SN Murphy","year":"2011","unstructured":"Murphy SN, Gainer V, Mendis M, Churchill S, Kohane I: Strategies for maintaining patient privacy in i2b2. J Am Med Inform Assoc 2011, in press."},{"issue":"3","key":"5102_CR9","doi-asserted-by":"publisher","first-page":"383","DOI":"10.1007\/s00439-011-1042-5","volume":"130","author":"B Malin","year":"2011","unstructured":"Malin B, Loukides G, Benitez K, Clayton EW: Identifiability in biobanks: models, measures, and mitigation strategies. Hum Genet 2011, 130(3):383\u2013392. 10.1007\/s00439-011-1042-5","journal-title":"Hum Genet"},{"key":"5102_CR10","unstructured":"Pentaho Data Integration Kettle Documentation[http:\/\/kettle.pentaho.com] (last accessed November 23rd 2011)"},{"key":"5102_CR11","volume-title":"Pentaho Solutions","author":"R Bouman","year":"2009","unstructured":"Bouman R, van Dongen J: Pentaho Solutions. Wiley; 2009."},{"key":"5102_CR12","unstructured":"Google Web Toolkit web site[http:\/\/code.google.com\/webtoolkit\/] (last accessed November 23rd 2011)"},{"issue":"4","key":"5102_CR13","doi-asserted-by":"publisher","first-page":"571","DOI":"10.1197\/jamia.M3083","volume":"16","author":"LC Childs","year":"2009","unstructured":"Childs LC, Enelow R, Simonsen L, Heintzelman NH, Kowalski KM, Taylor RJ: Description of a rule-based system for the i2b2 challenge in natural language processing for clinical data. J Am Med Inform Assoc 2009, 16(4):571\u2013575. 10.1197\/jamia.M3083","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"5102_CR14","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1197\/jamia.M2694","volume":"16","author":"ST Rosenbloom","year":"2009","unstructured":"Rosenbloom ST, Brown SH, Froehling D, Bauer BA, Wahner-Roedler DL, Gregg WM, Elkin PL: Using SNOMED CT to represent two interface terminologies. J Am Med Inform Assoc 2009, 16(1):81\u201388. 
10.1197\/jamia.M2694","journal-title":"J Am Med Inform Assoc"},{"key":"5102_CR15","volume-title":"TNM Classification of Malignant Tumours","author":"LH Sobin","year":"2009","unstructured":"Sobin LH, Gospodarowicz MK, Wittekind C: TNM Classification of Malignant Tumours. 7th edition. Wiley; 2009.","edition":"7"},{"issue":"2","key":"5102_CR16","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1111\/j.1365-2559.2011.03879.x","volume":"59","author":"JJ Going","year":"2011","unstructured":"Going JJ: Observer prediction of HER2 amplification in HercepTest 2+ breast cancers as a potential audit instrument. Histopathology 2011, 59(2):333\u2013335.","journal-title":"Histopathology"},{"issue":"10","key":"5102_CR17","doi-asserted-by":"publisher","first-page":"1504","DOI":"10.1038\/sj.bjc.6603756","volume":"96","author":"E de Azambuja","year":"2007","unstructured":"de Azambuja E, Cardoso F, de Castro G Jr, Colozza M, Mano MS, Durbecq V, Sotiriou C, Larsimont D, Piccart-Gebhart MJ, Paesmans M: Ki-67 as prognostic marker in early breast cancer: a meta-analysis of published studies involving 12,155 patients. Br J Cancer 2007, 96(10):1504\u20131513. 10.1038\/sj.bjc.6603756","journal-title":"Br J Cancer"},{"key":"5102_CR18","volume-title":"Text Processing with GATE (Version 6)","author":"H Cunningham","year":"2011","unstructured":"Cunningham H, Maynard D, Bontcheva K: Text Processing with GATE (Version 6). University of Sheffield Department of Computer Science; 2011."},{"issue":"Web Server issu","key":"5102_CR19","doi-asserted-by":"publisher","first-page":"W170","DOI":"10.1093\/nar\/gkp440","volume":"37","author":"NF Noy","year":"2009","unstructured":"Noy NF, Shah NH, Whetzel PL, Dai B, Dorf M, Griffith N, Jonquet C, Rubin DL, Storey MA, Chute CG, Musen MA: BioPortal: ontologies and integrated data resources at the click of a mouse. 
Nucleic Acids Res 2009, 37(Web Server issue):W170-W173.","journal-title":"Nucleic Acids Res"},{"issue":"Web Server issu","key":"5102_CR20","doi-asserted-by":"publisher","first-page":"W541","DOI":"10.1093\/nar\/gkr469","volume":"39","author":"PL Whetzel","year":"2011","unstructured":"Whetzel PL, Noy NF, Shah NH, Alexander PR, Nyulas C, Tudorache T, Musen MA: BioPortal: enhanced functionality via new Web services from the National Center for Biomedical Ontology to access and use ontologies in software applications. Nucleic Acids Res 2011, 39(Web Server issue):W541-W545.","journal-title":"Nucleic Acids Res"},{"key":"5102_CR21","unstructured":"Bioportal Ontology OWL file[http:\/\/bioportal.bioontology.org\/ontologies\/2086?p=terms] (last accessed December 12th 2011)"},{"issue":"3","key":"5102_CR22","doi-asserted-by":"publisher","first-page":"314","DOI":"10.1136\/jamia.2010.007914","volume":"18","author":"D Segagni","year":"2011","unstructured":"Segagni D, Ferrazzi F, Larizza C, Tibollo V, Napolitano C, Priori SG, Bellazzi R: R engine cell: integrating R into the i2b2 software infrastructure. J Am Med Inform Assoc 2011, 18(3):314\u2013317. 10.1136\/jamia.2010.007914","journal-title":"J Am Med Inform Assoc"},{"key":"5102_CR23","unstructured":"I2b2 community website[https:\/\/community.i2b2.org] (last accessed November 23rd 2011)"},{"issue":"5","key":"5102_CR24","doi-asserted-by":"publisher","first-page":"624","DOI":"10.1197\/jamia.M3191","volume":"16","author":"GM Weber","year":"2009","unstructured":"Weber GM, Murphy SN, McMurry AJ, Macfadden D, Nigrin DJ, Churchill S, Kohane IS: The Shared Health Research Information Network (SHRINE): a prototype federated query tool for clinical data repositories. J Am Med Inform Assoc 2009, 16(5):624\u2013630. 
10.1197\/jamia.M3191","journal-title":"J Am Med Inform Assoc"},{"key":"5102_CR25","unstructured":"Phillips LC, Minovitsky S, Ratnere I, Dubchak I, Kohane I, Murphy SN: Use Genomic Variants in Informatics for Integrating Biology and the Bedside (i2b2). AMIA Proceedings Library: 48\u201352 T2011"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-13-S4-S5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T18:42:13Z","timestamp":1630521733000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-13-S4-S5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,3,28]]},"references-count":25,"journal-issue":{"issue":"S4","published-print":{"date-parts":[[2012,12]]}},"alternative-id":["5102"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-13-s4-s5","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,3,28]]},"assertion":[{"value":"28 March 2012","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S5"}}