{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T09:50:15Z","timestamp":1775296215074,"version":"3.50.1"},"reference-count":54,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2019,4,20]],"date-time":"2019-04-20T00:00:00Z","timestamp":1555718400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>Vast amounts of clinical and biomedical research data are produced daily. These data can help enable data driven healthcare through novel biomedical discoveries, improved diagnostics processes, epidemiology, and education. However, finding, and gaining access to these data and relevant metadata that are necessary to achieve these goals remains a challenge. Furthermore, data management and enabling widespread, albeit controlled, use poses a major challenge for data producers. These data sources are often geographically distributed, with diverse characteristics, and are controlled by a host of logistical and legal factors that require appropriate governance and access control guarantees. To overcome these obstacles, a set of guiding principles under the term FAIR has been previously introduced. The primary desirable dataset properties are thus that the data should be Findable, Accessible, Interoperable, and Reusable (FAIR). In this paper, we introduce and describe an abstract framework that models these ideal goals, and could be a step toward supporting data driven research. We also develop a system instantiated on our framework called the Data integration and indexing System (DiiS). The system provides an integration model for making healthcare data available on a global scale. Our research work describes the challenges inhibiting data producers, data stewards, and data brokers in achieving FAIR goals for sharing biomedical data. We attempt to address some of the key challenges through the proposed system. We evaluated our framework using the software architecture testing technique and also looked at how different challenges in data integration are addressed by our system. Our evaluation shows that the DiiS framework is a user friendly data integration system that would greatly contribute to biomedical research.<\/jats:p>","DOI":"10.3390\/data4020054","type":"journal-article","created":{"date-parts":[[2019,4,22]],"date-time":"2019-04-22T11:02:53Z","timestamp":1555930973000},"page":"54","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["DiiS: A Biomedical Data Access Framework for Aiding Data Driven Research Supporting FAIR Principles"],"prefix":"10.3390","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2631-5751","authenticated-orcid":false,"given":"Priya","family":"Deshpande","sequence":"first","affiliation":[{"name":"College of Computing and Digital Media, DePaul University, Chicago, IL 60604, USA"}]},{"given":"Alexander","family":"Rasin","sequence":"additional","affiliation":[{"name":"College of Computing and Digital Media, DePaul University, Chicago, IL 60604, USA"}]},{"given":"Jacob","family":"Furst","sequence":"additional","affiliation":[{"name":"College of Computing and Digital Media, DePaul University, Chicago, IL 60604, USA"}]},{"given":"Daniela","family":"Raicu","sequence":"additional","affiliation":[{"name":"College of Computing and Digital Media, DePaul University, Chicago, IL 60604, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0040-1387","authenticated-orcid":false,"given":"Sameer","family":"Antani","sequence":"additional","affiliation":[{"name":"National Library of Medicine, Bethesda, MD 20894, USA"}]}],"member":"1968","published-online":{"date-parts":[[2019,4,20]]},"reference":[{"key":"ref_1","unstructured":"NIH (2019, April 19). STRIDES Initiative, Available online: https:\/\/commonfund.nih.gov\/strides\/."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"134023","DOI":"10.1155\/2014\/134023","article-title":"Managing, analysing, and integrating big data in medical bioinformatics: Open problems and future perspectives","volume":"2014","author":"Merelli","year":"2014","journal-title":"BioMed Res. Int."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"McQuilton, P., Gonzalez-Beltran, A., Rocca-Serra, P., Thurston, M., Lister, A., Maguire, E., and Sansone, S.A. (2016). BioSharing: Curated and crowd-sourced metadata standards, databases and data policies in the life sciences. Database, 2016.","DOI":"10.1093\/database\/baw075"},{"key":"ref_4","unstructured":"(2019, April 19). CrowdFlower 2016. Available online: http:\/\/visit.crowdflower.com\/."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1186\/2047-2501-2-3","article-title":"Big data analytics in healthcare: Promise and potential","volume":"2","author":"Raghupathi","year":"2014","journal-title":"Health Inf. Sci. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"398","DOI":"10.3122\/jabfm.2018.03.170274","article-title":"Practice facilitator strategies for addressing electronic health record data challenges for quality improvement: EvidenceNOW","volume":"31","author":"Hemler","year":"2018","journal-title":"J. Am. Board Fam. Med."},{"key":"ref_7","first-page":"1107","article-title":"Real-time Data Fusion Platforms: The Need of Multi-dimensional Data-driven Research in Biomedical Informatics","volume":"216","author":"Raje","year":"2015","journal-title":"Stud. Health Technol. Informat."},{"key":"ref_8","unstructured":"NIH (2019, April 19). PubMed, Available online: https:\/\/www.ncbi.nlm.nih.gov\/pubmed\/."},{"key":"ref_9","unstructured":"(2019, April 19). dryad. Available online: https:\/\/datadryad.org\/."},{"key":"ref_10","unstructured":"NCI (2019, April 19). NCI data, Available online: https:\/\/datascience.cancer.gov\/."},{"key":"ref_11","first-page":"24","article-title":"Infrastructure and distributed learning methodology for privacy-preserving multi-centric rapid learning health care: EuroCAT","volume":"4","author":"Deist","year":"2017","journal-title":"Clin. Transl. Radiat. Oncol."},{"key":"ref_12","first-page":"10","article-title":"An Integrated Database and Smart Search Tool for Medical Knowledge Extraction from Radiology Teaching Files","volume":"69","author":"Deshpande","year":"2017","journal-title":"Med. Informat. Healthc."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Holzinger, A., Dehmer, M., and Jurisica, I. (2014). Knowledge discovery and interactive data mining in bioinformatics-state-of-the-art, future challenges and research directions. BMC Bioinform., 15.","DOI":"10.1186\/1471-2105-15-S6-I1"},{"key":"ref_14","first-page":"561","article-title":"A Methodology for Fine-Grained Access Control in Exposing Biomedical Data","volume":"247","author":"Trifan","year":"2018","journal-title":"Stud. Health Technol. Informat."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"S46","DOI":"10.1016\/j.jbi.2010.08.001","article-title":"A method to implement fine-grained access control for personal health records through standard relational database queries","volume":"43","author":"Sujansky","year":"2010","journal-title":"J. Biomed. Informat."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"300","DOI":"10.1093\/jamia\/ocx121","article-title":"DataMed\u2013an open source discovery index for finding biomedical datasets","volume":"25","author":"Chen","year":"2018","journal-title":"J. Am. Med Informat. Assoc."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1038\/ng.3864","article-title":"Finding useful data across multiple biomedical data repositories using DataMed","volume":"49","author":"Sansone","year":"2017","journal-title":"Nat. Genet."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Ohno-Machado, L., Sansone, S.A., Alter, G., Fore, I., Grethe, J., Xu, H., Gonzalez-Beltran, A., Rocca-Serra, P., Soysal, E., and Zong, N. (2016). DataMed: Finding useful data across multiple biomedical data repositories. bioRxiv, 094888.","DOI":"10.1101\/094888"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1504\/IJMSO.2014.059126","article-title":"Metadata based management and sharing of distributed biomedical data","volume":"9","author":"Wang","year":"2014","journal-title":"Int. J. Metadata Semant. Ontol."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Trifan, A., and Oliveira, J.L. (2018, January 18\u201321). A FAIR marketplace for biomedical data custodians and clinical researchers. Proceedings of the 2018 IEEE 31st International Symposium on Computer-Based Medical Systems, Karlstad, Sweden.","DOI":"10.1109\/CBMS.2018.00040"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1163","DOI":"10.1377\/hlthaff.2014.0053","article-title":"Big data and new knowledge in medicine: The thinking, training, and tools needed for a learning health system","volume":"33","author":"Krumholz","year":"2014","journal-title":"Health Aff."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"3018","DOI":"10.1016\/j.jacc.2017.10.037","article-title":"Data sharing and cardiology: Platforms and possibilities","volume":"70","author":"Dey","year":"2017","journal-title":"J. Am. Coll. Cardiol."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1016\/j.acra.2015.10.004","article-title":"Big data and the future of radiology informatics","volume":"23","author":"Kansagra","year":"2016","journal-title":"Acad. Radiol."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1825","DOI":"10.1016\/j.jacc.2017.07.786","article-title":"Merits of Data Sharing","volume":"70","author":"Angraal","year":"2017","journal-title":"J. Am. Coll. Cardiol."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Deshpande, P., Rasin, A., Brown, E., Furst, J., Raicu, D., Montner, S., and Armato, S. (2018). Big Data Integration Case Study for Radiology Data Sources. IEEE Life Sci. Conf.","DOI":"10.1109\/LSC.2018.8572185"},{"key":"ref_26","unstructured":"RSNA (2019, April 19). RSNA TFS. Available online: http:\/\/mirc.rsna.org\/query."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"579","DOI":"10.2214\/ajr.179.3.1790579","article-title":"MyPACS.net: A Web-based teaching file authoring tool","volume":"3","author":"Weinberger","year":"2002","journal-title":"Am. J. Roentgenol."},{"key":"ref_28","unstructured":"RSNA (2019, April 19). RadLex Ontology. Available online: http:\/\/www.radlex.org\/."},{"key":"ref_29","unstructured":"SNOMED International (2019, April 19). SNOMEDCT Ontology. Available online: http:\/\/www.snomed.org\/."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"18","DOI":"10.4018\/IJKDB.2018070102","article-title":"Augmenting Medical Decision Making With Text-Based Search of Teaching File Repositories and Medical Ontologies: Text-Based Search of Radiology Teaching Files","volume":"8","author":"Deshpande","year":"2018","journal-title":"Int. J. Knowl. Discov. Bioinform."},{"key":"ref_31","unstructured":"NIH (2019, April 19). Openi, Available online: https:\/\/openi.nlm.nih.gov\/."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"160018","DOI":"10.1038\/sdata.2016.18","article-title":"The FAIR Guiding Principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Sci. Data"},{"key":"ref_33","unstructured":"NIH (2019, April 19). Data Science at NIH, Available online: https:\/\/datascience.nih.gov\/."},{"key":"ref_34","unstructured":"International, H.L.S (2019, April 19). Health Level Seven International. Available online: www.hl7.org."},{"key":"ref_35","unstructured":"HHS (2019, April 19). HITECH, Available online: https:\/\/www.hhs.gov\/hipaa\/for-professionals\/special-topics\/hitech-act-enforcement-interim-final-rule."},{"key":"ref_36","unstructured":"O\u2019Dowd, E. (2019, April 19). Healthcare Data Integration Continues to Challenge Entities. Available online: https:\/\/hitinfrastructure.com\/news."},{"key":"ref_37","unstructured":"Shashank, A. (2019, April 19). Why do Healthcare Organizations Still Struggle with Data Integration. Available online: http:\/\/blog.innovaccer.com\/healthcare-organizations-still-struggle-data-integration\/."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1038\/gim.2013.131","article-title":"Practical challenges in integrating genomic data into the electronic health record","volume":"15","author":"Kho","year":"2013","journal-title":"Genet. Med."},{"key":"ref_39","first-page":"4","article-title":"Challenges for privacy preservation in data integration","volume":"5","author":"Christen","year":"2014","journal-title":"J. Data Inf. Qual."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Clifton, C., Kantarcio\u01e7lu, M., Doan, A., Schadow, G., Vaidya, J., Elmagarmid, A., and Suciu, D. (2004, January 13). Privacy-preserving data integration and sharing. Proceedings of the 9th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, Paris, France.","DOI":"10.1145\/1008694.1008698"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Gomez-Cabrero, D., Abugessaisa, I., Maier, D., Teschendorff, A., Merkenschlager, M., Gisel, A., Ballestar, E., Bongcam-Rudloff, E., Conesa, A., and Tegn\u00e9r, J. (2014). Data integration in the era of omics: Current and future challenges. BMC Syst Biol., 8.","DOI":"10.1186\/1752-0509-8-S2-I1"},{"key":"ref_42","unstructured":"Healthcare Information and Management Systems Society (HIMSS) (2019, April 19). What is Interoperability?. Available online: https:\/\/www.himss.org\/library\/interoperability-standards\/what-is-interoperability."},{"key":"ref_43","unstructured":"UMLS (2019, April 19). UMLS, Available online: https:\/\/www.nlm.nih.gov\/research\/umls."},{"key":"ref_44","first-page":"1595","article-title":"RadLex: A new method for indexing online educational materials","volume":"3","author":"Langlotz","year":"2006","journal-title":"Radiol. Soc. N. Am."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Masseroli, M., Mons, B., Bongcam-Rudloff, E., Ceri, S., Kel, A., Rechenmann, F., Lisacek, F., and Romano, P. (2014). Integrated Bio-Search: Challenges and trends for the integration, search and comprehensive processing of biological information. BMC Bioinform., 15.","DOI":"10.1186\/1471-2105-15-S1-S2"},{"key":"ref_46","unstructured":"Huesch, M.D. (2019, April 19). Using It or Losing It? The Case for Data Scientists Inside Health Care. Available online: https:\/\/catalyst.nejm.org\/case-data-scientists-inside-health-care\/."},{"key":"ref_47","unstructured":"(2019, April 19). EURORAD. Available online: http:\/\/www.eurorad.org\/."},{"key":"ref_48","unstructured":"NIH (2019, April 19). National Institutes of Health Chest X-ray Dataset. Available online: https:\/\/nihcc.app.box.com\/v\/ChestXray-NIHCC\/folder\/36938765345."},{"key":"ref_49","unstructured":"NIH (2019, April 19). NIH Clinical Center Provides One of the Largest Publicly Available Chest X-ray Datasets to Scientific Community, Available online: https:\/\/www.nih.gov\/news-events\/news-releases\/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community."},{"key":"ref_50","unstructured":"CIVM (2019, April 19). CENTER for IN VIVO MICROSCOPY (CIVM) dataset. Available online: http:\/\/www.civm.duhs.duke.edu\/devatlas\/index.html."},{"key":"ref_51","unstructured":"OpenfMRI (2019, April 19). Neuroimaging data. Available online: https:\/\/openneuro.org\/."},{"key":"ref_52","unstructured":"Richards, M. (2015). Software Architecture Patterns, O\u2019Reilly Media, Inc."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Richardson, D.J., and Wolf, A.L. (1996, January 16\u201318). Software testing at the architectural level. Proceedings of the Second International Software Architecture Workshop (ISAW-2) and International Workshop on Multiple Perspectives in Software Development (Viewpoints\u2019 96) on SIGSOFT, San Francisco, CA, USA.","DOI":"10.1145\/243327.243605"},{"key":"ref_54","unstructured":"NASA (2019, April 19). Information Integration Overview, Available online: https:\/\/ti.arc.nasa.gov\/tech\/cas\/groups\/information-integration\/."}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/4\/2\/54\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:47:00Z","timestamp":1760186820000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/4\/2\/54"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,4,20]]},"references-count":54,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2019,6]]}},"alternative-id":["data4020054"],"URL":"https:\/\/doi.org\/10.3390\/data4020054","relation":{},"ISSN":["2306-5729"],"issn-type":[{"value":"2306-5729","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,4,20]]}}}