{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,31]],"date-time":"2026-01-31T08:06:27Z","timestamp":1769846787650,"version":"3.49.0"},"reference-count":44,"publisher":"MDPI AG","issue":"15","license":[{"start":{"date-parts":[[2024,8,1]],"date-time":"2024-08-01T00:00:00Z","timestamp":1722470400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002848","name":"Chilean National Commission for Scientific and Technological Research projects IDEA FONDEF","doi-asserted-by":"publisher","award":["IT21I0019"],"award-info":[{"award-number":["IT21I0019"]}],"id":[{"id":"10.13039\/501100002848","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002848","name":"Chilean National Commission for Scientific and Technological Research projects IDEA FONDEF","doi-asserted-by":"publisher","award":["FB0008"],"award-info":[{"award-number":["FB0008"]}],"id":[{"id":"10.13039\/501100002848","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002848","name":"Chilean National Commission for Scientific and Technological Research projects IDEA FONDEF","doi-asserted-by":"publisher","award":["EQM140119"],"award-info":[{"award-number":["EQM140119"]}],"id":[{"id":"10.13039\/501100002848","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002848","name":"Chilean National Commission for Scientific and Technological Research projects IDEA FONDEF","doi-asserted-by":"publisher","award":["EQM210020"],"award-info":[{"award-number":["EQM210020"]}],"id":[{"id":"10.13039\/501100002848","id-type":"DOI","asserted-by":"publisher"}]},{"name":"ANID-Basal Project","award":["IT21I0019"],"award-info":[{"award-number":["IT21I0019"]}]},{"name":"ANID-Basal Project","award":["FB0008"],"award-info":[{"award-number":["FB0008"]}]},{"name":"ANID-Basal 
Project","award":["EQM140119"],"award-info":[{"award-number":["EQM140119"]}]},{"name":"ANID-Basal Project","award":["EQM210020"],"award-info":[{"award-number":["EQM210020"]}]},{"name":"FONDEQUIP","award":["IT21I0019"],"award-info":[{"award-number":["IT21I0019"]}]},{"name":"FONDEQUIP","award":["FB0008"],"award-info":[{"award-number":["FB0008"]}]},{"name":"FONDEQUIP","award":["EQM140119"],"award-info":[{"award-number":["EQM140119"]}]},{"name":"FONDEQUIP","award":["EQM210020"],"award-info":[{"award-number":["EQM210020"]}]},{"name":"joint project UTFSM-CASSACA","award":["IT21I0019"],"award-info":[{"award-number":["IT21I0019"]}]},{"name":"joint project UTFSM-CASSACA","award":["FB0008"],"award-info":[{"award-number":["FB0008"]}]},{"name":"joint project UTFSM-CASSACA","award":["EQM140119"],"award-info":[{"award-number":["EQM140119"]}]},{"name":"joint project UTFSM-CASSACA","award":["EQM210020"],"award-info":[{"award-number":["EQM210020"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>This article presents an ingestion procedure towards an interoperable repository called ALPACS (Anonymized Local Picture Archiving and Communication System). ALPACS provides services to clinical and hospital users, who can access the repository data through an Artificial Intelligence (AI) application called PROXIMITY. This article shows the automated procedure for data ingestion from the medical imaging provider to the ALPACS repository. The data ingestion procedure was successfully applied by the data provider (Hospital Cl\u00ednico de la Universidad de Chile, HCUCH) using a pseudo-anonymization algorithm at the source, thereby ensuring that the privacy of patients\u2019 sensitive data is respected. Data transfer was carried out using international communication standards for health systems, which allows for replication of the procedure by other institutions that provide medical images. 
Objectives: This article aims to create a repository of 33,000 medical CT images and 33,000 diagnostic reports with international standards (HL7 HAPI FHIR, DICOM, SNOMED). This goal requires devising a data ingestion procedure that can be replicated by other provider institutions, guaranteeing data privacy by implementing a pseudo-anonymization algorithm at the source, and generating labels from annotations via NLP. Methodology: Our approach involves hybrid on-premise\/cloud deployment of PACS and FHIR services, including transfer services for anonymized data to populate the repository through a structured ingestion procedure. We used NLP over the diagnostic reports to generate annotations, which were then used to train ML algorithms for content-based similar exam recovery. Outcomes: We successfully implemented ALPACS and PROXIMITY 2.0, ingesting almost 19,000 thorax CT exams to date along with their corresponding reports.<\/jats:p>","DOI":"10.3390\/s24154985","type":"journal-article","created":{"date-parts":[[2024,8,1]],"date-time":"2024-08-01T12:38:20Z","timestamp":1722515900000},"page":"4985","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["A Data Ingestion Procedure towards a Medical Images Repository"],"prefix":"10.3390","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4433-4622","authenticated-orcid":false,"given":"Mauricio","family":"Solar","sequence":"first","affiliation":[{"name":"Departamento de Inform\u00e1tica, Universidad Tecnica Federico Santa Maria, Campus Vitacura-Santiago, Vitacura 7660251, Chile"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0554-0324","authenticated-orcid":false,"given":"Victor","family":"Casta\u00f1eda","sequence":"additional","affiliation":[{"name":"DETEM, Faculty of Medicine, Universidad de Chile, Independencia-Santiago, Santiago 8380453, 
Chile"}]},{"given":"Ricardo","family":"\u00d1anculef","sequence":"additional","affiliation":[{"name":"Departamento de Inform\u00e1tica, Universidad Tecnica Federico Santa Maria, Campus San Joaquin-Santiago, Santiago 8940897, Chile"}]},{"given":"Lioubov","family":"Dombrovskaia","sequence":"additional","affiliation":[{"name":"Departamento de Inform\u00e1tica, Universidad Tecnica Federico Santa Maria, Campus San Joaquin-Santiago, Santiago 8940897, Chile"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3472-4130","authenticated-orcid":false,"given":"Mauricio","family":"Araya","sequence":"additional","affiliation":[{"name":"Departamento de Inform\u00e1tica, Universidad Tecnica Federico Santa Maria, Campus Casa Central-Valpara\u00edso, Valpara\u00edso 2390123, Chile"}]}],"member":"1968","published-online":{"date-parts":[[2024,8,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1136\/adc.83.1.82","article-title":"PACS (picture archiving and communication systems): Filmless radiology","volume":"83","author":"Strickland","year":"2000","journal-title":"Arch. Dis. Child."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Valente, F., Costa, C., and Silva, A. (2013). Dicoogle, a PACS featuring profiled content based image retrieval. PLoS ONE, 8.","DOI":"10.1371\/journal.pone.0061888"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"284","DOI":"10.1007\/s10278-015-9834-0","article-title":"Anatomy of an Extensible Open Source PACS","volume":"29","author":"Valente","year":"2016","journal-title":"J. Digit. Imaging"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1007\/s11548-011-0625-x","article-title":"A PACS archive architecture supported on cloud services","volume":"7","author":"Silva","year":"2012","journal-title":"Int. J. Comput. Assist. Radiol. 
Surg."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1007\/s11548-008-0269-7","article-title":"Indexing and retrieving DICOM data in disperse and unstructured archives","volume":"4","author":"Costa","year":"2009","journal-title":"Int. J. Comput. Assist. Radiol. Surg."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Sotomayor, C.G., Mendoza, M., Casta\u00f1eda, V., Far\u00edas, H., Molina, G., Pereira, G., H\u00e4rtel, S., Solar, M., and Araya, M. (2021). Content-Based Medical Image Retrieval and Intelligent Interactive Visual Browser for Medical Education, Research and Care. Diagnostics, 11.","DOI":"10.3390\/diagnostics11081470"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1007\/s10278-016-9903-z","article-title":"A multimodal search engine for medical imaging studies","volume":"30","author":"Pinho","year":"2017","journal-title":"J. Digit. Imaging"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/j.media.2017.09.007","article-title":"Large-scale retrieval for medical image analytics: A comprehensive review","volume":"43","author":"Li","year":"2018","journal-title":"Med. Image Anal."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_10","unstructured":"Bennani, S., Regnard, N.-E., Lassalle, L., Nguyen, T., Malandrin, C., Koulakian, H., Khafagy, P., Chassagnon, G., and Revel, M.-P. (2022, January 13\u201317). Evaluation of radiologists\u2019 performance compared to a deep learning algorithm for the detection of thoracic abnormalities on chest X-ray. Proceedings of the ECR 2022, Vienna, Austria."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"McLouth, J., Elstrott, S., Chaibi, Y., Quenet, S., Chang, P.D., Chow, D.S., and Soun, J.E. (2021). 
Validation of a Deep Learning Tool in the Detection of Intracranial Hemorrhage and Large Vessel Occlusion. Front. Neurol., 12.","DOI":"10.3389\/fneur.2021.656112"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1117\/12.360450","article-title":"Efficient access methods for content-based image retrieval with inverted files","volume":"Volume 3846","author":"Squire","year":"1999","journal-title":"Multimedia Storage and Archiving Systems IV"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1016\/j.neucom.2020.07.139","article-title":"Recent developments of content-based image retrieval (CBIR)","volume":"452","author":"Li","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.ijmedinf.2003.11.024","article-title":"A review of content-based image retrieval systems in medical applications\u2014Clinical benefits and future directions","volume":"73","author":"Michoux","year":"2004","journal-title":"Int. J. Med. Inform."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1988","DOI":"10.1016\/j.patrec.2008.03.001","article-title":"Automatic medical image annotation in ImageCLEF 2007: Overview, results, and discussion","volume":"29","author":"Deselaers","year":"2008","journal-title":"Pattern Recognit. Lett."},{"key":"ref_16","unstructured":"Rodr\u00edguez, A.F., and M\u00fcller, H. (2012, January 29). Ground truth generation in medical imaging: A crowdsourcing-based iterative approach. Proceedings of the ACM Multimedia 2012 Workshop on Crowdsourcing for Multimedia, Nara, Japan."},{"key":"ref_17","unstructured":"Jia, C., Yang, Y., Xia, Y., Chen, Y.-T., Parekh, Z., Pham, H., Le, Q., Sung, Y.-H., Li, Z., and Duerig, T. (2021, January 17\u201324). Scaling up visual and vision-language representation learning with noisy text supervision. 
Proceedings of the International Conference on Machine Learning 2021, Virtual."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"CIN-S14053","DOI":"10.4137\/CIN.S14053","article-title":"Medical image retrieval: A multimodal approach","volume":"13","author":"Cao","year":"2014","journal-title":"Cancer Inform."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"e584","DOI":"10.1016\/S2589-7500(22)00090-5","article-title":"Deep learning with weak annotation from diagnosis reports for detection of multiple head disorders: A prospective, multicentre study","volume":"4","author":"Guo","year":"2022","journal-title":"Lancet Digit. Health"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1002\/cpt.1736","article-title":"Big Data\u2014How to Realize the Promise","volume":"107","author":"Cave","year":"2020","journal-title":"Clin. Pharmacol. Ther."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"64","DOI":"10.28991\/esj-2019-01170","article-title":"Internet of medical things (IoMT): Acquiring and transforming data into hl7 fhir through 5g network slicing","volume":"3","author":"Mavrogiorgou","year":"2019","journal-title":"Emerg. Sci. J."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"339","DOI":"10.28991\/ESJ-2023-07-02-03","article-title":"Batch and Streaming Data Ingestion towards Creating Holistic Health Records","volume":"7","author":"Mavrogiorgou","year":"2023","journal-title":"Emerg. Sci. J."},{"key":"ref_23","unstructured":"(2024, July 31). Simplilearn. Available online: https:\/\/www.simplilearn.com\/data-ingestion-article."},{"key":"ref_24","unstructured":"(2024, July 31). Cognizant. 
Available online: https:\/\/www.cognizant.com\/us\/en\/glossary\/data-ingestion#:~:text=Data%20ingestion%20is%20the%20process,can%20be%20accessed%20and%20analyzed."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"3182","DOI":"10.1109\/JBHI.2020.3001518","article-title":"Disrupting Healthcare Silos: Addressing Data Volume, Velocity and Variety With a Cloud-Native Healthcare Data Ingestion Service","volume":"24","author":"Ranchal","year":"2020","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"152076","DOI":"10.1109\/ACCESS.2019.2947261","article-title":"DNA Motif Finding Method Without Protection Can Leak User Privacy","volume":"7","author":"Wu","year":"2019","journal-title":"IEEE Access"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1277","DOI":"10.1002\/spe.2812","article-title":"Flexible data anonymization using ARX\u2014Current status and challenges ahead","volume":"50","author":"Prasser","year":"2020","journal-title":"Softw. Pract. Exper."},{"key":"ref_28","unstructured":"(2024, July 31). HL7 FHIR. Available online: https:\/\/www.hl7.org\/fhir\/."},{"key":"ref_29","unstructured":"Orion Health (2024, July 31). An Intelligent Research Platform: Moving from Data Complexity to Research Excellence. White Paper. Available online: https:\/\/orionhealth.com\/global\/blog\/health-data-ingestion-in-an-on-demand-world\/."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"e230860","DOI":"10.1148\/radiol.230860","article-title":"Using AI to Improve Radiologist Performance in Detection of Abnormalities on Chest Radiographs","volume":"309","author":"Bennani","year":"2023","journal-title":"Radiology"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"68","DOI":"10.26641\/2307-0404.2023.3.288965","article-title":"Artificial intelligence effectivity in fracture detection","volume":"28","author":"Boginskis","year":"2023","journal-title":"Med. 
Perspekt."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"2885","DOI":"10.1007\/s00330-023-10380-1","article-title":"Commercially-available AI algorithm improves radiologists\u2019 sensitivity for wrist and hand fracture detection on X-ray, compared to a CT-based ground truth","volume":"34","author":"Jacques","year":"2023","journal-title":"Eur. Radiol."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"e209","DOI":"10.1016\/j.wneu.2021.02.134","article-title":"Assessment of an Artificial Intelligence Algorithm for Detection of Intracranial Hemorrhage","volume":"150","author":"Rava","year":"2021","journal-title":"World Neurosurg."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Abuduweili, A., Li, X., Shi, H., Xu, C.Z., and Dou, D. (2021, January 19\u201325). Adaptive consistency regularization for semi-supervised transfer learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2021, Virtual.","DOI":"10.1109\/CVPR46437.2021.00685"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"758","DOI":"10.3897\/jucs.112977","article-title":"Deep Learning techniques to process 3D chest CT","volume":"30","author":"Solar","year":"2024","journal-title":"J. Univers. Comput. Sci."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"e232561","DOI":"10.1148\/radiol.232561","article-title":"Accuracy of ChatGPT, Google Bard, and Microsoft Bing for Simplifying Radiology Reports","volume":"309","author":"Amin","year":"2023","journal-title":"Radiology"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"E913","DOI":"10.7717\/peerj-cs.913","article-title":"Negation and uncertainty detection in clinical texts written in Spanish: A deep learning-based approach","volume":"8","author":"Montenegro","year":"2022","journal-title":"PeerJ Comput. Sci."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Rojas, M., Dunstan, J., and Villena, F. (2022, January 14). 
Clinical flair: A pre-trained language model for Spanish clinical natural language processing. Proceedings of the 4th Clinical Natural Language Processing Workshop, Seattle, WA, USA.","DOI":"10.18653\/v1\/2022.clinicalnlp-1.9"},{"key":"ref_39","unstructured":"Faghri, F., Fleet, D.J., Kiros, J.R., and Fidler, S. (2017, January 4\u20137). Vse ++: Improving visual-semantic embeddings with hard negatives. Proceedings of the British Machine Vision Conference 2017, London, UK."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Molina, G., Mendoza, M., Loayza, I., N\u00fa\u00f1ez, C., Araya, M., Casta\u00f1eda, V., and Solar, M. (2022, January 20\u201321). A New Content-Based Image Retrieval System for SARS-CoV-2 Computer-Aided Diagnosis. Proceedings of the 2022 International Conference on Medical Imaging and Computer-Aided Diagnosis (MICAD 2022), Leicester, UK. Lecture Notes in Electrical Engineering.","DOI":"10.1007\/978-981-16-3880-0_33"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"542","DOI":"10.1016\/j.jmoldx.2018.07.009","article-title":"Considerations for genomic data privacy and security when working in the cloud","volume":"21","author":"Carter","year":"2019","journal-title":"J. Mol. Diagn."},{"key":"ref_42","first-page":"224","article-title":"User tests for assessing a medical image retrieval system: A pilot study. Studies in health technology and informatics","volume":"Volume 192","author":"Markonis","year":"2015","journal-title":"MEDINFO"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"774","DOI":"10.1016\/j.ijmedinf.2015.04.003","article-title":"User-oriented evaluation of a medical image retrieval system for radiologists","volume":"84","author":"Markonis","year":"2015","journal-title":"Int. J. Med. 
Inform."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"22589","DOI":"10.1007\/s11042-020-10173-4","article-title":"Content-based image retrieval system for HRCT lung images: Assisting radiologists in self-learning and diagnosis of Interstitial Lung Diseases","volume":"80","author":"Dash","year":"2021","journal-title":"Multimed. Tools Appl."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/15\/4985\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:28:21Z","timestamp":1760110101000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/15\/4985"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,1]]},"references-count":44,"journal-issue":{"issue":"15","published-online":{"date-parts":[[2024,8]]}},"alternative-id":["s24154985"],"URL":"https:\/\/doi.org\/10.3390\/s24154985","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,1]]}}}