{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,20]],"date-time":"2025-10-20T10:23:25Z","timestamp":1760955805993,"version":"build-2065373602"},"reference-count":49,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2025,10,18]],"date-time":"2025-10-18T00:00:00Z","timestamp":1760745600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Ministry of Economic Development of the Russian Federation","award":["000000C313925P4G0002"],"award-info":[{"award-number":["000000C313925P4G0002"]}]},{"name":"Ivannikov Institute for System Programming of the Russian Academy of Sciences","award":["139-15-2025-011"],"award-info":[{"award-number":["139-15-2025-011"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Informatics"],"abstract":"<jats:p>Background: Significant progress has been made in the field of machine learning, enabling the development of methods for automatic interpretation of medical images that provide high-quality diagnostics. However, most of these methods require access to confidential data, making them difficult to apply under strict privacy requirements. Existing privacy-preserving approaches, such as federated learning and dataset distillation, have limitations related to data access, visual interpretability, etc. Methods: This study explores the use of generative models to create synthetic medical data that preserves the statistical properties of the original data while ensuring privacy. The research is carried out on the VinDr-Mammo dataset of digital mammography images. A conditional generative method using Latent Diffusion Models (LDMs) is proposed with conditioning on diagnostic labels and lesion information. 
Diagnostic utility and privacy robustness are assessed via cancer classification tasks and re-identification tasks using Siamese neural networks and membership inference. Results: The generated synthetic data achieved a Fr\u00e9chet Inception Distance (FID) of 5.8, preserving diagnostic features. A model trained solely on synthetic data achieved comparable performance to one trained on real data (ROC-AUC: 0.77 vs. 0.82). Visual assessments showed that synthetic images are indistinguishable from real ones. Privacy evaluations demonstrated a low re-identification risk (e.g., mAP@R = 0.0051 on the test set), confirming the effectiveness of the privacy-preserving approach. Conclusions: The study demonstrates that privacy-preserving generative models can produce synthetic medical images with sufficient quality for diagnostic tasks while significantly reducing the risk of patient re-identification. This approach enables secure data sharing and model training in privacy-sensitive domains such as medical imaging.<\/jats:p>","DOI":"10.3390\/informatics12040112","type":"journal-article","created":{"date-parts":[[2025,10,20]],"date-time":"2025-10-20T08:18:54Z","timestamp":1760948334000},"page":"112","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Privacy-Preserving Synthetic Mammograms: A Generative Model Approach to Privacy-Preserving Breast Imaging Datasets"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-7831-3891","authenticated-orcid":false,"given":"Damir","family":"Shodiev","sequence":"first","affiliation":[{"name":"Ivannikov Institute for System Programming of the Russian Academy of Science, Moscow 109004, Russia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8370-6911","authenticated-orcid":false,"given":"Egor","family":"Ushakov","sequence":"additional","affiliation":[{"name":"Ivannikov Institute for System Programming of the Russian Academy of Science, 
Moscow 109004, Russia"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-3561-3817","authenticated-orcid":false,"given":"Arsenii","family":"Litvinov","sequence":"additional","affiliation":[{"name":"Ivannikov Institute for System Programming of the Russian Academy of Science, Moscow 109004, Russia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1145-5118","authenticated-orcid":false,"given":"Yury","family":"Markin","sequence":"additional","affiliation":[{"name":"ISP RAS Research Center for Trusted Artificial Intelligence, Moscow 109004, Russia"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Prodan, M., Paraschiv, E., and Stanciu, A. (2023). Applying deep learning methods for mammography analysis and breast cancer detection. Appl. Sci., 13.","DOI":"10.3390\/app13074272"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1016\/j.cmpb.2016.10.007","article-title":"Classification of CT brain images based on deep learning networks","volume":"138","author":"Gao","year":"2017","journal-title":"Comput. Methods Programs Biomed."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"108105","DOI":"10.1016\/j.compeleceng.2022.108105","article-title":"A deep learning approach for brain tumor classification using MRI images","volume":"101","author":"Aamir","year":"2022","journal-title":"Comput. Electr. Eng."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Ushakov, E., Naumov, A., Fomberg, V., Vishnyakova, P., Asaturova, A., Badlaeva, A., Tregubova, A., Karpulevich, E., Sukhikh, G., and Fatkhudinov, T. (2023). EndoNet: A Model for the Automatic Calculation of H-Score on Histological Slides. Informatics, 10.","DOI":"10.3390\/informatics10040090"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Ibragimov, A., Senotrusova, S., Markova, K., Karpulevich, E., Ivanov, A., Tyshchuk, E., Grebenkina, P., Stepanova, O., Sirotskaya, A., and Kovaleva, A. 
(2023). Deep Semantic Segmentation of Angiogenesis Images. Int. J. Mol. Sci., 24.","DOI":"10.3390\/ijms24021102"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1342","DOI":"10.1038\/s41591-018-0107-6","article-title":"Clinically applicable deep learning for diagnosis and referral in retinal disease","volume":"24","author":"Ledsam","year":"2018","journal-title":"Nat. Med."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"e314","DOI":"10.1016\/S2589-7500(20)30085-6","article-title":"Multiclass semantic segmentation and quantification of traumatic brain injury lesions on head CT using deep learning: An algorithm development and multicentre validation study","volume":"2","author":"Monteiro","year":"2020","journal-title":"Lancet Digit. Health"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1241","DOI":"10.1016\/j.drudis.2018.01.039","article-title":"The rise of deep learning in drug discovery","volume":"23","author":"Chen","year":"2018","journal-title":"Drug Discov. Today"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Xing, X., Papanastasiou, G., D\u00ecaz, O., Alberich, L.C., Osuala, R., Nan, Y., Lekadir, K., and Yang, G. (2025). Generating Synthetic Data in Cancer Research. Trustworthy AI in Cancer Imaging Research, Springer.","DOI":"10.1007\/978-3-031-89963-8_4"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1038\/s41591-018-0316-z","article-title":"A guide to deep learning in healthcare","volume":"25","author":"Esteva","year":"2019","journal-title":"Nat. Med."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1001\/jama.2015.292","article-title":"Sharing clinical trial data: Maximizing benefits, minimizing risk","volume":"313","author":"Lo","year":"2015","journal-title":"J. 
Am. Med. Assoc."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1144","DOI":"10.1186\/1471-2458-14-1144","article-title":"A systematic review of barriers to data sharing in public health","volume":"14","author":"Paul","year":"2014","journal-title":"BMC Public Health"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"575","DOI":"10.1007\/s00439-018-1919-7","article-title":"International data-sharing norms: From the OECD to the general data protection regulation (GDPR)","volume":"137","author":"Phillips","year":"2018","journal-title":"Hum. Genet."},{"key":"ref_15","unstructured":"Fan, L. (2019, January 11). Differential privacy for image publication. Proceedings of the Theory and Practice of Differential Privacy (TPDP) Workshop 2019, London, UK."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Mongkolluksamee, S., and Khonthapagdee, S. (2025\u20131, January 26). Privacy-Preserving Breast Density Classification in Mammograms Using Fuzzy C-Means and Homomorphic Encryption. Proceedings of the 2025 17th International Conference on Knowledge and Smart Technology (KST), Bangkok, Thailand.","DOI":"10.1109\/KST65016.2025.11003367"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"14851","DOI":"10.1038\/s41598-022-19045-3","article-title":"Deep learning-based patient re-identification is able to exploit the biometric nature of medical chest X-ray data","volume":"12","author":"Syben","year":"2022","journal-title":"Sci. Rep."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Hadsell, R., Chopra, S., and LeCun, Y. (2006, January 17\u201322). Dimensionality reduction by learning an invariant mapping. Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition\u2014Volume 2 (CVPR\u201906), New York, NY, USA.","DOI":"10.1109\/CVPR.2006.100"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Skandarani, Y., Jodoin, P.-M., and Lalande, A. (2023). 
GANs for medical image synthesis: An empirical study. J. Imaging, 9.","DOI":"10.3390\/jimaging9030069"},{"key":"ref_20","first-page":"6840","article-title":"Denoising diffusion probabilistic models","volume":"33","author":"Ho","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Angel, R.M.-D., Sam-Millan, K., Vilanova, J.C., and Mart\u00ed, R. (2024). Mame: Mammographic synthetic image generation with diffusion models. Sensors, 24.","DOI":"10.3390\/s24072076"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Sun, Y., Chen, Z., Zheng, H., Deng, W., Liu, J., Min, W., Elazab, A., Wan, X., Wang, C., and Ge, R. (2025). BS-LDM: Effective Bone Suppression in High-Resolution Chest X-Ray Images with Conditional Latent Diffusion Models. IEEE J. Biomed. Health Inform.","DOI":"10.1109\/JBHI.2025.3588138"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhu, L., Xue, Z., Jin, Z., Liu, X., He, J., Liu, Z., and Yu, L. (2023). Make-A-Volume: Leveraging Latent Diffusion Models for Cross-Modality 3D Brain MRI Synthesis. arXiv.","DOI":"10.1007\/978-3-031-43999-5_56"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"101979","DOI":"10.1016\/j.media.2021.101979","article-title":"Applying deep learning in digital breast tomosynthesis for breast cancer screening: Opportunities and challenges","volume":"70","author":"Shi","year":"2021","journal-title":"Med. Image Anal."},{"key":"ref_25","unstructured":"Ho, J., and Salimans, T. (2022). Classifier-Free Diffusion Guidance. arXiv."},{"key":"ref_26","unstructured":"Podell, D., English, Z., Lacey, K., Blattmann, A., Dockhorn, T., M\u00fcller, J., Penna, J., and Rombach, R. (2023). SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis. 
arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1038\/s41746-021-00507-3","article-title":"Overcoming barriers to data sharing with medical image generation: A comprehensive evaluation","volume":"4","author":"Sch","year":"2021","journal-title":"npj Digit. Med."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhang, L., Rao, A., and Agrawala, M. (2023). ControlNet: Adding Conditional Control to Text-to-Image Diffusion Models. arXiv.","DOI":"10.1109\/ICCV51070.2023.00355"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Mou, C., Wang, X., Xie, L., Wu, Y., Zhang, J., Qi, Z., Shan, Y., and Qie, X. (2023). T2I-Adapter: Learning Adapters for More Controllable Text-to-Image Diffusion Models. arXiv.","DOI":"10.1609\/aaai.v38i5.28226"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Nguyen, H.T., Nguyen, H.Q., Pham, H.H., Lam, K., Le, L.T., Dao, M., and Vu, V. (2022). VinDr-Mammo: A large-scale benchmark dataset for computer-aided diagnosis in full-field digital mammography. medRxiv.","DOI":"10.1101\/2022.03.07.22272009"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1953","DOI":"10.1038\/s41598-022-05539-7","article-title":"Federated learning and differential privacy for medical image analysis","volume":"12","author":"Adnan","year":"2022","journal-title":"Sci. Rep."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"106775","DOI":"10.1016\/j.knosys.2021.106775","article-title":"A survey on federated learning","volume":"216","author":"Zhang","year":"2021","journal-title":"Knowl.-Based Syst."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"845","DOI":"10.56553\/popets-2025-0044","article-title":"Towards privacy-preserving and fairness-aware federated learning framework","volume":"2025","author":"Bendoukha","year":"2025","journal-title":"Proc. Priv. Enhancing Technol."},{"key":"ref_34","unstructured":"(2025, August 31). Seitzer, Pytorch-fid: FID Score for PyTorch. 
Version 0.3.0. Available online: https:\/\/github.com\/mseitzer\/pytorch-fid."},{"key":"ref_35","unstructured":"Tan, M., and Le, Q. (2019). Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Woodland, M., Castelo, A., Al Taie, M., Silva, J.A.M., Eltaher, M., Mohn, F., Shieh, A., Kundu, S., Yung, J.P., and Patel, A.B. (2024). Feature extraction for generative medical imaging evaluation: New evidence against an evolving trend. International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer.","DOI":"10.1007\/978-3-031-72390-2_9"},{"key":"ref_37","first-page":"92","article-title":"Multi-institutional Deep Learning Modeling Without Sharing Patient Data: A Feasibility Study on Brain Tumor Segmentation","volume":"Volume 11383","author":"Sheller","year":"2019","journal-title":"Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"107330","DOI":"10.1016\/j.asoc.2021.107330","article-title":"Federated learning for COVID-19 screening from Chest X-ray images","volume":"106","author":"Feki","year":"2021","journal-title":"Appl. Soft Comput."},{"key":"ref_39","first-page":"181","article-title":"Federated Learning for Breast Density Classification: A Real-World Collaborative Setting","volume":"Volume 12444","author":"Roth","year":"2020","journal-title":"Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data (DART\u2013MIA 2020)"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1038\/s42256-023-00652-2","article-title":"Federated benchmarking of medical artificial intelligence with MedPerf","volume":"5","author":"Karargyris","year":"2023","journal-title":"Nat. Mach. 
Intell."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"102999","DOI":"10.1016\/j.media.2023.102999","article-title":"Diffusion models in medical imaging: A comprehensive survey","volume":"91","author":"Wang","year":"2023","journal-title":"Med. Image Anal."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"7303","DOI":"10.1038\/s41598-023-34341-2","article-title":"Denoising diffusion probabilistic models for 3D medical image generation","volume":"13","author":"Khader","year":"2023","journal-title":"Sci. Rep."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"107189","DOI":"10.1016\/j.cmpb.2022.107189","article-title":"Compressed gastric image generation based on soft-label dataset distillation for medical data sharing","volume":"227","author":"Li","year":"2022","journal-title":"Comput. Methods Programs Biomed."},{"key":"ref_44","unstructured":"Li, G., Togo, R., Ogawa, T., and Haseyama, M. (2022). Dataset distillation for medical dataset sharing. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Rombach, R., Blattmann, A., Lorenz, D., Esser, P., and Ommer, B. (2022, January 18\u201324). High-resolution image synthesis with latent diffusion models. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Ibragimov, A., Senotrusova, S., Litvinov, A., Ushakov, E., Karpulevich, E., and Markin, Y. (2024, January 2\u20134). MamT4: Multi-view attention networks for mammography cancer classification. 
Proceedings of the 2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC), Osaka, Japan.","DOI":"10.1109\/COMPSAC61105.2024.00313"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"12495","DOI":"10.1038\/s41598-019-48995-4","article-title":"Deep learning to improve breast cancer detection on screening mammography","volume":"9","author":"Shen","year":"2019","journal-title":"Sci. Rep."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"2925","DOI":"10.1038\/s41598-023-29521-z","article-title":"Unsupervised anomaly detection with generative adversarial networks in mammography","volume":"13","author":"Park","year":"2023","journal-title":"Sci. Rep."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"106255","DOI":"10.1016\/j.bspc.2024.106255","article-title":"Gan-based data augmentation to improve breast ultrasound and mammography mass classification","volume":"94","author":"Lakshminarayanan","year":"2024","journal-title":"Biomed. Signal Process. Control."}],"container-title":["Informatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2227-9709\/12\/4\/112\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,20]],"date-time":"2025-10-20T09:01:59Z","timestamp":1760950919000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2227-9709\/12\/4\/112"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,18]]},"references-count":49,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["informatics12040112"],"URL":"https:\/\/doi.org\/10.3390\/informatics12040112","relation":{},"ISSN":["2227-9709"],"issn-type":[{"value":"2227-9709","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,18]]}}}