{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T20:49:58Z","timestamp":1777409398853,"version":"3.51.4"},"reference-count":34,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2025,3,3]],"date-time":"2025-03-03T00:00:00Z","timestamp":1740960000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"State Grid Sichuan Electric Power Company Science and Technology Program","award":["521997230014"],"award-info":[{"award-number":["521997230014"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Existing substation equipment image data augmentation models face challenges such as high dataset size requirements, difficult training processes, and insufficient condition control. This paper proposes a transformer equipment image data augmentation method based on a Stable Diffusion model. The proposed method incorporates the Low-Rank Adaptation (LoRA) concept to fine-tune the pre-trained Stable Diffusion model weights, significantly reducing training requirements while effectively integrating the essential features of transformer equipment image data. To minimize interference from complex backgrounds, the Segment Anything Model (SAM) is employed for preprocessing, thereby enhancing the quality of generated image data. The experimental results demonstrate significant improvements in evaluation metrics using the proposed method. Specifically, when implemented with the YOLOv7 model, the accuracy metric shows a 16.4 percentage point improvement compared to \u201cStandard image transformations\u201d (e.g., rotation and scaling) and a 2.3 percentage point improvement over DA-Fusion. Comparable improvements are observed in the SSD and Faster-RCNN object detection models. Notably, the model demonstrates advantages in reducing false-negative rates (higher Recall). The proposed approach successfully addresses key data augmentation challenges in transformer fault detection applications.<\/jats:p>","DOI":"10.3390\/info16030197","type":"journal-article","created":{"date-parts":[[2025,3,3]],"date-time":"2025-03-03T07:37:17Z","timestamp":1740987437000},"page":"197","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Stable Diffusion-Driven Conditional Image Augmentation for Transformer Fault Detection"],"prefix":"10.3390","volume":"16","author":[{"given":"Wenlong","family":"Liao","sequence":"first","affiliation":[{"name":"Electric Power Research Institute, Sichuan Electric Power Corporation, Chengdu 610072, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yiping","family":"Jiang","sequence":"additional","affiliation":[{"name":"Sichuan Electric Power Corporation, Chengdu 610094, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rui","family":"Liu","sequence":"additional","affiliation":[{"name":"Electric Power Research Institute, Sichuan Electric Power Corporation, Chengdu 610072, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yun","family":"Feng","sequence":"additional","affiliation":[{"name":"Electric Power Research Institute, Sichuan Electric Power Corporation, Chengdu 610072, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yu","family":"Zhang","sequence":"additional","affiliation":[{"name":"Electric Power Research Institute, Sichuan Electric Power Corporation, Chengdu 610072, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jin","family":"Hou","sequence":"additional","affiliation":[{"name":"School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jun","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,3,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1145\/3422622","article-title":"Generative adversarial networks","volume":"63","author":"Goodfellow","year":"2020","journal-title":"Commun. ACM"},{"key":"ref_2","first-page":"6840","article-title":"Denoising diffusion probabilistic models","volume":"33","author":"Ho","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_3","first-page":"150","article-title":"A Survey of Image Data Augmentation Based on Deep Learning","volume":"51","author":"Sun","year":"2024","journal-title":"Comput. Sci."},{"key":"ref_4","unstructured":"Radford, A., Metz, L., and Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv."},{"key":"ref_5","unstructured":"Antoniou, A., Storkey, A., and Edwards, H. (2017). Data augmentation generative adversarial networks. arXiv."},{"key":"ref_6","unstructured":"Mirza, M., and Osindero, S. (2014). Conditional generative adversarial nets. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Yang, H., and Zhou, Y. (2021, January 10\u201315). Ida-gan: A novel imbalanced data augmentation gan. Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy.","DOI":"10.1109\/ICPR48806.2021.9411996"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Hong, M., Choi, J., and Kim, G. (2021, January 20\u201325). Stylemix: Separating content and style for enhanced data augmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01462"},{"key":"ref_9","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2013MICCAI 2015: 18th International Conference, Munich, Germany. Proceedings Part III 18."},{"key":"ref_10","unstructured":"Song, J., Meng, C., and Ermon, S. (2020). Denoising diffusion implicit models. arXiv."},{"key":"ref_11","unstructured":"Ho, J., and Salimans, T. (2022). Classifier-free diffusion guidance. arXiv."},{"key":"ref_12","first-page":"8780","article-title":"Diffusion models beat gans on image synthesis","volume":"34","author":"Dhariwal","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_13","unstructured":"Trabucco, B., Doherty, K., Gurinas, M.A., and Salakhutdinov, R. (2023). Effective Data Augmentation with Diffusion Models. arXiv."},{"key":"ref_14","unstructured":"Gal, R., Alaluf, Y., Atzmon, Y., Patashnik, O., Bermano, A.H., Chechik, G., and Cohen-Or, D. (2022). An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., and Le, Q.V. (2019, January 15\u201320). Autoaugment: Learning augmentation strategies from data. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00020"},{"key":"ref_16","unstructured":"Lim, S., Kim, I., Kim, T., Kim, C., and Kim, S. (2019, January 8\u201314). Fast AutoAugment. Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Hataya, R., Zdenek, J., Yoshizoe, K., and Nakayama, H. (2020, January 23\u201328). Faster autoaugment: Learning augmentation strategies using backpropagation. Proceedings of the Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK. Proceedings, Part XXV 16.","DOI":"10.1007\/978-3-030-58595-2_1"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Hataya, R., Zdenek, J., Yoshizoe, K., and Nakayama, H. (2022, January 3\u20138). Meta approach to data augmentation optimization. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV51458.2022.00359"},{"key":"ref_19","first-page":"187","article-title":"Cross-Domain Adaptive Object Detection in Foggy Weather Based on CNN Image Augmentation","volume":"59","author":"Guo","year":"2023","journal-title":"Comput. Eng. Appl."},{"key":"ref_20","first-page":"1240","article-title":"Low-Light Image Augmentation Combining Attention Mechanism and Dual-Branch Residual Network","volume":"43","author":"Zu","year":"2023","journal-title":"Comput. Appl."},{"key":"ref_21","unstructured":"Hu, E.J., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., and Chen, W. (2021). LoRA: Low-Rank Adaptation of Large Language Models. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., and Lo, W.-Y. (2023, January 1\u20136). Segment anything. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Rombach, R., Blattmann, A., Lorenz, D., Esser, P., and Ommer, B. (2022, January 18\u201324). High-resolution image synthesis with latent diffusion models. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Lin, H., Cheng, X., Wu, X., and Shen, D. (2022, January 18\u201322). Cat: Cross attention in vision transformer. Proceedings of the 2022 IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan.","DOI":"10.1109\/ICME52920.2022.9859720"},{"key":"ref_25","first-page":"25278","article-title":"Laion-5b: An open large-scale dataset for training next generation image-text models","volume":"35","author":"Schuhmann","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_26","unstructured":"Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., and Clark, J. (2021, January 18\u201324). Learning transferable visual models from natural language supervision. Proceedings of the International Conference on Machine Learning, PMLR, Virtual."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The Pascal Visual Object Classes (VOC) Challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S.J., Hays, J., Perona, P., Ramanan, D., Dollar, P., and Zitnick, C.L. (2014). Microsoft COCO: Common Objects in Context, Springer. CoRR; abs\/1405.0312.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_29","first-page":"5998","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Strudel, R., Garcia, R., Laptev, I., and Schmid, C. (2021, January 10\u201317). Segmenter: Transformer for semantic segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00717"},{"key":"ref_31","unstructured":"Tzutalin (2024, December 01). LabelImg. Git Code 2015. Available online: https:\/\/github.com\/tzutalin\/labelImg."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A.C. (2016, January 11\u201314). Ssd: Single shot multibox detector. Proceedings of the Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands. Proceedings, Part I 14.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_33","first-page":"9","article-title":"Faster r-cnn: Towards real-time object detection with region proposal networks","volume":"28","author":"Ren","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Bochkovskiy, A., and Liao, H.Y.M. (2023, January 17\u201324). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00721"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/3\/197\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T16:46:16Z","timestamp":1760028376000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/3\/197"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,3]]},"references-count":34,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2025,3]]}},"alternative-id":["info16030197"],"URL":"https:\/\/doi.org\/10.3390\/info16030197","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,3]]}}}