{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T12:59:37Z","timestamp":1780491577991,"version":"3.54.1"},"reference-count":50,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2024,1,19]],"date-time":"2024-01-19T00:00:00Z","timestamp":1705622400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Shadow removal for document images is an essential task for digitized document applications. Recent shadow removal models have been trained on pairs of shadow images and shadow-free images. However, obtaining a large, diverse dataset for document shadow removal takes time and effort. Thus, only small real datasets are available. Graphic renderers have been used to synthesize shadows to create relatively large datasets. However, the limited number of unique documents and the limited lighting environments adversely affect the network performance. This paper presents a large-scale, diverse dataset called the Synthetic Document with Diverse Shadows (SynDocDS) dataset. The SynDocDS comprises rendered images with diverse shadows augmented by a physics-based illumination model, which can be utilized to obtain a more robust and high-performance deep shadow removal network. In this paper, we further propose a Dual Shadow Fusion Network (DSFN). Unlike natural images, document images often have constant background colors requiring a high understanding of global color features for training a deep shadow removal network. The DSFN has a high global color comprehension and understanding of shadow regions and merges shadow attentions and features efficiently. We conduct experiments on three publicly available datasets, the OSR, Kligler\u2019s, and Jung\u2019s datasets, to validate our proposed method\u2019s effectiveness. In comparison to training on existing synthetic datasets, our model training on the SynDocDS dataset achieves an enhancement in the PSNR and SSIM, increasing them from 23.00 dB to 25.70 dB and 0.959 to 0.971 on average. In addition, the experiments demonstrated that our DSFN clearly outperformed other networks across multiple metrics, including the PSNR, the SSIM, and its impact on OCR performance.<\/jats:p>","DOI":"10.3390\/s24020654","type":"journal-article","created":{"date-parts":[[2024,1,19]],"date-time":"2024-01-19T09:46:39Z","timestamp":1705657599000},"page":"654","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Synthetic Document Images with Diverse Shadows for Deep Shadow Removal Networks"],"prefix":"10.3390","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1803-2153","authenticated-orcid":false,"given":"Yuhi","family":"Matsuo","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Kanagawa, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7361-0027","authenticated-orcid":false,"given":"Yoshimitsu","family":"Aoki","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Kanagawa, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2024,1,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Bako, S., Darabi, S., Shechtman, E., Wang, J., Sunkavalli, K., and Sen, P. (2016, January 20\u201324). Removing Shadows from Images of Documents. Proceedings of the Asian Conference on Computer Vision (ACCV 2016), Taipei, Taiwan.","DOI":"10.1007\/978-3-319-54187-7_12"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Kligler, N., Katz, S., and Tal, A. (2018, January 18\u201323). Document Enhancement Using Visibility Detection. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00252"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Jung, S., Hasan, M.A., and Kim, C. (2018, January 2\u20136). Water-filling: An efficient algorithm for digitized document shadow removal. Proceedings of the Asian Conference on Computer Vision, Perth, Australia.","DOI":"10.1007\/978-3-030-20887-5_25"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Wang, B., and Chen, C.L.P. (2019, January 22\u201325). An Effective Background Estimation Method for Shadows Removal of Document Images. Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803486"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Wang, B., and Chen, C. (2020). Local Water-Filling Algorithm for Shadow Detection and Removal of Document Images. Sensors, 20.","DOI":"10.3390\/s20236929"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Lin, Y.H., Chen, W.C., and Chuang, Y.Y. (2020, January 13\u201319). Bedsr-net: A deep shadow removal network from a single document image. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01292"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wang, J., Li, X., and Yang, J. (2018, January 18\u201323). Stacked conditional generative adversarial networks for jointly learning shadow detection and shadow removal. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00192"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2795","DOI":"10.1109\/TPAMI.2019.2919616","article-title":"Direction-Aware Spatial Context Features for Shadow Detection and Removal","volume":"42","author":"Hu","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Cun, X., Pun, C.M., and Shi, C. (2020, January 7\u201312). Towards ghost-free shadow removal via dual hierarchical aggregation network and shadow matting GAN. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.","DOI":"10.1609\/aaai.v34i07.6695"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhang, L., He, Y., Zhang, Q., Liu, Z., Zhang, X., and Xiao, C. (2023, January 18\u201322). Document Image Shadow Removal Guided by Color-Aware Background. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00181"},{"key":"ref_11","unstructured":"Li, Z., Chen, X., Pun, C.M., and Cun, X. (October, January 30). High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Paris, France."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"4187","DOI":"10.1109\/TCSVT.2020.3047977","article-title":"Learning from Synthetic Shadows for Shadow Detection and Removal","volume":"31","author":"Inoue","year":"2021","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2956","DOI":"10.1109\/TPAMI.2012.214","article-title":"Paired Regions for Shadow Detection and Removal","volume":"35","author":"Guo","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1111\/j.1467-8659.2008.01155.x","article-title":"The Shadow Meets the Mask: Pyramid-Based Shadow Removal","volume":"27","author":"Shor","year":"2008","journal-title":"Comput. Graph. Forum"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"9088","DOI":"10.1109\/TPAMI.2021.3124934","article-title":"Physics-Based Shadow Image Decomposition for Shadow Removal","volume":"44","author":"Le","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Qu, L., Tian, J., He, S., Tang, Y., and Lau, R.W. (2017, January 21\u201326). Deshadownet: A multi-context embedding deep network for shadow removal. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.248"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Fu, L., Zhou, C., Guo, Q., Juefei-Xu, F., Yu, H., Feng, W., Liu, Y., and Wang, S. (2021, January 20\u201325). Auto-exposure fusion for single-image shadow removal. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01043"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Yu, F., Wang, D., Shelhamer, E., and Darrell, T. (2018, January 18\u201323). Deep layer aggregation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00255"},{"key":"ref_19","unstructured":"Hu, X., Jiang, Y., Fu, C.W., and Heng, P.A. (November, January 27). Mask-shadowgan: Learning to remove shadows from unpaired data. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1853","DOI":"10.1109\/TIP.2020.3048677","article-title":"Shadow removal by a lightness-guided network with training on unpaired data","volume":"30","author":"Liu","year":"2021","journal-title":"IEEE Trans. Image Process."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Liu, Z., Yin, H., Wu, X., Wu, Z., Mi, Y., and Wang, S. (2021, January 20\u201325). From Shadow Generation to Shadow Removal. Proceedings of the CVPR, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00489"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"336","DOI":"10.1007\/s11263-019-01228-7","article-title":"Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization","volume":"128","author":"Selvaraju","year":"2020","journal-title":"Int. J. Comput. Vis."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Sidorov, O. (2019, January 16\u201317). Conditional gans for multi-illuminant color constancy: Revolution or yet another approach?. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA.","DOI":"10.1109\/CVPRW.2019.00225"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2732407","article-title":"Learning to Remove Soft Shadows","volume":"34","author":"Gryka","year":"2015","journal-title":"ACM Trans. Graph."},{"key":"ref_25","unstructured":"Autodesk, I. (2023, October 19). Maya. Available online: https:\/\/autodesk.com\/maya."},{"key":"ref_26","unstructured":"Das, S., Sial, H.A., Ma, K., Baldrich, R., Vanrell, M., and Samaras, D. (2020, January 7\u201310). Intrinsic Decomposition of Document Images In-the-Wild. Proceedings of the 31st British Machine Vision Conference 2020, BMVC 2020, Manchester, UK."},{"key":"ref_27","unstructured":"Das, S., Ma, K., Shu, Z., Samaras, D., and Shilkrot, R. (November, January 27). Dewarpnet: Single-image document unwarping with stacked 3d and 2d regression networks. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Clausner, C., Antonacopoulos, A., and Pletschacher, S. (2017, January 9\u201315). ICDAR2017 Competition on Recognition of Documents with Complex Layouts\u2014RDCL2017. Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan.","DOI":"10.1109\/ICDAR.2017.229"},{"key":"ref_29","unstructured":"Blender Online Community (2018). Blender\u2014A 3D Modelling and Rendering Package, Stichting Blender Foundation. Blender Foundation."},{"key":"ref_30","unstructured":"Veach, E., and Guibas, L.J. (2023). Seminal Graphics Papers: Pushing the Boundaries, Association for Computing Machinery. [1st ed.]."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zharikov, I., Nikitin, P., Vasiliev, I., and Dokholyan, V. (2020, January 17\u201319). DDI-100. Proceedings of the 4th International Symposium on Computer Science and Intelligent Control, Newcastle upon Tyne, UK.","DOI":"10.1145\/3440084.3441192"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3197517.3201378","article-title":"Single-image svbrdf capture with a rendering-aware deep network","volume":"37","author":"Deschaintre","year":"2018","journal-title":"ACM Trans. Graph."},{"key":"ref_33","unstructured":"Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., and Su, H. (2023, October 19). ShapeNet: An Information-Rich 3D Model Repository, Available online: http:\/\/xxx.lanl.gov\/abs\/1512.03012."},{"key":"ref_34","unstructured":"Xiao, J., Ehinger, K.A., Oliva, A., and Torralba, A. (2012, January 16\u201321). Recognizing scene viewpoint using panoramic place representation. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3130800.3130891","article-title":"Learning to predict indoor illumination from a single image","volume":"36","author":"Gardner","year":"2017","journal-title":"ACM Trans. Graph."},{"key":"ref_36","first-page":"2","article-title":"Recovering intrinsic scene characteristics","volume":"2","author":"Barrow","year":"1978","journal-title":"Comput. Vis. Syst."},{"key":"ref_37","unstructured":"Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J.M., and Luo, P. (2021). SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Matsuo, Y., Akimoto, N., and Aoki, Y. (2022, January 16\u201319). Document Shadow Removal with Foreground Detection Learning From Fully Synthetic Images. Proceedings of the IEEE International Conference on Image Processing (ICIP), Bordeaux, France.","DOI":"10.1109\/ICIP46576.2022.9897217"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"ImageNet Large Scale Visual Recognition Challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_40","unstructured":"Song, Y., Zhou, Y., Qian, H., and Du, X. (2022). Rethinking Performance Gains in Image Dehazing Networks. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_42","unstructured":"Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., and Huang, T.S. (November, January 27). Free-form image inpainting with gated convolution. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., and Yang, M.H. (2022, January 18\u201324). Restormer: Efficient transformer for high-resolution image restoration. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00564"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Li, X., Wang, W., Hu, X., and Yang, J. (2019, January 18\u201324). Selective kernel networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR.2019.00060"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"2011","DOI":"10.1109\/TPAMI.2019.2913372","article-title":"Squeeze-and-Excitation Networks","volume":"42","author":"Hu","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Isola, P., Zhu, J., Zhou, T., and Efros, A.A. (2017, January 21\u201326). Image-to-image translation with conditional adversarial networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.632"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Johnson, J., Alahi, A., and Li, F.-F. (2016, January 11\u201314). Perceptual losses for real-time style transfer and super-resolution. Proceedings of the Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands. Proceedings, Part II 14.","DOI":"10.1007\/978-3-319-46475-6_43"},{"key":"ref_48","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Smith, R.W. (2009, January 26\u201329). Hybrid Page Layout Analysis via Tab-Stop Detection. Proceedings of the 10th International Conference on Document Analysis and Recognition, Barcelona, Spain.","DOI":"10.1109\/ICDAR.2009.257"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Gusfield, D. (1997). Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Cambridge University Press. EBL-Schweitzer.","DOI":"10.1017\/CBO9780511574931"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/2\/654\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T13:46:07Z","timestamp":1760103967000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/2\/654"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,19]]},"references-count":50,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,1]]}},"alternative-id":["s24020654"],"URL":"https:\/\/doi.org\/10.3390\/s24020654","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,1,19]]}}}