{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,18]],"date-time":"2026-04-18T14:12:55Z","timestamp":1776521575625,"version":"3.51.2"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2024,8,19]],"date-time":"2024-08-19T00:00:00Z","timestamp":1724025600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,8,19]],"date-time":"2024-08-19T00:00:00Z","timestamp":1724025600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100008252","name":"Universit\u00e0 degli Studi di Udine","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100008252","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["IJDAR"],"published-print":{"date-parts":[[2025,6]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Data availability is a big concern in the field of document analysis, especially when working on tasks that require a high degree of precision when it comes to the definition of the ground truths on which to train deep learning models. A notable example is represented by the task of document layout analysis in handwritten documents, which requires pixel-precise segmentation maps to highlight the different layout components of each document page. These segmentation maps are typically very time-consuming and require a high degree of domain knowledge to be defined, as they are intrinsically characterized by the content of the text. For this reason in the present work, we explore the effects of different initialization strategies for deep learning models employed for this type of task by relying on both in-domain and cross-domain datasets for their pre-training. 
To test the employed models we use two publicly available datasets with heterogeneous characteristics both regarding their structure as well as the languages of the contained documents. We show how a combination of cross-domain and in-domain transfer learning approaches leads to the best overall performance of the models, as well as speeding up their convergence process.<\/jats:p>","DOI":"10.1007\/s10032-024-00497-4","type":"journal-article","created":{"date-parts":[[2024,8,19]],"date-time":"2024-08-19T14:02:36Z","timestamp":1724076156000},"page":"161-175","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["In-domain versus out-of-domain transfer learning for document layout analysis"],"prefix":"10.1007","volume":"28","author":[{"given":"Axel","family":"De Nardin","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Silvia","family":"Zottin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Claudio","family":"Piciarelli","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gian Luca","family":"Foresti","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Emanuela","family":"Colombi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,8,19]]},"reference":[{"issue":"8","key":"497_CR1","doi-asserted-by":"publisher","first-page":"5517","DOI":"10.1007\/s10462-020-09827-4","volume":"53","author":"SR Narang","year":"2020","unstructured":"Narang, S.R., Jindal, M.K., Kumar, M.: Ancient text recognition: a review. Artif. Intell. Rev. 53(8), 5517\u20135558 (2020). https:\/\/doi.org\/10.1007\/s10462-020-09827-4","journal-title":"Artif. Intell. 
Rev."},{"key":"497_CR2","doi-asserted-by":"publisher","unstructured":"Fischer, A., Wuthrich, M., Liwicki, M., Frinken, V., Bunke, H., Viehhauser, G., Stolz, M.: Automatic transcription of handwritten medieval documents. In: 2009 15th International Conference on Virtual Systems and Multimedia, pp. 137\u2013142 (2009). https:\/\/doi.org\/10.1109\/VSMM.2009.26","DOI":"10.1109\/VSMM.2009.26"},{"key":"497_CR3","doi-asserted-by":"publisher","unstructured":"Ni, K., Callier, P., Hatch, B.: Writer identification in noisy handwritten documents. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1177\u20131186 (2017). https:\/\/doi.org\/10.1109\/WACV.2017.136","DOI":"10.1109\/WACV.2017.136"},{"key":"497_CR4","doi-asserted-by":"publisher","unstructured":"Kiessling, B.: A modular region and text line layout analysis system. In: 2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 313\u2013318 (2020). https:\/\/doi.org\/10.1109\/ICFHR2020.2020.00064","DOI":"10.1109\/ICFHR2020.2020.00064"},{"key":"497_CR5","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1007\/978-981-16-1092-9_26","volume-title":"Computer Vision and Image Processing","author":"A Minj","year":"2021","unstructured":"Minj, A., Garai, A., Mandal, S.: Text line segmentation: a FCN based approach. In: Singh, S.K., Roy, P., Raman, B., Nagabhushan, P. (eds.) Computer Vision and Image Processing, pp. 305\u2013316. Springer, Singapore (2021)"},{"issue":"4","key":"497_CR6","doi-asserted-by":"publisher","first-page":"299","DOI":"10.1007\/s10032-021-00370-8","volume":"24","author":"A Dutta","year":"2021","unstructured":"Dutta, A., Garai, A., Biswas, S., Das, A.K.: Segmentation of text lines using multi-scale CNN from warped printed and handwritten document images. Int. J. Doc. Anal. Recognit. 24(4), 299\u2013313 (2021). https:\/\/doi.org\/10.1007\/s10032-021-00370-8","journal-title":"Int. J. Doc. Anal. 
Recognit."},{"key":"497_CR7","doi-asserted-by":"publisher","unstructured":"Zhang, C., Ibrayim, M., Hamdulla, A.: A methodological study of document layout analysis. In: 2022 International Conference on Virtual Reality, Human-Computer Interaction and Artificial Intelligence (VRHCIAI), pp. 12\u201317 (2022). https:\/\/doi.org\/10.1109\/VRHCIAI57205.2022.00009","DOI":"10.1109\/VRHCIAI57205.2022.00009"},{"key":"497_CR8","doi-asserted-by":"publisher","unstructured":"Garz, A., Seuret, M., Simistira, F., Fischer, A., Ingold, R.: Creating ground truth for historical manuscripts with document graphs and scribbling interaction. In: 2016 12th IAPR Workshop on Document Analysis Systems (DAS), pp. 126\u2013131 (2016). https:\/\/doi.org\/10.1109\/DAS.2016.29","DOI":"10.1109\/DAS.2016.29"},{"key":"497_CR9","doi-asserted-by":"publisher","unstructured":"Simistira, F., Seuret, M., Eichenberger, N., Garz, A., Liwicki, M., Ingold, R.: DIVA-HisDB: A precisely annotated large dataset of challenging medieval manuscripts. In: 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 471\u2013476 (2016). https:\/\/doi.org\/10.1109\/ICFHR.2016.0093","DOI":"10.1109\/ICFHR.2016.0093"},{"key":"497_CR10","doi-asserted-by":"publisher","unstructured":"Bukhari, S.S., Breuel, T.M., Asi, A., El-Sana, J.: Layout analysis for Arabic historical document images using machine learning. In: 2012 International Conference on Frontiers in Handwriting Recognition, pp. 639\u2013644 (2012). https:\/\/doi.org\/10.1109\/ICFHR.2012.227","DOI":"10.1109\/ICFHR.2012.227"},{"key":"497_CR11","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-023-09356-5","author":"S Zottin","year":"2024","unstructured":"Zottin, S., De Nardin, A., Colombi, E., Piciarelli, C., Pavan, F., Foresti, G.L.: U-DIADS-Bib: a full and few-shot pixel-precise dataset for document layout analysis of ancient manuscripts. Neural Comput. Appl. (2024). 
https:\/\/doi.org\/10.1007\/s00521-023-09356-5","journal-title":"Neural Comput. Appl."},{"issue":"10","key":"497_CR12","doi-asserted-by":"publisher","first-page":"2350052","DOI":"10.1142\/S0129065723500521","volume":"33","author":"A De Nardin","year":"2023","unstructured":"De Nardin, A., Zottin, S., Piciarelli, C., Colombi, E., Foresti, G.L.: Few-shot pixel-precise document layout segmentation via dynamic instance generation and local thresholding. Int. J. Neural Syst. 33(10), 2350052 (2023). https:\/\/doi.org\/10.1142\/S0129065723500521","journal-title":"Int. J. Neural Syst."},{"key":"497_CR13","doi-asserted-by":"publisher","unstructured":"Droby, A., Barakat, B.K., Madi, B., Alaasam, R., El-Sana, J.: Unsupervised deep learning for handwritten page segmentation. In: 2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR), Dortmund, Germany, pp. 240\u2013245 (2020). https:\/\/doi.org\/10.1109\/ICFHR2020.2020.00052","DOI":"10.1109\/ICFHR2020.2020.00052"},{"key":"497_CR14","doi-asserted-by":"publisher","unstructured":"De\u00a0Nardin, A., Zottin, S., Piciarelli, C., Colombi, E., Foresti, G.L.: A one-shot learning approach to document layout segmentation of ancient Arabic manuscripts. In: 2024 IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 8112\u20138121 (2024). https:\/\/doi.org\/10.1109\/WACV57701.2024.00794","DOI":"10.1109\/WACV57701.2024.00794"},{"issue":"1","key":"497_CR15","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1007\/s10032-021-00362-8","volume":"24","author":"S Tarride","year":"2021","unstructured":"Tarride, S., Lemaitre, A., Co\u00fcasnon, B., Tardivel, S.: Combination of deep neural networks and logical rules for record segmentation in historical handwritten registers using few examples. Int J Doc. Anal. Recognit. 24(1), 77\u201396 (2021). https:\/\/doi.org\/10.1007\/s10032-021-00362-8","journal-title":"Int J Doc. Anal. 
Recognit."},{"key":"497_CR16","doi-asserted-by":"publisher","unstructured":"De\u00a0Nardin, A., Zottin, S., Paier, M., Foresti, G.L., Colombi, E., Piciarelli, C.: Efficient few-shot learning for pixel-precise handwritten document layout analysis. In: Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, Hawaii, pp. 3680\u20133688 (2023). https:\/\/doi.org\/10.1109\/WACV56688.2023.00367","DOI":"10.1109\/WACV56688.2023.00367"},{"key":"497_CR17","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2019.103853","volume":"93","author":"X Li","year":"2020","unstructured":"Li, X., Grandvalet, Y., Davoine, F., Cheng, J., Cui, Y., Zhang, H., Belongie, S., Tsai, Y.-H., Yang, M.-H.: Transfer learning in computer vision tasks: remember where you come from. Image Vis. Comput. 93, 103853 (2020). https:\/\/doi.org\/10.1016\/j.imavis.2019.103853","journal-title":"Image Vis. Comput."},{"issue":"1","key":"497_CR18","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1109\/JPROC.2020.3004555","volume":"109","author":"F Zhuang","year":"2021","unstructured":"Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H., He, Q.: A comprehensive survey on transfer learning. Proc. IEEE 109(1), 43\u201376 (2021). https:\/\/doi.org\/10.1109\/JPROC.2020.3004555","journal-title":"Proc. IEEE"},{"issue":"5","key":"497_CR19","doi-asserted-by":"publisher","first-page":"1299","DOI":"10.1109\/TMI.2016.2535302","volume":"35","author":"N Tajbakhsh","year":"2016","unstructured":"Tajbakhsh, N., Shin, J.Y., Gurudu, S.R., Hurst, R.T., Kendall, C.B., Gotway, M.B., Liang, J.: Convolutional neural networks for medical image analysis: Full training or fine tuning? IEEE Trans. Med. Imaging 35(5), 1299\u20131312 (2016). https:\/\/doi.org\/10.1109\/TMI.2016.2535302","journal-title":"IEEE Trans. Med. 
Imaging"},{"issue":"1","key":"497_CR20","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1016\/j.bbe.2021.11.004","volume":"42","author":"P Kora","year":"2022","unstructured":"Kora, P., Ooi, C.P., Faust, O., Raghavendra, U., Gudigar, A., Chan, W.Y., Meenakshi, K., Swaraja, K., Plawiak, P., Rajendra Acharya, U.: Transfer learning techniques for medical image analysis: a review. Biocybern. Biomed. Eng. 42(1), 79\u2013107 (2022). https:\/\/doi.org\/10.1016\/j.bbe.2021.11.004","journal-title":"Biocybern. Biomed. Eng."},{"key":"497_CR21","doi-asserted-by":"publisher","unstructured":"Boyd, A., Czajka, A., Bowyer, K.: Deep learning-based feature extraction in iris recognition: use existing models, fine-tune or train from scratch? In: 2019 IEEE 10th International Conference on Biometrics Theory, Applications and Systems (BTAS), pp. 1\u20139 (2019). https:\/\/doi.org\/10.1109\/BTAS46853.2019.9185978","DOI":"10.1109\/BTAS46853.2019.9185978"},{"key":"497_CR22","doi-asserted-by":"publisher","unstructured":"Abdalla, A., Cen, H., Wan, L., Rashid, R., Weng, H., Zhou, W., He, Y.: Fine-tuning convolutional neural network with transfer learning for semantic segmentation of ground-level oilseed rape images in a field with high weed pressure. Comput. Electron. Agric. 167, 105091 (2019). https:\/\/doi.org\/10.1016\/j.compag.2019.105091","DOI":"10.1016\/j.compag.2019.105091"},{"key":"497_CR23","doi-asserted-by":"publisher","unstructured":"Tercan, H., Guajardo, A., Meisen, T.: Industrial transfer learning: boosting machine learning in production. In: 2019 IEEE 17th International Conference on Industrial Informatics (INDIN), vol. 1, pp. 274\u2013279 (2019). 
https:\/\/doi.org\/10.1109\/INDIN41052.2019.8972099","DOI":"10.1109\/INDIN41052.2019.8972099"},{"issue":"4","key":"497_CR24","doi-asserted-by":"publisher","first-page":"1278","DOI":"10.3390\/s21041278","volume":"21","author":"J Hua","year":"2021","unstructured":"Hua, J., Zeng, L., Li, G., Ju, Z.: Learning for a robot: deep reinforcement learning, imitation learning, transfer learning. Sensors 21(4), 1278 (2021). https:\/\/doi.org\/10.3390\/s21041278","journal-title":"Sensors"},{"key":"497_CR25","doi-asserted-by":"publisher","unstructured":"Studer, L., Alberti, M., Pondenkandath, V., Goktepe, P., Kolonko, T., Fischer, A., Liwicki, M., Ingold, R.: A comprehensive study of imagenet pre-training for historical document image analysis. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 720\u2013725 (2019). https:\/\/doi.org\/10.1109\/ICDAR.2019.00120","DOI":"10.1109\/ICDAR.2019.00120"},{"key":"497_CR26","doi-asserted-by":"publisher","unstructured":"De\u00a0Nardin, A., Zottin, S., Colombi, E., Piciarelli, C., Foresti, G.L.: Is imagenet always the best option? An overview on transfer learning strategies for document layout analysis. In: Foresti, G.L., Fusiello, A., Hancock, E. (eds.) Image Analysis and Processing\u2014ICIAP 2023 Workshops, pp. 489\u2013499. Springer, Cham (2024). https:\/\/doi.org\/10.1007\/978-3-031-51026-7_41","DOI":"10.1007\/978-3-031-51026-7_41"},{"key":"497_CR27","unstructured":"Chen, L., Papapandreou, G., Schroff, F., Hartwig, A.: Rethinking Atrous convolution for semantic image segmentation. In: Conference on Computer Vision and Pattern Recognition (CVPR). IEEE\/CVF, vol. 6 (2017)"},{"key":"497_CR28","unstructured":"Raghu, M., Zhang, C., Kleinberg, J., Bengio, S.: Transfusion: Understanding transfer learning for medical imaging. In: Wallach, H., Larochelle, H., Beygelzimer, A., Alch\u00e9-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32 (2019). 
https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2019\/file\/eb1e78328c46506b46a4ac4a1e378b91-Paper.pdf"},{"key":"497_CR29","doi-asserted-by":"publisher","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248\u2013255 (2009). https:\/\/doi.org\/10.1109\/CVPR.2009.5206848","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"497_CR30","doi-asserted-by":"publisher","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., Zitnick, C.L.: Microsoft coco: Common objects in context. In: Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13, pp. 740\u2013755. Springer (2014). https:\/\/doi.org\/10.1007\/978-3-319-10602-1_48","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"497_CR31","doi-asserted-by":"publisher","unstructured":"Garai, A., Biswas, S., Mandal, S., Chaudhuri, B.B.: Automatic rectification of warped Bangla document images. IET Image Processing 14(1), 74\u201383 (2020). https:\/\/doi.org\/10.1049\/iet-ipr.2019.0831","DOI":"10.1049\/iet-ipr.2019.0831"},{"key":"497_CR32","doi-asserted-by":"publisher","unstructured":"Garai, A., Biswas, S., Mandal, S., Chaudhuri, B.B.: Automatic dewarping of camera captured born-digital Bangla document images. In: 2017 Ninth International Conference on Advances in Pattern Recognition (ICAPR), pp. 1\u20136 (2017). 
https:\/\/doi.org\/10.1109\/ICAPR.2017.8593157","DOI":"10.1109\/ICAPR.2017.8593157"}],"container-title":["International Journal on Document Analysis and Recognition (IJDAR)"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10032-024-00497-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10032-024-00497-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10032-024-00497-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,23]],"date-time":"2025-05-23T06:43:43Z","timestamp":1747982623000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10032-024-00497-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,19]]},"references-count":32,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,6]]}},"alternative-id":["497"],"URL":"https:\/\/doi.org\/10.1007\/s10032-024-00497-4","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-4414436\/v1","asserted-by":"object"}]},"ISSN":["1433-2833","1433-2825"],"issn-type":[{"value":"1433-2833","type":"print"},{"value":"1433-2825","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,19]]},"assertion":[{"value":"13 May 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 July 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 August 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 August 2024","order":4,"name":"first_online","label":"First 
Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no Conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}