{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,21]],"date-time":"2025-12-21T07:11:43Z","timestamp":1766301103224,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":22,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,9,5]],"date-time":"2021-09-05T00:00:00Z","timestamp":1630800000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,9,5]]},"DOI":"10.1145\/3476887.3476910","type":"proceedings-article","created":{"date-parts":[[2021,11,1]],"date-time":"2021-11-01T04:05:14Z","timestamp":1635739514000},"page":"7-12","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Mixed Model OCR Training on Historical Latin Script for Out-of-the-Box Recognition and Finetuning"],"prefix":"10.1145","author":[{"given":"Christian","family":"Reul","sequence":"first","affiliation":[{"name":"University of Wuerzburg, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christoph","family":"Wick","sequence":"additional","affiliation":[{"name":"Planet AI GmbH, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Maximilian","family":"Noeth","sequence":"additional","affiliation":[{"name":"University of Wuerzburg, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andreas","family":"Buettner","sequence":"additional","affiliation":[{"name":"University of Wuerzburg, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Maximilian","family":"Wehner","sequence":"additional","affiliation":[{"name":"University of Wuerzburg, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Uwe","family":"Springmann","sequence":"additional","affiliation":[{"name":"LMU Munich, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,10,31]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"International Workshop on Camera-Based Document Analysis and Recognition. Springer, 139\u2013149","author":"Afzal Muhammad\u00a0Zeshan","year":"2013","unstructured":"Muhammad\u00a0Zeshan Afzal , Martin Kr\u00e4mer , Syed\u00a0Saqib Bukhari , Mohammad\u00a0Reza Yousefi , Faisal Shafait , and Thomas\u00a0 M Breuel . 2013 . Robust binarization of stereo and monocular document images using percentile filter . In International Workshop on Camera-Based Document Analysis and Recognition. Springer, 139\u2013149 . Muhammad\u00a0Zeshan Afzal, Martin Kr\u00e4mer, Syed\u00a0Saqib Bukhari, Mohammad\u00a0Reza Yousefi, Faisal Shafait, and Thomas\u00a0M Breuel. 2013. Robust binarization of stereo and monocular document images using percentile filter. In International Workshop on Camera-Based Document Analysis and Recognition. Springer, 139\u2013149."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2017.12"},{"key":"e_1_3_2_1_3_1","volume-title":"High-Performance OCR for Printed English and Fraktur Using LSTM Networks. 12th International Conference on Document Analysis and Recognition (2013","author":"Breuel M.","year":"2013","unstructured":"T.\u00a0 M. Breuel , A. Ul-Hasan , M.\u00a0 A. Al-Azawi , and F. Shafait . 2013 . High-Performance OCR for Printed English and Fraktur Using LSTM Networks. 12th International Conference on Document Analysis and Recognition (2013 ), 683\u2013687. https:\/\/doi.org\/10.1109\/ICDAR. 2013 .140 T.\u00a0M. Breuel, A. Ul-Hasan, M.\u00a0A. Al-Azawi, and F. Shafait. 2013. High-Performance OCR for Printed English and Fraktur Using LSTM Networks. 12th International Conference on Document Analysis and Recognition (2013), 683\u2013687. https:\/\/doi.org\/10.1109\/ICDAR.2013.140"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2018.08.011"},{"key":"e_1_3_2_1_5_1","volume-title":"symposium on Generative and Component-Based Software Engineering, Young Researchers Workshop, Vol.\u00a010","author":"Duret-Lutz Alexandre","year":"2000","unstructured":"Alexandre Duret-Lutz . 2000 . Olena: a component-based platform for image processing, mixing generic, generative and OO programming . In symposium on Generative and Component-Based Software Engineering, Young Researchers Workshop, Vol.\u00a010 . Citeseer. Alexandre Duret-Lutz. 2000. Olena: a component-based platform for image processing, mixing generic, generative and OO programming. In symposium on Generative and Component-Based Software Engineering, Young Researchers Workshop, Vol.\u00a010. Citeseer."},{"volume-title":"OCR17: Ground Truth and Models for 17th c. French Prints (and hopefully more). (2020)","author":"Gabay Simon","key":"e_1_3_2_1_6_1","unstructured":"Simon Gabay , Thibault Cl\u00e9rice , and Christian Reul . 2020. OCR17: Ground Truth and Models for 17th c. French Prints (and hopefully more). (2020) . Simon Gabay, Thibault Cl\u00e9rice, and Christian Reul. 2020. OCR17: Ground Truth and Models for 17th c. French Prints (and hopefully more). (2020)."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143891"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3322905.3322917"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2010.72"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.3390\/app9224853"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"crossref","first-page":"3","DOI":"10.21248\/jlcl.33.2018.216","article-title":"Improving OCR Accuracy on Early Printed Books by combining Pretraining, Voting, and Active Learning","volume":"33","author":"Reul Christian","year":"2018","unstructured":"Christian Reul , Uwe Springmann , Christoph Wick , and Frank Puppe . 2018 . Improving OCR Accuracy on Early Printed Books by combining Pretraining, Voting, and Active Learning . JLCL: Special Issue on Automatic Text and Layout Recognition 33 , 1(2018), 3 \u2013 24 . https:\/\/jlcl.org\/content\/2-allissues\/2-heft1-2018\/jlcl_2018-1_1.pdf Christian Reul, Uwe Springmann, Christoph Wick, and Frank Puppe. 2018. Improving OCR Accuracy on Early Printed Books by combining Pretraining, Voting, and Active Learning. JLCL: Special Issue on Automatic Text and Layout Recognition 33, 1(2018), 3\u201324. https:\/\/jlcl.org\/content\/2-allissues\/2-heft1-2018\/jlcl_2018-1_1.pdf","journal-title":"JLCL: Special Issue on Automatic Text and Layout Recognition"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/DAS.2018.30"},{"key":"e_1_3_2_1_13_1","volume-title":"State of the Art Optical Character Recognition of 19th Century Fraktur Scripts using Open Source Engines. DHd 2019 Digital Humanities: multimedial & multimodal","author":"Reul Christian","year":"2019","unstructured":"Christian Reul , Uwe Springmann , Christoph Wick , and Frank Puppe . 2019. State of the Art Optical Character Recognition of 19th Century Fraktur Scripts using Open Source Engines. DHd 2019 Digital Humanities: multimedial & multimodal ( 2019 ). https:\/\/doi.org\/10.5281\/zenodo.2596095 Christian Reul, Uwe Springmann, Christoph Wick, and Frank Puppe. 2019. State of the Art Optical Character Recognition of 19th Century Fraktur Scripts using Open Source Engines. DHd 2019 Digital Humanities: multimedial & multimodal (2019). https:\/\/doi.org\/10.5281\/zenodo.2596095"},{"key":"e_1_3_2_1_14_1","volume-title":"Digitizing Latin incunabula: Challenges, methods, and possibilities. Digital Humanities Quarterly 3, 1","author":"Rydberg-Cox A","year":"2009","unstructured":"Jeffrey\u00a0 A Rydberg-Cox . 2009. Digitizing Latin incunabula: Challenges, methods, and possibilities. Digital Humanities Quarterly 3, 1 ( 2009 ). http:\/\/digitalhumanities.org:8081\/dhq\/vol\/3\/1\/000027\/000027.html Jeffrey\u00a0A Rydberg-Cox. 2009. Digitizing Latin incunabula: Challenges, methods, and possibilities. Digital Humanities Quarterly 3, 1 (2009). http:\/\/digitalhumanities.org:8081\/dhq\/vol\/3\/1\/000027\/000027.html"},{"key":"e_1_3_2_1_15_1","unstructured":"Minjoon Seo Aniruddha Kembhavi Ali Farhadi and Hannaneh Hajishirzi. 2016. Bidirectional attention flow for machine comprehension. arXiv preprint arXiv:1611.01603(2016). https:\/\/arxiv.org\/abs\/1611.01603  Minjoon Seo Aniruddha Kembhavi Ali Farhadi and Hannaneh Hajishirzi. 2016. Bidirectional attention flow for machine comprehension. arXiv preprint arXiv:1611.01603(2016). https:\/\/arxiv.org\/abs\/1611.01603"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352631.3352640"},{"key":"e_1_3_2_1_17_1","unstructured":"Uwe Springmann Florian Fink and Klaus\u00a0U. Schulz. 2016. Automatic quality evaluation and (semi-) automatic improvement of mixed models for OCR on historical documents. arXiv preprint arXiv:1606.05157(2016). https:\/\/arxiv.org\/abs\/1606.05157  Uwe Springmann Florian Fink and Klaus\u00a0U. Schulz. 2016. Automatic quality evaluation and (semi-) automatic improvement of mixed models for OCR on historical documents. arXiv preprint arXiv:1606.05157(2016). https:\/\/arxiv.org\/abs\/1606.05157"},{"key":"e_1_3_2_1_18_1","volume-title":"OCR of historical printings with an application to building diachronic corpora: A case study using the RIDGES herbal corpus. Digital Humanities Quarterly 11, 2","author":"Springmann Uwe","year":"2017","unstructured":"Uwe Springmann and Anke L\u00fcdeling . 2017. OCR of historical printings with an application to building diachronic corpora: A case study using the RIDGES herbal corpus. Digital Humanities Quarterly 11, 2 ( 2017 ). http:\/\/www.digitalhumanities.org\/dhq\/vol\/11\/2\/000288\/000288.html Uwe Springmann and Anke L\u00fcdeling. 2017. OCR of historical printings with an application to building diachronic corpora: A case study using the RIDGES herbal corpus. Digital Humanities Quarterly 11, 2 (2017). http:\/\/www.digitalhumanities.org\/dhq\/vol\/11\/2\/000288\/000288.html"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2595188.2595205"},{"key":"e_1_3_2_1_20_1","volume-title":"Ground Truth for training OCR engines on historical documents in German Fraktur and Early Modern Latin. JLCL: Special Issue on Automatic Text and Layout Recognition","author":"Springmann Uwe","year":"2019","unstructured":"Uwe Springmann , Christian Reul , Stefanie Dipper , and Johannes Baiter . 2019. Ground Truth for training OCR engines on historical documents in German Fraktur and Early Modern Latin. JLCL: Special Issue on Automatic Text and Layout Recognition ( 2019 ). https:\/\/jlcl.org\/content\/2-allissues\/2-heft1-2018\/jlcl_2018-1_5.pdf Uwe Springmann, Christian Reul, Stefanie Dipper, and Johannes Baiter. 2019. Ground Truth for training OCR engines on historical documents in German Fraktur and Early Modern Latin. JLCL: Special Issue on Automatic Text and Layout Recognition (2019). https:\/\/jlcl.org\/content\/2-allissues\/2-heft1-2018\/jlcl_2018-1_5.pdf"},{"key":"e_1_3_2_1_21_1","volume-title":"Can we build language-independent OCR using LSTM networks?Proceedings of the 4th International Workshop on Multilingual OCR","author":"Ul-Hasan Adnan","year":"2013","unstructured":"Adnan Ul-Hasan and Thomas\u00a0 M Breuel . 2013. Can we build language-independent OCR using LSTM networks?Proceedings of the 4th International Workshop on Multilingual OCR ( 2013 ), 9. https:\/\/doi.org\/10.1145\/2505377.2505394 Adnan Ul-Hasan and Thomas\u00a0M Breuel. 2013. Can we build language-independent OCR using LSTM networks?Proceedings of the 4th International Workshop on Multilingual OCR (2013), 9. https:\/\/doi.org\/10.1145\/2505377.2505394"},{"key":"e_1_3_2_1_22_1","volume-title":"Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition. Digital Humanities Quarterly 14, 2","author":"Wick Christoph","year":"2020","unstructured":"Christoph Wick , Christian Reul , and Frank Puppe . 2020. Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition. Digital Humanities Quarterly 14, 2 ( 2020 ). http:\/\/www.digitalhumanities.org\/dhq\/vol\/14\/2\/000451\/000451.html Christoph Wick, Christian Reul, and Frank Puppe. 2020. Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition. Digital Humanities Quarterly 14, 2 (2020). http:\/\/www.digitalhumanities.org\/dhq\/vol\/14\/2\/000451\/000451.html"}],"event":{"name":"HIP '21: The 6th International Workshop on Historical Document Imaging and Processing","acronym":"HIP '21","location":"Lausanne Switzerland"},"container-title":["The 6th International Workshop on Historical Document Imaging and Processing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3476887.3476910","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3476887.3476910","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:45Z","timestamp":1750188645000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3476887.3476910"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,5]]},"references-count":22,"alternative-id":["10.1145\/3476887.3476910","10.1145\/3476887"],"URL":"https:\/\/doi.org\/10.1145\/3476887.3476910","relation":{},"subject":[],"published":{"date-parts":[[2021,9,5]]},"assertion":[{"value":"2021-10-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}