{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T14:15:20Z","timestamp":1777644920398,"version":"3.51.4"},"reference-count":62,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2023,3,17]],"date-time":"2023-03-17T00:00:00Z","timestamp":1679011200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,3,17]],"date-time":"2023-03-17T00:00:00Z","timestamp":1679011200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["IJDAR"],"published-print":{"date-parts":[[2023,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Censuses are structured documents of great value for social and demographic history, which became widespread from the nineteenth century on. However, the plurality of formats and the natural variability of historical data make their extraction arduous and often lead to ungeneric recognition algorithms. We propose an end-to-end processing pipeline, based on optimization, in an attempt to reduce the number of free parameters. The layout analysis is based on semantic segmentation using neural networks for a generic recognition of the explicit column structure. The implicit row structure is deduced directly from the position of the text segments. The handwritten text detection is complemented by an intelligent framing method which significantly improves the quality of the HTR. In the end, we propose to combine several post-correction approaches, neural networks, and language models, to further improve the performance. 
Ultimately, our flexible methods make it possible to accurately detect more than 98% of the columns and 88% of the rows, despite the lack of graphical separator and the diversity of formats. Thanks to various reframing and post-correction strategies, HTR results reach the excellent performance of 3.44% character error rate on these noisy nineteenth century data. In total, more than 18,831 pages were extracted in 72 censuses over a century. This large historical dataset, as well as training data, is made open-access and released along with this article.<\/jats:p>","DOI":"10.1007\/s10032-023-00428-9","type":"journal-article","created":{"date-parts":[[2023,3,17]],"date-time":"2023-03-17T12:02:56Z","timestamp":1679054576000},"page":"419-432","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["An end-to-end pipeline for historical censuses processing"],"prefix":"10.1007","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9138-6727","authenticated-orcid":false,"given":"R\u00e9mi","family":"Petitpierre","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marion","family":"Kramer","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7172-2495","authenticated-orcid":false,"given":"Lucas","family":"Rappo","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,3,17]]},"reference":[{"key":"428_CR1","doi-asserted-by":"crossref","unstructured":"Ruggles, S.: (2014) Big Microdata for Population Research. Demography 51(1):287\u2013297. 
https:\/\/www.jstor.org\/stable\/42919999","DOI":"10.1007\/s13524-013-0240-2"},{"issue":"3","key":"428_CR2","doi-asserted-by":"publisher","first-page":"398","DOI":"10.1080\/1031461X.2016.1208258","volume":"47","author":"L Williams","year":"2016","unstructured":"Williams, L., Godfrey, B.: Bringing the prisoner into view: English and Welsh census data and the Victorian prison population. Australian Historical Stud. 47(3), 398\u2013413 (2016). https:\/\/doi.org\/10.1080\/1031461X.2016.1208258","journal-title":"Australian Historical Stud."},{"key":"428_CR3","unstructured":"Municipalit\u00e9, de Lausanne AVL RB 14-023. Proc\u00e8s-verbaux de la Municipalit\u00e9 de Lausanne. p 376 (1828)"},{"key":"428_CR4","doi-asserted-by":"publisher","unstructured":"Sibade, C., Retornaz, T., Nion, T., et\u00a0al.: Automatic indexing of French handwritten census registers for probate geneaology. In: Proceedings of the 2011 Workshop on Historical Document Imaging and Processing. Association for Computing Machinery, New York, NY, USA, HIP \u201911, pp 51\u201358, (2011) https:\/\/doi.org\/10.1145\/2037342.2037352","DOI":"10.1145\/2037342.2037352"},{"key":"428_CR5","doi-asserted-by":"publisher","unstructured":"Nion, T., Menasri, F., Louradour, J., et\u00a0al.: Handwritten information extraction from historical census documents. In: 2013 12th International Conference on Document Analysis and Recognition, pp 822\u2013826, (2013) https:\/\/doi.org\/10.1109\/ICDAR.2013.168","DOI":"10.1109\/ICDAR.2013.168"},{"key":"428_CR6","doi-asserted-by":"publisher","unstructured":"Clawson, R., Bauer, K., Chidester, G., et al.: Automated recognition and extraction of tabular fields for the indexing of census records. In: Zanibbi R, Co\u00fcasnon B (eds) Document Recognition and Retrieval XX, International Society for Optics and Photonics, vol 8658. 
SPIE, pp 170 \u2013 180, (2013) https:\/\/doi.org\/10.1117\/12.2004788","DOI":"10.1117\/12.2004788"},{"key":"428_CR7","doi-asserted-by":"crossref","unstructured":"Pedersen, B.R., Holsb\u00f8, Andersen, T., et al Lessons learned developing and using a machine learning model to automatically transcribe 2.3 million handwritten occupation codes. Histor Life Course Stud 12:87 (2022) https:\/\/doi.org\/10.51964\/hlcs11331","DOI":"10.51964\/hlcs11331"},{"key":"428_CR8","unstructured":"Andr\u00e9s Moreno, J.: Search and information extraction in handwritten tables. Universitat Polit\u00e8cnica de Val\u00e8ncia, Master thesis (2021)"},{"key":"428_CR9","doi-asserted-by":"publisher","unstructured":"Lang, E., Puigcerver, J., Toselli, A.H., et al.: Probabilistic indexing and search for information extraction on handwritten german parish records. In: 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), IEEE, pp 44\u201349, (2018) https:\/\/doi.org\/10.1109\/ICFHR-2018.2018.00017","DOI":"10.1109\/ICFHR-2018.2018.00017"},{"key":"428_CR10","doi-asserted-by":"publisher","unstructured":"Simistira, F., Bouillon, M., Seuret, M., et al.: ICDAR2017 Competition on Layout Analysis for Challenging Medieval Manuscripts. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp 1361\u20131370, (2017) https:\/\/doi.org\/10.1109\/ICDAR.2017.223, ISSN: 2379-2140","DOI":"10.1109\/ICDAR.2017.223"},{"key":"428_CR11","doi-asserted-by":"publisher","unstructured":"Diem, M., Kleber, F., Fiel, S., et al.: cBAD: ICDAR2017 Competition on baseline detection. 
In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp 1355\u20131360, (2017) https:\/\/doi.org\/10.1109\/ICDAR.2017.222, ISSN: 2379-2140","DOI":"10.1109\/ICDAR.2017.222"},{"key":"428_CR12","doi-asserted-by":"publisher","unstructured":"Gr\u00fcning, T., Leifert, G., Strau\u00df, T., et al.: A two-stage method for text line detection in historical documents. International Journal on Document Analysis and Recognition (IJDAR) 22(3), 285\u2013302 (2019). https:\/\/doi.org\/10.1007\/s10032-019-00332-1","DOI":"10.1007\/s10032-019-00332-1"},{"key":"428_CR13","doi-asserted-by":"publisher","unstructured":"Guerry, C., Co\u00fcasnon, B., Lemaitre, A.: Combination of deep learning and syntactical approaches for the interpretation of interactions between text-lines and tabular structures in handwritten documents. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp 858\u2013863, (2019). https:\/\/doi.org\/10.1109\/ICDAR.2019.00142, ISSN: 2379-2140","DOI":"10.1109\/ICDAR.2019.00142"},{"key":"428_CR14","doi-asserted-by":"publisher","unstructured":"Co\u00fcasnon, B., Lemaitre, A.: Recognition of tables and forms. In: Doermann D, Tombre K (eds) Handbook of Document Image Processing and Recognition. Springer, London, pp 647\u2013677, (2014). https:\/\/doi.org\/10.1007\/978-0-85729-859-1_20","DOI":"10.1007\/978-0-85729-859-1_20"},{"key":"428_CR15","unstructured":"Clinchant, S., D\u00e9jean, H., Meunier, J.L., et\u00a0al.: Comparing machine learning approaches for table recognition in historical register books. arXiv, (2019). https:\/\/arxiv.org\/abs\/1906.11901"},{"key":"428_CR16","doi-asserted-by":"publisher","unstructured":"D\u00e9jean, H., Meunier, J.L.: Table rows segmentation. 
In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp 461\u2013466, (2019) https:\/\/doi.org\/10.1109\/ICDAR.2019.00080, ISSN: 2379-2140","DOI":"10.1109\/ICDAR.2019.00080"},{"key":"428_CR17","doi-asserted-by":"publisher","unstructured":"Schreiber, S., Agne, S., Wolf, I., et al.: DeepDeSRT: Deep Learning for Detection and Structure Recognition of Tables in Document Images. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp 1162\u20131167, (2017) https:\/\/doi.org\/10.1109\/ICDAR.2017.192, ISSN: 2379-2140","DOI":"10.1109\/ICDAR.2017.192"},{"key":"428_CR18","doi-asserted-by":"publisher","DOI":"10.1007\/s11036-021-01759-9","author":"A Zucker","year":"2021","unstructured":"Zucker, A., Belkada, Y., Vu, H., et al.: ClusTi: clustering method for table structure recognition in scanned images. Mobile Netw. Appl. (2021). https:\/\/doi.org\/10.1007\/s11036-021-01759-9","journal-title":"Mobile Netw. Appl."},{"issue":"100","key":"428_CR19","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1016\/j.bdr.2021.100195","volume":"24","author":"X Liang","year":"2021","unstructured":"Liang, X., Cheddad, A., Hall, J.: Comparative study of layout analysis of tabulated historical documents. Big Data Res. 24(100), 195 (2021). https:\/\/doi.org\/10.1016\/j.bdr.2021.100195","journal-title":"Big Data Res."},{"key":"428_CR20","doi-asserted-by":"publisher","unstructured":"Breuel, T.M.: The OCRopus open source OCR system. In: Document Recognition and Retrieval XV, vol 6815. International Society for Optics and Photonics, p 68150F, (2008) https:\/\/doi.org\/10.1117\/12.783598","DOI":"10.1117\/12.783598"},{"key":"428_CR21","doi-asserted-by":"crossref","unstructured":"Shen, Z., Zhang, R., Dell, M., et al.: LayoutParser: a unified toolkit for deep learning based document image analysis. 
(2021) arXiv, https:\/\/arxiv.org\/abs\/2103.15348","DOI":"10.1007\/978-3-030-86549-8_9"},{"key":"428_CR22","doi-asserted-by":"publisher","unstructured":"Oliveira, S.A., Seguin, B., Kaplan, F.: dhSegment: A generic deep-learning approach for document segmentation. In: 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp 7\u201312, (2018) https:\/\/doi.org\/10.1109\/ICFHR-2018.2018.00011","DOI":"10.1109\/ICFHR-2018.2018.00011"},{"key":"428_CR23","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3431\u20133440, (2015). https:\/\/www.cv-foundation.org\/openaccess\/content_cvpr_2015\/html\/Long_Fully_Convolutional_Networks_2015_CVPR_paper.html","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"428_CR24","unstructured":"Li, M., Cui, L., Huang, S., et\u00a0al.: TableBank: table benchmark for image-based table detection and recognition. In: Proceedings of the 12th Language Resources and Evaluation Conference. European Language Resources Association, Marseille, France, pp 1918\u20131925, (2020). https:\/\/www.aclweb.org\/anthology\/2020.lrec-1.236"},{"key":"428_CR25","doi-asserted-by":"publisher","unstructured":"de\u00a0Sousa\u00a0Neto, AF., Bezerra, BLD., Toselli, AH., et\u00a0al.: HTR-Flor++: a handwritten text recognition system based on a pipeline of optical and language models. In: Proceedings of the ACM Symposium on Document Engineering 2020. Association for Computing Machinery, New York, NY, USA, DocEng \u201920, pp 1\u20134, (2020). https:\/\/doi.org\/10.1145\/3395027.3419603","DOI":"10.1145\/3395027.3419603"},{"key":"428_CR26","doi-asserted-by":"publisher","unstructured":"Puigcerver, J.: Are multidimensional recurrent layers really necessary for handwritten text recognition? 
In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp 67\u201372, (2017). https:\/\/doi.org\/10.1109\/ICDAR.2017.20, ISSN: 2379-2140","DOI":"10.1109\/ICDAR.2017.20"},{"key":"428_CR27","doi-asserted-by":"publisher","unstructured":"Bluche, T., Messina, R.: Gated convolutional recurrent neural networks for multilingual handwriting recognition. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp 646\u2013651, (2017). https:\/\/doi.org\/10.1109\/ICDAR.2017.111, ISSN: 2379-2140","DOI":"10.1109\/ICDAR.2017.111"},{"key":"428_CR28","doi-asserted-by":"publisher","unstructured":"Aradillas\u00a0Jaramillo, JC., Murillo-Fuentes, JJ., M.\u00a0Olmos, P.: Boosting handwriting text recognition in small databases with transfer learning. In: 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp 429\u2013434, (2018). https:\/\/doi.org\/10.1109\/ICFHR-2018.2018.00081","DOI":"10.1109\/ICFHR-2018.2018.00081"},{"key":"428_CR29","doi-asserted-by":"crossref","unstructured":"Rigaud, C., Doucet, A., Coustaty, M., et\u00a0al.: ICDAR 2019 competition on post-OCR text correction. In: 15th International Conference on Document Analysis and Recognition, Sydney, Australia, pp 1588\u20131593, (2019). https:\/\/hal.archives-ouvertes.fr\/hal-02304334","DOI":"10.1109\/ICDAR.2019.00255"},{"key":"428_CR30","unstructured":"Levenshtein, VI., et\u00a0al.: Binary codes capable of correcting deletions, insertions, and reversals. In: Soviet Physics Doklady, Soviet Union, pp 707\u2013710 (1966)"},{"key":"428_CR31","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1371\/journal.pone.0220219","volume":"15","author":"S Bell","year":"2020","unstructured":"Bell, S., Marlow, T., Wombacher, K., et al.: Automated data extraction from historical city directories: the rise and fall of mid-century gas stations in Providence, RI. PLOS One 15, 8 (2020). 
https:\/\/doi.org\/10.1371\/journal.pone.0220219","journal-title":"PLOS One"},{"key":"428_CR32","unstructured":"Haldar, R., Mukhopadhyay, D.: Levenshtein distance technique in dictionary lookup methods: an improved approach. (2011) http:\/\/arxiv.org\/abs\/1101.1232"},{"issue":"1","key":"428_CR33","doi-asserted-by":"publisher","first-page":"28","DOI":"10.3808\/jei.201700381","volume":"34","author":"D Berenbaum","year":"2019","unstructured":"Berenbaum, D., Deighan, D., Marlow, T., et al.: Mining spatio-temporal data on industrialization from historical registries. J. Environ. Inf. 34(1), 28\u201334 (2019). https:\/\/doi.org\/10.3808\/jei.201700381","journal-title":"J. Environ. Inf."},{"key":"428_CR34","doi-asserted-by":"publisher","unstructured":"H\u00e4m\u00e4l\u00e4inen, M., Hengchen, S.: From the Paft to the Fiiture: a fully automatic NMT and word embeddings method for OCR post-correction. In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019). INCOMA Ltd., Varna, Bulgaria, pp 431\u2013436, (2019). https:\/\/doi.org\/10.26615\/978-954-452-056-4_051","DOI":"10.26615\/978-954-452-056-4_051"},{"key":"428_CR35","unstructured":"Hakala, K., Vesanto, A., Miekka, N., et\u00a0al.: Leveraging text repetitions and denoising autoencoders in OCR post-correction. (2019) arXiv, https:\/\/arxiv.org\/abs\/1906.10907"},{"key":"428_CR36","doi-asserted-by":"publisher","first-page":"21","DOI":"10.3390\/app10217711","volume":"10","author":"AF de Sousa Neto","year":"2020","unstructured":"de Sousa Neto, A.F., Bezerra, B.L.D., Toselli, A.H.: Towards the natural language processing as spelling correction for offline handwritten text recognition systems. Appl. Sci. 10, 21 (2020). https:\/\/doi.org\/10.3390\/app10217711","journal-title":"Appl. Sci."},{"key":"428_CR37","doi-asserted-by":"publisher","unstructured":"Kissos, I., Dershowitz, N.: OCR Error Correction Using Character Correction and Feature-Based Word Classification. 
In: 2016 12th IAPR Workshop on Document Analysis Systems (DAS), pp 198\u2013203, (2016) https:\/\/doi.org\/10.1109\/DAS.2016.44","DOI":"10.1109\/DAS.2016.44"},{"key":"428_CR38","unstructured":"Mikolov, T., Chen, K., Corrado, G., et\u00a0al.: Efficient estimation of word representations in vector space. arXiv, (2013) https:\/\/arxiv.org\/abs\/1301.3781"},{"key":"428_CR39","unstructured":"Devlin, J., Chang, M., Lee, K., et\u00a0al.: BERT: pre-training of deep bidirectional transformers for language understanding. (2018) http:\/\/arxiv.org\/abs\/1810.04805"},{"key":"428_CR40","doi-asserted-by":"crossref","unstructured":"Roy, A., Ghosh, S., Ghosh, K., et\u00a0al.: An unsupervised normalization algorithm for noisy text: a case study for information retrieval and stance detection. (2021) http:\/\/arxiv.org\/abs\/2101.03303","DOI":"10.1145\/3418036"},{"key":"428_CR41","doi-asserted-by":"publisher","unstructured":"Cao, H., Rawls, S., Natarajan, P.: 1990 US census form recognition using CTC network, WFST language model, and surname correction. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pp 977\u2013982, (2017) https:\/\/doi.org\/10.1109\/ICDAR.2017.163","DOI":"10.1109\/ICDAR.2017.163"},{"issue":"1","key":"428_CR42","doi-asserted-by":"publisher","first-page":"62","DOI":"10.1109\/TSMC.1979.4310076","volume":"9","author":"N Otsu","year":"1979","unstructured":"Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst., Man, Cybernet. 9(1), 62\u201366 (1979). https:\/\/doi.org\/10.1109\/TSMC.1979.4310076","journal-title":"IEEE Trans. Syst., Man, Cybernet."},{"key":"428_CR43","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. 
In: International Conference on Medical image computing and computer-assisted intervention, Springer, pp 234\u2013241 (2015)","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"428_CR44","unstructured":"Kingma, D.P., Ba, J. Adam: a method for stochastic optimization. http:\/\/arxiv.org\/abs\/1412.6980 (2014)"},{"key":"428_CR45","doi-asserted-by":"publisher","first-page":"474","DOI":"10.1016\/B978-0-12-336156-1.50061-6","volume":"4","author":"K Zuiderveld","year":"1994","unstructured":"Zuiderveld, K.: Contrast limited adaptive histogram equalization. Gr. Gems 4, 474\u2013485 (1994)","journal-title":"Gr. Gems"},{"key":"428_CR46","first-page":"87","volume":"24","author":"J Bergstra","year":"2011","unstructured":"Bergstra, J., Bardenet, R., Bengio, Y., et al.: Algorithms for hyper-parameter optimization. Adv. Neural Inf. Process. Syst. 24, 87 (2011)","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"428_CR47","unstructured":"(2020) Computer vision annotation tool. https:\/\/cvat.org\/"},{"key":"428_CR48","doi-asserted-by":"publisher","unstructured":"Rappo, L., Petitpierre, R., Kramer, M.: Lausanne Historical Censuses Dataset HTR 35k (2023). https:\/\/doi.org\/10.5281\/zenodo.7711178","DOI":"10.5281\/zenodo.7711178"},{"key":"428_CR49","doi-asserted-by":"publisher","unstructured":"S\u00e1nchez JA (2016) Bentham Dataset R0. https:\/\/doi.org\/10.5281\/zenodo.44519","DOI":"10.5281\/zenodo.44519"},{"key":"428_CR50","unstructured":"(2021) Register of Swiss Surnames. https:\/\/hls-dhs-dss.ch\/famn\/?lg=e"},{"key":"428_CR51","unstructured":"(2021) History of Work. https:\/\/historyofwork.iisg.nl\/search.php"},{"key":"428_CR52","unstructured":"(2021) Fichier des pr\u00e9noms Etat civil. https:\/\/www.insee.fr\/fr\/statistiques\/2540004?sommaire=4767262"},{"key":"428_CR53","unstructured":"Garbe, W.: 1000x faster spelling correction algorithm. 
(2012) https:\/\/towardsdatascience.com\/symspellcompound-10ec8f467c9b"},{"key":"428_CR54","unstructured":"Norvig, P.: How to write a spelling corrector. (2007) http:\/\/norvig.com\/spell-correct.html"},{"key":"428_CR55","doi-asserted-by":"crossref","unstructured":"Stefanovi\u010d, P., Kurasova, O., \u0160trimaitis, R.: The n-grams based text similarity detection approach using self-organizing maps and similarity measures. Appl. Sci. 9(9):1870 (2019)","DOI":"10.3390\/app9091870"},{"key":"428_CR56","unstructured":"Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. http:\/\/arxiv.org\/abs\/1409.0473"},{"key":"428_CR57","doi-asserted-by":"crossref","unstructured":"Luong, M.T., Pham, H., Manning, CD.: Effective approaches to attention-based neural machine translation. (2015) http:\/\/arxiv.org\/abs\/1508.04025","DOI":"10.18653\/v1\/D15-1166"},{"key":"428_CR58","unstructured":"de\u00a0Sousa\u00a0Neto, AF. arthurflor23\/spelling-correction. (2020) https:\/\/github.com\/arthurflor23\/spelling-correction"},{"key":"428_CR59","doi-asserted-by":"publisher","unstructured":"Colutto, S., Kahle, P., Guenter, H., et\u00a0al.: Transkribus. a platform for automated text recognition and searching of historical documents. In: 2019 15th International Conference on eScience (eScience), pp 463\u2013466, (2019) https:\/\/doi.org\/10.1109\/eScience.2019.00060","DOI":"10.1109\/eScience.2019.00060"},{"key":"428_CR60","doi-asserted-by":"publisher","unstructured":"Priambada, S., Widyantoro, DH.: Levensthein distance as a post-process to improve the performance of OCR in written road signs. In: 2017 Second International Conference on Informatics and Computing (ICIC), pp 1\u20136, (2017) https:\/\/doi.org\/10.1109\/IAC.2017.8280534","DOI":"10.1109\/IAC.2017.8280534"},{"key":"428_CR61","unstructured":"Corps de police (1898) Recensements communaux pour 1804-1813 et 1832-1898. 
https:\/\/vidy-archives.lausanne.ch\/adm-c1-rc-106"},{"key":"428_CR62","doi-asserted-by":"publisher","unstructured":"Petitpierre, R., Kramer, M., Rappo, L., et al.: 1805-1898 Census Records of Lausanne: a Long Digital Dataset for Demographic History (2023). https:\/\/doi.org\/10.5281\/zenodo.7711640","DOI":"10.5281\/zenodo.7711640"}],"container-title":["International Journal on Document Analysis and Recognition (IJDAR)"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10032-023-00428-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10032-023-00428-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10032-023-00428-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,17]],"date-time":"2023-10-17T09:14:02Z","timestamp":1697534042000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10032-023-00428-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,17]]},"references-count":62,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,12]]}},"alternative-id":["428"],"URL":"https:\/\/doi.org\/10.1007\/s10032-023-00428-9","relation":{},"ISSN":["1433-2833","1433-2825"],"issn-type":[{"value":"1433-2833","type":"print"},{"value":"1433-2825","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,17]]},"assertion":[{"value":"23 August 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 January 2023","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 February 
2023","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 March 2023","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare they have no competing financial or non-financial interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}