{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T14:10:26Z","timestamp":1776953426732,"version":"3.51.4"},"reference-count":64,"publisher":"Association for Computing Machinery (ACM)","issue":"2","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Comput. Cult. Herit."],"published-print":{"date-parts":[[2025,6,30]]},"abstract":"<jats:p>Twenty-five hundred years ago, the \u201cpaperwork\u201d of the Achaemenid Empire was recorded on clay tablets. In 1933, archaeologists from the University of Chicago\u2019s Institute for the Study of Ancient Cultures (ISAC, formerly Oriental Institute) found tens of thousands of these tablets and fragments during the excavation of Persepolis. Many of these tablets have been painstakingly photographed and annotated by expert cuneiformists, and now provide a rich dataset consisting of over 5,000 annotated tablet images and 100,000 cuneiform sign bounding boxes encoding the Elamite language. We leverage this dataset to develop DeepScribe, the first computer vision pipeline capable of localizing Elamite cuneiform signs and providing suggestions for the identity of each sign. We investigate the difficulty of learning subtasks relevant to Elamite cuneiform tablet transcription on ground-truth data, finding that a RetinaNet object detector achieves a localization mAP of 0.78 and a ResNet classifier achieves a top-5 sign classification accuracy of 0.89. The end-to-end pipeline achieves a top-5 classification accuracy of 0.80. As part of the classification module, DeepScribe groups cuneiform signs into morphological clusters. We consider how this automatic clustering approach differs from the organization of standard, printed sign lists and what we learn from it. These components, trained individually, are sufficient to produce a system that can analyze photos of cuneiform tablets from the Achaemenid period and provide useful transliteration suggestions to researchers. We evaluate the model\u2019s end-to-end performance on locating and classifying signs, providing a roadmap to a linguistically aware transliteration system, then consider the model\u2019s potential utility when applied to other periods of cuneiform writing.<\/jats:p>","DOI":"10.1145\/3716850","type":"journal-article","created":{"date-parts":[[2025,3,7]],"date-time":"2025-03-07T19:46:46Z","timestamp":1741376806000},"page":"1-32","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["DeepScribe: Localization and Classification of Elamite Cuneiform Signs via Deep Learning"],"prefix":"10.1145","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5812-2831","authenticated-orcid":false,"given":"Edward C.","family":"Williams","sequence":"first","affiliation":[{"name":"Independent Researcher, Los Angeles, California, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5102-9258","authenticated-orcid":false,"given":"Grace","family":"Su","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Columbia University, New York, New York, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0695-6797","authenticated-orcid":false,"given":"Sandra R.","family":"Schloen","sequence":"additional","affiliation":[{"name":"Forum for Digital Culture, University of Chicago, Chicago, Illinois, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6344-080X","authenticated-orcid":false,"given":"Miller","family":"Prosser","sequence":"additional","affiliation":[{"name":"Forum for Digital Culture, University of Chicago, Chicago, Illinois, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-5872-4607","authenticated-orcid":false,"given":"Susanne","family":"Paulus","sequence":"additional","affiliation":[{"name":"Institute for the Study of Ancient Cultures, University of Chicago, Chicago, Illinois, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6968-4090","authenticated-orcid":false,"given":"Sanjay","family":"Krishnan","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Chicago, Chicago, Illinois, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,6,21]]},"reference":[{"key":"e_1_3_3_2_2","unstructured":"Colaboratory. 2022. Colaboratory. Retrieved March 24 2022 from https:\/\/research.google.com\/colaboratory\/faq.html"},{"key":"e_1_3_3_3_2","first-page":"201","article-title":"Der Umfang des keilschriftlichen Textkorpus\u2019","volume":"42","author":"Gro\u00dfes Fach Altorientalistik","year":"2010","unstructured":"Gro\u00dfes Fach Altorientalistik. 2010. Der Umfang des keilschriftlichen Textkorpus\u2019. Mitteilungen der Deutschen Orient-Gesellschaft zu Berlin 42 (2010), 201.","journal-title":"Mitteilungen der Deutschen Orient-Gesellschaft zu Berlin"},{"key":"e_1_3_3_4_2","unstructured":"Th\u00e9odore Bluche. 2016. Joint line segmentation and transcription for end-to-end handwritten paragraph recognition. arXiv:1604.08352. Retrieved from http:\/\/arxiv.org\/abs\/1604.08352"},{"key":"e_1_3_3_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2015.7333777"},{"key":"e_1_3_3_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/DAS.2018.56"},{"key":"e_1_3_3_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICFHR2020.2020.00053"},{"key":"e_1_3_3_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3491239"},{"key":"e_1_3_3_9_2","volume":"305","author":"Borger Rykle","year":"2004","unstructured":"Rykle Borger. 2004. Mesopotamisches Zeichenlexikon, Vol. 305. Ugarit-Verlag.","journal-title":"Mesopotamisches Zeichenlexikon"},{"key":"e_1_3_3_10_2","doi-asserted-by":"crossref","first-page":"3121","DOI":"10.1109\/ICPR.2010.764","volume-title":"Proceedings of the 2010 20th International Conference on Pattern Recognition","author":"Henning Brodersen Kay","year":"2010","unstructured":"Kay Henning Brodersen, Cheng Soon Ong, Klaas Enno Stephan, and Joachim M. Buhmann. 2010. The balanced accuracy and its posterior distribution. In Proceedings of the 2010 20th International Conference on Pattern Recognition. IEEE, 3121\u20133124."},{"key":"e_1_3_3_11_2","first-page":"108","volume-title":"Proceedings of the ECML PKDD Workshop: Languages for Data Mining and Machine Learning","author":"Buitinck Lars","year":"2013","unstructured":"Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler, et al. 2013. API design for machine learning software: Experiences from the scikit-learn project. In Proceedings of the ECML PKDD Workshop: Languages for Data Mining and Machine Learning, 108\u2013122."},{"key":"e_1_3_3_12_2","volume-title":"Persepolis Treasury Tablets","author":"Cameron George Glenn","year":"1948","unstructured":"George Glenn Cameron. 1948. Persepolis Treasury Tablets. Oriental Institute Publications."},{"key":"e_1_3_3_13_2","doi-asserted-by":"crossref","unstructured":"Nicolas Carion Francisco Massa Gabriel Synnaeve Nicolas Usunier Alexander Kirillov and Sergey Zagoruyko. 2020. End-to-end object detection with transformers. arXiv:2005.12872. Retrieved from https:\/\/arxiv.org\/abs\/2005.12872","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"e_1_3_3_14_2","doi-asserted-by":"publisher","DOI":"10.5555\/1756006.1859921"},{"key":"e_1_3_3_15_2","first-page":"74","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV)","author":"Chen Zhe","year":"2018","unstructured":"Zhe Chen, Shaoli Huang, and Dacheng Tao. 2018. Context refinement for object detection. In Proceedings of the European Conference on Computer Vision (ECCV), 74\u201389."},{"key":"e_1_3_3_16_2","doi-asserted-by":"crossref","unstructured":"Jonathan Chung and Thomas Delteil. 2019. A computationally efficient pipeline approach to full page offline handwritten text recognition. arXiv:1910.00663. Retrieved from http:\/\/arxiv.org\/abs\/1910.00663","DOI":"10.1109\/ICDARW.2019.40078"},{"key":"e_1_3_3_17_2","doi-asserted-by":"publisher","DOI":"10.1137\/070710111"},{"key":"e_1_3_3_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/eScience.2019.00060"},{"key":"e_1_3_3_19_2","unstructured":"CDLI Contributors. 2022. The Cuneiform Digital Library Initiative. Retrieved from https:\/\/cdli.mpiwg-berlin.mpg.de"},{"key":"e_1_3_3_20_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0243039"},{"key":"e_1_3_3_21_2","first-page":"42","article-title":"Der Aufbau des Syllabars \u201cProto-Ea\u201d","author":"Edzard Dietz Otto","year":"1982","unstructured":"Dietz Otto Edzard and Igor Michajlovich Diakonoff. 1982. Der Aufbau des Syllabars \u201cProto-Ea\u201d. Societies and Languages of the Ancient Near East. Studies in Honour of Igor Michailovitch Diakonoff (1982), 42\u201361.","journal-title":"Societies and Languages of the Ancient Near East. Studies in Honour of Igor Michailovitch Diakonoff"},{"key":"e_1_3_3_22_2","first-page":"16","volume-title":"Proceedings of 4th Conference on Scientific Computing and Cultural Heritage","author":"Fisseler D.","year":"2013","unstructured":"D. Fisseler, F. Weichert, G. M\u00fcller, and M. Cammarosano. 2013. Towards an interactive and automated script feature analysis of 3D scanned cuneiform tablets. In Proceedings of 4th Conference on Scientific Computing and Cultural Heritage, 16."},{"key":"e_1_3_3_23_2","doi-asserted-by":"publisher","DOI":"10.1080\/14786440109462720"},{"key":"e_1_3_3_24_2","volume-title":"Seals on the Persepolis Fortification Tablets. Vol. 1, Images of Heroic Encounter","author":"Garrison Mark B.","year":"2001","unstructured":"Mark B. Garrison and Margaret Cool Root. 2001. Seals on the Persepolis Fortification Tablets. Vol. 1, Images of Heroic Encounter. The Oriental Institute of the University of Chicago, Chicago."},{"key":"e_1_3_3_25_2","first-page":"2672","article-title":"Generative adversarial nets","volume":"27","author":"Goodfellow Ian","year":"2014","unstructured":"Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems, Vol. 27, 2672\u20132680.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_26_2","doi-asserted-by":"crossref","unstructured":"Alex Graves Santiago Fernandez and Juergen Schmidhuber. 2007. Multi-dimensional recurrent neural networks. arXiv:0705.2011. Retrieved from https:\/\/arxiv.org\/abs\/0705.2011","DOI":"10.1007\/978-3-540-74690-4_56"},{"key":"e_1_3_3_27_2","volume":"92","author":"Hallock Richard T.","year":"1969","unstructured":"Richard T. Hallock. 1969. Persepolis Fortification Tablets. Oriental Institute Publications, Vol. 92. University of Chicago Press.","journal-title":"Persepolis Fortification Tablets"},{"key":"e_1_3_3_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/cvpr.2016.90"},{"key":"e_1_3_3_29_2","doi-asserted-by":"publisher","DOI":"10.1093\/oxfordhb\/9780199733309.013.0019"},{"key":"e_1_3_3_30_2","doi-asserted-by":"publisher","DOI":"10.5334\/johd.46"},{"key":"e_1_3_3_31_2","doi-asserted-by":"crossref","unstructured":"Armand Joulin Laurens van der Maaten Allan Jabri and Nicolas Vasilache. 2015. Learning visual features from large weakly supervised data. arXiv:1511.02251. Retrieved from https:\/\/arxiv.org\/abs\/1511.02251","DOI":"10.1007\/978-3-319-46478-7_5"},{"key":"e_1_3_3_32_2","doi-asserted-by":"publisher","DOI":"10.1093\/oxfordjournals.pan.a004868"},{"key":"e_1_3_3_33_2","volume-title":"Proceedings of the 3rd International Conference on Learning Representations (ICLR \u201915)","author":"Kingma Diederik P.","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR \u201915)."},{"key":"e_1_3_3_34_2","unstructured":"Nils M. Kriege Matthias Fey Denis Fisseler Petra Mutzel and Frank Weichert. 2018. Recognizing cuneiform signs using graph based methods. arXiv:1802.05908. Retrieved from http:\/\/arxiv.org\/abs\/1802.05908"},{"key":"e_1_3_3_35_2","volume-title":"Advances in Neural Information Processing Systems","volume":"25","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems. F. Pereira, C.J. Burges, L. Bottou, and K.Q. Weinberger (Eds.), Vol. 25, Curran Associates, Inc. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2012\/file\/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf"},{"key":"e_1_3_3_36_2","doi-asserted-by":"publisher","DOI":"10.1016\/0031-3203(92)90024-D"},{"key":"e_1_3_3_37_2","first-page":"282","volume-title":"Proceedings of the Eighteenth International Conference on Machine Learning (ICML \u201901)","author":"Lafferty John D.","year":"2001","unstructured":"John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML \u201901). Morgan Kaufmann Publishers Inc., San Francisco, CA, 282\u2013289."},{"key":"e_1_3_3_38_2","unstructured":"Yann LeCun and Corinna Cortes. 2010. MNIST Handwritten Digit Database. Retrieved from http:\/\/yann.lecun.com\/exdb\/mnist\/"},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/DAS.2012.45"},{"key":"e_1_3_3_40_2","unstructured":"Tsung-Yi Lin Priya Goyal Ross B. Girshick Kaiming He and Piotr Doll\u00e1r. 2017. Focal loss for dense object detection. arXiv:1708.02002. Retrieved from http:\/\/arxiv.org\/abs\/1708.02002"},{"key":"e_1_3_3_41_2","unstructured":"Tsung-Yi Lin Michael Maire Serge Belongie Lubomir Bourdev Ross Girshick James Hays Pietro Perona Deva Ramanan C. Lawrence Zitnick and Piotr Doll\u00e1r. 2014. Microsoft COCO: Common objects in context. arxiv:1405.0312. Retrieved from https:\/\/arxiv.org\/abs\/1405.0312"},{"key":"e_1_3_3_42_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"e_1_3_3_43_2","author":"TorchVision Maintainers and Contributors","year":"2016","unstructured":"TorchVision Maintainers and Contributors. 2016. TorchVision: PyTorch\u2019s Computer Vision Library.","journal-title":"TorchVision: PyTorch\u2019s Computer Vision Library"},{"key":"e_1_3_3_44_2","doi-asserted-by":"publisher","DOI":"10.1007\/s100320200071"},{"key":"e_1_3_3_45_2","doi-asserted-by":"publisher","DOI":"10.31274\/etd-180810-3375"},{"key":"e_1_3_3_46_2","unstructured":"Adam Paszke Sam Gross Francisco Massa Adam Lerer James Bradbury Gregory Chanan Trevor Killeen Zeming Lin Natalia Gimelshein Luca Antiga et al. 2019. PyTorch: An imperative style high-performance deep learning library. In Advances in Neural Information Processing Systems 32. H. Wallach H. Larochelle A. Beygelzimer F. d'Alch\u00e9-Buc E. Fox and R. Garnett (Eds.) Curran Associates Inc. 8024\u20138035. Retrieved from http:\/\/papers.neurips.cc\/paper\/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf"},{"key":"e_1_3_3_47_2","doi-asserted-by":"publisher","DOI":"10.3758\/s13423-014-0585-6"},{"key":"e_1_3_3_48_2","article-title":"Applications and explanations of Zipf\u2019s law","author":"Powers David M. W.","year":"1998","unstructured":"David M. W. Powers. 1998. Applications and explanations of Zipf\u2019s law. In New Methods in Language Processing and Computational Natural Language Learning.","journal-title":"New Methods in Language Processing and Computational Natural Language Learning"},{"key":"e_1_3_3_49_2","doi-asserted-by":"publisher","DOI":"10.1017\/irq.2016.10"},{"key":"e_1_3_3_50_2","unstructured":"Shaoqing Ren Kaiming He Ross B. Girshick and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. arXiv:1506.01497. Retrieved from http:\/\/arxiv.org\/abs\/1506.01497"},{"key":"e_1_3_3_51_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10032-018-0304-3"},{"key":"e_1_3_3_52_2","doi-asserted-by":"publisher","DOI":"10.1145\/3352631.3352632"},{"key":"e_1_3_3_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDAR.2007.4376991"},{"key":"e_1_3_3_54_2","unstructured":"Matt Stolper. 2007. Persepolis Fortification Archive Project. Technical Report. The Oriental Institute at the University of Chicago. Retrieved from https:\/\/oi.uchicago.edu\/about\/annual-reports\/oriental-institute-2006-2007-annual-report"},{"key":"e_1_3_3_55_2","unstructured":"Matthew Stolper. 2007. The Persepolis Fortification Tablets. Technical Report. The Oriental Institute at the University of Chicago. Retrieved from https:\/\/oi.uchicago.edu\/sites\/oi.uchicago.edu\/files\/uploads\/shared\/docs\/nn192.pdf"},{"key":"e_1_3_3_56_2","doi-asserted-by":"crossref","unstructured":"Chen Sun Abhinav Shrivastava Saurabh Singh and Abhinav Gupta. 2017. Revisiting unreasonable effectiveness of data in deep learning era. arXiv:1707.02968. Retrieved from http:\/\/arxiv.org\/abs\/1707.02968","DOI":"10.1109\/ICCV.2017.97"},{"key":"e_1_3_3_57_2","unstructured":"Steve Tinney and Eleanor Robson. 2019. About Oracc: Essentials for Oracc Users. Retrieved from http:\/\/oracc.museum.upenn.edu\/doc\/about\/"},{"issue":"86","key":"e_1_3_3_58_2","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"van der Maaten Laurens","year":"2008","unstructured":"Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9, 86 (2008), 2579\u20132605. Retrieved from http:\/\/jmlr.org\/papers\/v9\/vandermaaten08a.html","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_3_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISPA.2001.938625"},{"key":"e_1_3_3_60_2","unstructured":"Ross Wightman. 2021. Pytorch Image Models (timm). Retrieved from https:\/\/fastai.github.io\/timmdocs\/"},{"key":"e_1_3_3_61_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01231-1_23"},{"key":"e_1_3_3_62_2","unstructured":"Yuxin Wu Alexander Kirillov Francisco Massa Wan-Yen Lo and Ross Girshick. 2019. Detectron2. Retrieved from https:\/\/github.com\/facebookresearch\/detectron2"},{"key":"e_1_3_3_63_2","unstructured":"Tai-Ling Yuan Zhe Zhu Kun Xu Cheng-Jun Li and Shi-Min Hu. 2018. Chinese text in the wild. arXiv:1803.00085. Retrieved from https:\/\/arxiv.org\/abs\/1803.00085"},{"key":"e_1_3_3_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/jproc.2021.3054390"},{"key":"e_1_3_3_65_2","unstructured":"Zhao Zhou Shufan Wu Shuchen Kong Yingbin Zheng Hao Ye Luhui Chen and Jian Pu. 2019. Curve text detection with local segmentation network and curve connection. arXiv:1903.09837. Retrieved from http:\/\/arxiv.org\/abs\/1903.09837"}],"container-title":["Journal on Computing and Cultural Heritage"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3716850","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T11:08:09Z","timestamp":1750504089000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3716850"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,21]]},"references-count":64,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,6,30]]}},"alternative-id":["10.1145\/3716850"],"URL":"https:\/\/doi.org\/10.1145\/3716850","relation":{},"ISSN":["1556-4673","1556-4711"],"issn-type":[{"value":"1556-4673","type":"print"},{"value":"1556-4711","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,21]]},"assertion":[{"value":"2022-12-22","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-01-27","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-06-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}