{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,9]],"date-time":"2026-07-09T01:12:08Z","timestamp":1783559528501,"version":"3.55.0"},"publisher-location":"Berlin, Heidelberg","reference-count":35,"publisher":"Springer Berlin Heidelberg","isbn-type":[{"value":"9783662722428","type":"print"},{"value":"9783662722435","type":"electronic"}],"license":[{"start":{"date-parts":[[2025,10,4]],"date-time":"2025-10-04T00:00:00Z","timestamp":1759536000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,10,4]],"date-time":"2025-10-04T00:00:00Z","timestamp":1759536000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Engineering diagrams are vital documents in many industries. Historically stored as image data, conversion of such diagrams into modern formats is required for further use and adaptation. Therefore, research towards automated digitization has gained traction. To recognize symbols in the diagrams, recent studies rely on supervised learning, but large labeled datasets are difficult to acquire in industry settings. In this paper, we present a self-supervised approach towards automated recognition of engineering diagram symbols. We validate the method on diagrams from the building sector, where they are used for technical plant planning, installation, and monitoring. The method makes use of diagram legends, which show prototypical examples of the symbols occurring in the diagram. As the legend entries are unique, they can be used to learn embeddings through contrastive learning for a self-supervised classification of diagram symbols. The method circumvents most of the labeling efforts: all symbols are extracted from the set of diagrams with a symbol region detector trained on a synthetic dataset. Then, we train a symbol encoder by contrasting the symbols found inside the legends with each other. The encoder is subsequently used in a matching procedure that classifies unknown diagram symbols by comparing them to the legend examples. Furthermore, it can recognize when symbols do not appear in the legend at all. Generalizing beyond variations in diagram drawing style, this matching procedure achieves over 80% accuracy. The results demonstrate the potential of legends for engineering diagram digitization without the need to invest in labeled datasets.<\/jats:p>","DOI":"10.1007\/978-3-662-72243-5_23","type":"book-chapter","created":{"date-parts":[[2025,10,3]],"date-time":"2025-10-03T12:14:41Z","timestamp":1759493681000},"page":"403-421","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Legend-Informed Symbol Recognition in\u00a0Engineering Diagrams with\u00a0Self-supervised Learning"],"prefix":"10.1007","author":[{"given":"Antonia","family":"Hain","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Simon","family":"G\u00f6lzh\u00e4user","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Nicolas","family":"R\u00e9hault","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Thomas","family":"Brox","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Matthias","family":"Demant","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2025,10,4]]},"reference":[{"key":"23_CR1","doi-asserted-by":"crossref","unstructured":"Baek, Y., Lee, B., Han, D., Yun, S., Lee, H.: Character region awareness for text detection. In: CVPR, pp. 9365\u20139374 (2019)","DOI":"10.1109\/CVPR.2019.00959"},{"key":"23_CR2","doi-asserted-by":"crossref","unstructured":"Caron, M., et al.: Emerging properties in self-supervised vision transformers. In: ICCV (2021)","DOI":"10.1109\/ICCV48922.2021.00951"},{"key":"23_CR3","unstructured":"Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: III, H.D., Singh, A. (eds.) ICML. Proceedings of Machine Learning Research, vol.\u00a0119, pp. 1597\u20131607 (2020)"},{"key":"23_CR4","doi-asserted-by":"crossref","unstructured":"Dutta, A., Zisserman, A.: The VIA annotation software for images, audio and video. In: ACM Multimedia, pp. 2276\u20132279 (2019)","DOI":"10.1145\/3343031.3350535"},{"key":"23_CR5","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1016\/j.neunet.2020.05.025","volume":"129","author":"E Elyan","year":"2020","unstructured":"Elyan, E., Jamieson, L., Ali-Gombe, A.: Deep learning for symbols detection and classification in engineering drawings. Neural Netw. 129, 91\u2013102 (2020)","journal-title":"Neural Netw."},{"key":"23_CR6","unstructured":"Goodfellow, I., et al.: Generative adversarial nets. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N., Weinberger, K. (eds.) NIPS, vol.\u00a027, pp. 2672\u20132680 (2014)"},{"key":"23_CR7","doi-asserted-by":"crossref","unstructured":"Hadsell, R., Chopra, S., LeCun, Y.: Dimensionality reduction by learning an invariant mapping. In: CVPR, vol.\u00a02, pp. 1735\u20131742 (2006)","DOI":"10.1109\/CVPR.2006.100"},{"key":"23_CR8","unstructured":"Hamilton, M., Zhang, Z., Hariharan, B., Snavely, N., Freeman, W.T.: Unsupervised semantic segmentation by distilling feature correspondences. In: ICLR (2022)"},{"key":"23_CR9","doi-asserted-by":"crossref","unstructured":"He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. In: CVPR, pp. 9729\u20139738 (2020)","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"23_CR10","doi-asserted-by":"crossref","unstructured":"Howard, A., et al.: Searching for mobilenetv3. In: ICCV, pp. 1314\u20131324 (2019)","DOI":"10.1109\/ICCV.2019.00140"},{"key":"23_CR11","unstructured":"IEA: Energy systems buildings database (2024). https:\/\/www.iea.org\/energy-system\/buildings. Accessed 11 Mar 2025"},{"key":"23_CR12","doi-asserted-by":"publisher","first-page":"136","DOI":"10.1007\/s10462-024-10779-2","volume":"57","author":"L Jamieson","year":"2024","unstructured":"Jamieson, L., Moreno-Garc\u00eda, C., Elyan, E.: A review of deep learning methods for digitisation of complex documents and engineering diagrams. Artif. Intell. Rev. 57, 136 (2024)","journal-title":"Artif. Intell. Rev."},{"key":"23_CR13","doi-asserted-by":"crossref","unstructured":"Joy, J., Mounsef, J.: Automation of material takeoff using computer vision. In: IAICT, pp. 196\u2013200 (2021)","DOI":"10.1109\/IAICT52856.2021.9532514"},{"key":"23_CR14","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2021.115337","volume":"183","author":"H Kim","year":"2021","unstructured":"Kim, H., et al.: Deep-learning-based recognition of symbols and texts at an industrially applicable level from images of high-density piping and instrumentation diagrams. Expert Syst. Appl. 183, 115337 (2021)","journal-title":"Expert Syst. Appl."},{"key":"23_CR15","unstructured":"Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Bengio, Y., LeCun, Y. (eds.) ICLR (2015)"},{"key":"23_CR16","unstructured":"Koch, G., Zemel, R., Salakhutdinov, R.: Siamese neural networks for one-shot image recognition. In: ICML (2015)"},{"key":"23_CR17","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., et al.: Microsoft coco: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV, pp. 740\u2013755 (2014)","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"23_CR18","doi-asserted-by":"crossref","unstructured":"Lowe, D.: Object recognition from local scale-invariant features. In: ICCV, vol.\u00a02, pp. 1150\u20131157 (1999)","DOI":"10.1109\/ICCV.1999.790410"},{"issue":"6","key":"23_CR19","doi-asserted-by":"publisher","first-page":"1695","DOI":"10.1007\/s00521-018-3583-1","volume":"31","author":"CF Moreno-Garc\u00eda","year":"2019","unstructured":"Moreno-Garc\u00eda, C.F., Elyan, E., Jayne, C.: New trends on digitisation of complex engineering drawings. Neural Comput. Appl. 31(6), 1695\u20131712 (2019)","journal-title":"Neural Comput. Appl."},{"key":"23_CR20","unstructured":"OpenAI: Hello gpt-4o (2024). https:\/\/openai.com\/index\/hello-gpt-4o\/. Accessed 11 Mar 2025"},{"key":"23_CR21","doi-asserted-by":"crossref","unstructured":"Paliwal, S., Jain, A., Sharma, M., Vig, L.: Digitize-pid: automatic digitization of piping and instrumentation diagrams. In: Gupta, M., Ramakrishnan, G. (eds.) PAKDD, pp. 168\u2013180 (2021)","DOI":"10.1007\/978-3-030-75015-2_17"},{"key":"23_CR22","doi-asserted-by":"crossref","unstructured":"Paliwal, S., Sharma, M., Vig, L.: Ossr-pid: one-shot symbol recognition in p &id sheets using path sampling and GCN. In: IJCNN, pp.\u00a01\u20138 (2021)","DOI":"10.1109\/IJCNN52387.2021.9534122"},{"key":"23_CR23","doi-asserted-by":"publisher","first-page":"137621","DOI":"10.1109\/ACCESS.2023.3335196","volume":"11","author":"A Payandeh","year":"2023","unstructured":"Payandeh, A., Baghaei, K.T., Fayyazsanavi, P., Ramezani, S.B., Chen, Z., Rahimi, S.: Deep representation learning: fundamentals, technologies, applications, and open challenges. IEEE Access 11, 137621\u2013137659 (2023)","journal-title":"IEEE Access"},{"key":"23_CR24","doi-asserted-by":"crossref","unstructured":"Rahul, R., Paliwal, S., Sharma, M., Vig, L.: Automatic information extraction from piping and instrumentation diagrams. In: Marsico, M.D., di\u00a0Baja, G.S., Fred, A.L.N. (eds.) ICPRAM, pp. 163\u2013172 (2019)","DOI":"10.5220\/0007376401630172"},{"key":"23_CR25","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: CVPR, pp. 779\u2013788 (2016)","DOI":"10.1109\/CVPR.2016.91"},{"key":"23_CR26","unstructured":"Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R. (eds.) NIPS, vol.\u00a028, pp. 91\u201399 (2015)"},{"key":"23_CR27","doi-asserted-by":"crossref","unstructured":"Roth, K., Pemula, L., Zepeda, J., Sch\u00f6lkopf, B., Brox, T., Gehler, P.: Towards total recall in industrial anomaly detection. In: CVPR, pp. 14318\u201314328 (2022)","DOI":"10.1109\/CVPR52688.2022.01392"},{"key":"23_CR28","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1007\/s100320050009","volume":"1","author":"H Samet","year":"1998","unstructured":"Samet, H., Soffer, A.: Magellan: map acquisition of geographic labels by legend analysis. Int. J. Doc. Anal. Recogn. 1, 89\u2013101 (1998)","journal-title":"Int. J. Doc. Anal. Recogn."},{"key":"23_CR29","unstructured":"Sarkar, S., Pandey, P.K., Kar, S.: Automatic detection and classification of symbols in engineering drawings. arXiv preprint arXiv:2204.13277 (2022)"},{"key":"23_CR30","doi-asserted-by":"crossref","unstructured":"Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: a unified embedding for face recognition and clustering. In: CVPR, pp. 815\u2013823 (2015)","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"23_CR31","unstructured":"Stinner, F., Wiecek, M., Baranski, M., K\u00fcmpel, A., M\u00fcller, D.: Automatic digital twin data model generation of building energy systems from piping and instrumentation diagrams. In: ECOS, pp. 2854\u20131865 (2021)"},{"key":"23_CR32","doi-asserted-by":"publisher","DOI":"10.1016\/j.dche.2022.100072","volume":"6","author":"MF Theisen","year":"2023","unstructured":"Theisen, M.F., Flores, K.N., Schulze Balhorn, L., Schweidtmann, A.M.: Digitization of chemical process flow diagrams using deep convolutional neural networks. Digital Chem. Eng. 6, 100072 (2023)","journal-title":"Digital Chem. Eng."},{"key":"23_CR33","doi-asserted-by":"crossref","unstructured":"Wu, Z., Xiong, Y., Yu, S.X., Lin, D.: Unsupervised feature learning via non-parametric instance discrimination. In: CVPR, pp. 3733\u20133742 (2018)","DOI":"10.1109\/CVPR.2018.00393"},{"key":"23_CR34","doi-asserted-by":"crossref","unstructured":"Xiao, X., Li, Z., Zhao, S., Yang, L., Zhao, F., Ge, C.: Improved p &id symbol detection algorithm based on yolov5 network. In: SMC, pp. 120\u2013126 (2023)","DOI":"10.1109\/SMC53992.2023.10394450"},{"issue":"23","key":"23_CR35","doi-asserted-by":"publisher","first-page":"4425","DOI":"10.3390\/en12234425","volume":"12","author":"ES Yu","year":"2019","unstructured":"Yu, E.S., Cha, J.M., Lee, T., Kim, J., Mun, D.: Features recognition from piping and instrumentation diagrams in image format using a deep learning network. Energies 12(23), 4425 (2019)","journal-title":"Energies"}],"container-title":["Lecture Notes in Computer Science","Machine Learning and Knowledge Discovery in Databases. Research Track and Applied Data Science Track"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-662-72243-5_23","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,3]],"date-time":"2025-10-03T12:14:55Z","timestamp":1759493695000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-662-72243-5_23"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,4]]},"ISBN":["9783662722428","9783662722435"],"references-count":35,"URL":"https:\/\/doi.org\/10.1007\/978-3-662-72243-5_23","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"value":"0302-9743","type":"print"},{"value":"1611-3349","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,4]]},"assertion":[{"value":"4 October 2025","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"The authors have no competing interests to declare that\u00a0are relevant to the content of this article.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Disclosure of Interests"}},{"value":"ECML PKDD","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Joint European Conference on Machine Learning and Knowledge Discovery in Databases","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Porto","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Portugal","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2025","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"15 September 2025","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"19 September 2025","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"ecml2025","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/ecmlpkdd.org\/2025\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}}]}}