{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T12:59:02Z","timestamp":1780491542127,"version":"3.54.1"},"reference-count":181,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T00:00:00Z","timestamp":1770336000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>With artificial intelligence (AI) rapidly increasing in popularity and presence in everyday life, new applications utilizing AI are being explored across virtually all domains, from banking and healthcare to cybersecurity to generative AI for images, voice, and video content creation. With that trend comes an inherent need for increased AI capabilities. One cornerstone of AI applications is the ability of generative AI to consume documents and utilize their content to answer questions, generate new content, correlate it with other data sources, and more. No longer constrained to text alone, we now leverage multimodal AI models to help us understand visual elements within documents, such as images, tables, figures, and charts. Within this realm, capabilities have expanded exponentially from traditional Optical Character Recognition (OCR) approaches towards increasingly utilizing complex AI models for visual content analysis and understanding. Modern approaches, especially those leveraging AI, are now focusing on interpreting more complex diagrams such as flowcharts, block diagrams, Unified Modeling Language (UML) diagrams, electrical schematics, and timing diagrams. These diagram types combine text, symbols, and structured layout, making them challenging to parse and comprehend using conventional techniques. 
This paper presents a historical analysis and comprehensive survey of scientific literature exploring this domain of visual understanding of complex technical illustrations and diagrams. We explore the use of deep learning models, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformer-based architectures. These models, along with OCR, enable the extraction of both textual and structural information from visually complex sources. Despite these advancements, numerous challenges remain. These range from hallucinations, where the content extraction system produces outputs not grounded in the source image, which leads to misinterpretations, to a lack of contextual understanding of diagrammatic elements, such as arrows, grouping, and spatial hierarchy. This survey focuses on five key diagram types: flowcharts, block diagrams, UML diagrams, electrical schematics, and timing diagrams. It evaluates the effectiveness, limitations, and practical solutions\u2014both traditional and AI-driven\u2014that aim to enable the extraction of accurate and meaningful information from complex diagrams in a way that is trustworthy and suitable for real-world, high-accuracy AI applications. This survey reveals that virtually all approaches struggle with accurately extracting technical diagram information. It also illustrates a path forward. 
Pursuing research to further improve their accuracy is crucial for supporting and enabling various applications, including complex document question answering and Retrieval Augmented Generation (RAG), document-driven AI agents, accessibility applications, and automation.<\/jats:p>","DOI":"10.3390\/info17020165","type":"journal-article","created":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T11:10:38Z","timestamp":1770376238000},"page":"165","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Decoding Technical Diagrams: A Survey of AI Methods for Image Content Extraction and Understanding"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-3252-0015","authenticated-orcid":false,"given":"Nick","family":"Bray","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7091-8349","authenticated-orcid":false,"given":"Michael","family":"Hempel","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4510-8280","authenticated-orcid":false,"given":"Matthew","family":"Boeding","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6229-2043","authenticated-orcid":false,"given":"Hamid","family":"Sharif","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, 
USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2026,2,6]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3781","DOI":"10.1016\/j.procs.2024.09.178","article-title":"A Survey on RAG with LLMs","volume":"246","author":"Arslan","year":"2024","journal-title":"Procedia Comput. Sci."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"e1230","DOI":"10.1002\/cl2.1230","article-title":"PRISMA2020: An R package and Shiny app for producing PRISMA 2020-compliant flow diagrams, with interactivity for optimised digital transparency and Open Synthesis","volume":"18","author":"Haddaway","year":"2022","journal-title":"Campbell Syst. Rev."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"136","DOI":"10.1007\/s10462-024-10779-2","article-title":"A review of deep learning methods for digitisation of complex documents and engineering diagrams","volume":"57","author":"Jamieson","year":"2024","journal-title":"Artif. Intell. Rev."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Mittal, R., and Garg, A. (2020, January 15\u201317). Text extraction using OCR: A Systematic Review. Proceedings of the 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India.","DOI":"10.1109\/ICIRCA48905.2020.9183326"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Kumar, G., and Bhatia, P.K. (2014, January 8\u20139). A Detailed Review of Feature Extraction in Image Processing Systems. 
Proceedings of the 2014 Fourth International Conference on Advanced Computing & Communication Technologies, Rohtak, India.","DOI":"10.1109\/ACCT.2014.74"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1007\/s13244-018-0639-9","article-title":"Convolutional neural networks: An overview and application in radiology","volume":"9","author":"Yamashita","year":"2018","journal-title":"Insights Imaging"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"132306","DOI":"10.1016\/j.physd.2019.132306","article-title":"Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network","volume":"404","author":"Sherstinsky","year":"2020","journal-title":"Phys. D Nonlinear Phenom."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"032102","DOI":"10.1088\/2631-8695\/ad6ca7","article-title":"Application of novel hybrid deep learning architectures combining convolutional neural networks (CNN) and recurrent neural networks (RNN): Construction duration estimates prediction considering preconstruction uncertainties","volume":"6","author":"Demiss","year":"2024","journal-title":"Eng. Res. Express"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1109\/TPAMI.2022.3152247","article-title":"A Survey on Vision Transformer","volume":"45","author":"Han","year":"2023","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Li, M., Lv, T., Chen, J., Cui, L., Lu, Y., Florencio, D., Zhang, C., Li, Z., and Wei, F. (2023, January 7\u201314). Trocr: Transformer-based optical character recognition with pre-trained models. 
Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA.","DOI":"10.1609\/aaai.v37i11.26538"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"26146","DOI":"10.1109\/ACCESS.2019.2900753","article-title":"Transformer-based neural network for answer selection in question answering","volume":"7","author":"Shao","year":"2019","journal-title":"IEEE Access"},{"key":"ref_12","unstructured":"Lin, C.Y. (2004). Rouge: A package for automatic evaluation of summaries. Text Summarization Branches Out, Association for Computational Linguistics."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Saadany, H., and Orasan, C. (2021). BLEU, METEOR, BERTScore: Evaluation of metrics performance in assessing critical translation errors in sentiment-oriented text. arXiv.","DOI":"10.26615\/978-954-452-071-7_006"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1162\/coli_a_00322","article-title":"A structured review of the validity of BLEU","volume":"44","author":"Reiter","year":"2018","journal-title":"Comput. Linguist."},{"key":"ref_15","unstructured":"Ren, S., Guo, D., Lu, S., Zhou, L., Liu, S., Tang, D., Sundaresan, N., Zhou, M., Blanco, A., and Ma, S. (2020). Codebleu: A method for automatic evaluation of code synthesis. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Sellam, T., Das, D., and Parikh, A.P. (2020). BLEURT: Learning robust metrics for text generation. arXiv.","DOI":"10.18653\/v1\/2020.acl-main.704"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Yang, M., Zhu, J., Li, J., Wang, L., Qi, H., Li, S., and Daxin, L. (2008, January 18\u201321). Extending BLEU Evaluation Method with Linguistic Weight. Proceedings of the 2008 The 9th International Conference for Young Computer Scientists, Zhang Jia Jie, China.","DOI":"10.1109\/ICYCS.2008.362"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Henderson, P., and Ferrari, V. (2016). 
End-to-end training of object class detectors for mean average precision. Proceedings of the Asian Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-54193-8_13"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"5979","DOI":"10.1038\/s41598-022-09954-8","article-title":"On evaluation metrics for medical applications of artificial intelligence","volume":"12","author":"Hicks","year":"2022","journal-title":"Sci. Rep."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., and Savarese, S. (2019, January 15\u201320). Generalized intersection over union: A metric and a loss for bounding box regression. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Leena, C., and Ganesh, M. (2020, January 22\u201323). Generating Graph from 2D Flowchart using Region-Based Segmentation. Proceedings of the 2020 IEEE International Students\u2019 Conference on Electrical, Electronics and Computer Science (SCEECS), Bhopal, India.","DOI":"10.1109\/SCEECS48394.2020.165"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Vasudevan, B.G., Dhanapanichkul, S., and Balakrishnan, R. (2008, January 1\u20138). Flowchart knowledge extraction on image processing. Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong, China.","DOI":"10.1109\/IJCNN.2008.4634384"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Herrera-Camara, J.I., and Hammond, T. (2017, January 29\u201330). Flow2Code: From hand-drawn flowcharts to code execution. 
Proceedings of the Symposium on Sketch-Based Interfaces and Modeling, Los Angeles, CA, USA.","DOI":"10.1145\/3092907.3092909"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"21777","DOI":"10.37622\/IJAER\/10.9.2015.21777-21783","article-title":"Reviewing Otsu\u2019s method for image thresholding","volume":"10","author":"Bangare","year":"2015","journal-title":"Int. J. Appl. Eng. Res."},{"key":"ref_25","first-page":"655","article-title":"Region-based segmentation and object detection","volume":"Volume 22","author":"Bengio","year":"2009","journal-title":"Advances in Neural Information Processing Systems"},{"key":"ref_26","first-page":"1","article-title":"A review on OpenCV","volume":"3","author":"Mohamad","year":"2015","journal-title":"Teren. Univ. Malays. Teren."},{"key":"ref_27","unstructured":"Cheng, L., and Yang, Z. (2020). GRCNN: Graph Recognition Convolutional Neural Network for synthesizing programs from flow charts. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1145\/1925844.1926423","article-title":"Automating string processing in spreadsheets using input-output examples","volume":"46","author":"Gulwani","year":"2011","journal-title":"ACM Sigplan Not."},{"key":"ref_29","unstructured":"Gansner, E.R. (2009). Drawing Graphs with Graphviz, AT&T Bell Laboratories."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Chakraborty, S., Paul, S., and Masudul Ahsan, S.M. (2020, January 5\u20137). A Novel Approach to Rapidly Generate Document from Hand Drawn Flowcharts. Proceedings of the 2020 IEEE Region 10 Symposium (TENSYMP), Dhaka, Bangladesh.","DOI":"10.1109\/TENSYMP50017.2020.9231033"},{"key":"ref_31","unstructured":"Vogel, M., Warnecke, T., Bartelt, C., and Rausch, A. (2014, January 20\u201323). Scribbler\u2014Drawing models in a creative and collaborative environment: From hand-drawn sketches to domain specific models and vice versa. 
Proceedings of the Fifteenth Australasian User Interface Conference-Volume 150, Auckland, New Zealand."},{"key":"ref_32","unstructured":"Meng, W.K. (2016). Development of Program Flowchart Drawing Tool, Universiti Tunku Abdul Rahman."},{"key":"ref_33","first-page":"112","article-title":"Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Ramer-douglas-peucker algorithm","volume":"10","author":"Douglas","year":"1973","journal-title":"Cartogr. Int. J. Geogr. Inf. Geovisualization"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Scheidl, H., Fiel, S., and Sablatnig, R. (2018, January 5\u20138). Word beam search: A connectionist temporal classification decoding algorithm. Proceedings of the 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), Niagara Falls, NY, USA.","DOI":"10.1109\/ICFHR-2018.2018.00052"},{"key":"ref_35","unstructured":"Chakraborti, A., Naik, A., Pansare, A., and Pant, U. (2020). Extracting Flowchart Features into a Structured Representation, Mukesh Patel School of Technology Management and Engineering."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Chen, Q., Shi, D., Feng, G., Zhao, X., and Luo, B. (2015, January 24\u201326). On-line handwritten flowchart recognition based on logical structure and graph grammar. Proceedings of the 2015 5th International Conference on Information Science and Technology (ICIST), Changsha, China.","DOI":"10.1109\/ICIST.2015.7289009"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Supaartagorn, C. (2017, January 24\u201326). Web application for automatic code generator using a structured flowchart. Proceedings of the 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China.","DOI":"10.1109\/ICSESS.2017.8342876"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Adarsh, P., Rathi, P., and Kumar, M. (2020, January 6\u20137). 
YOLO v3-Tiny: Object Detection and Recognition using one stage improved model. Proceedings of the 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India.","DOI":"10.1109\/ICACCS48705.2020.9074315"},{"key":"ref_39","first-page":"110","article-title":"An overview of Darknet, rise and challenges and its assumptions","volume":"8","author":"Omar","year":"2020","journal-title":"Int. J. Comput. Sci. Inf. Technol."},{"key":"ref_40","unstructured":"Pebrianto, W., Mudjirahardjo, P., Pramono, S.H., and Setyawan, R.A. (2023). YOLOv3 with spatial pyramid pooling for object detection with unmanned aerial vehicles. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Liu, Z., Hu, X., Zhou, D., Li, L., Zhang, X., and Xiang, Y. (2022, January 7\u201311). Code generation from flowcharts with texts: A benchmark dataset and an approach. Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates.","DOI":"10.18653\/v1\/2022.findings-emnlp.449"},{"key":"ref_42","first-page":"52","article-title":"Raptor: Introducing programming to non-majors with flowcharts","volume":"19","author":"Carlisle","year":"2004","journal-title":"J. Comput. Sci. Coll."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1177\/01445987241269496","article-title":"Comparative study of long short-term memory (LSTM), bidirectional LSTM, and traditional machine learning approaches for energy consumption prediction","volume":"43","author":"Alizadegan","year":"2025","journal-title":"Energy Explor. Exploit."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Vrahatis, A.G., Lazaros, K., and Kotsiantis, S. (2024). Graph attention networks: A comprehensive review of methods and applications. Future Internet, 16.","DOI":"10.3390\/fi16090318"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Yin, P., and Neubig, G. (2018). 
TRANX: A transition-based neural abstract syntax parser for semantic parsing and code generation. arXiv.","DOI":"10.18653\/v1\/D18-2002"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3505244","article-title":"Transformers in vision: A survey","volume":"54","author":"Khan","year":"2022","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Shukla, S., Gatti, P., Kumar, Y., Yadav, V., and Mishra, A. (2023). Towards making flowchart images machine interpretable. Proceedings of the International Conference on Document Analysis and Recognition, Springer.","DOI":"10.1007\/978-3-031-41734-4_31"},{"key":"ref_48","unstructured":"Tannert, S., Feighelstein, M.G., Bogojeska, J., Shtok, J., Arbelle, A., Staar, P.W., Schumann, A., Kuhn, J., and Karlinsky, L. (2023). FlowchartQA: The first large-scale benchmark for reasoning over flowcharts. Proceedings of the 1st Workshop on Linguistic Insights from and for Multimodal Language Processing, Association for Computational Linguistics."},{"key":"ref_49","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2019, January 2\u20137). Bert: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA."},{"key":"ref_50","first-page":"37","article-title":"Recognition of handwritten flowcharts using convolutional neural networks","volume":"184","author":"Montellano","year":"2022","journal-title":"Int. J. Comput. 
Appl."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"993","DOI":"10.1016\/j.patcog.2014.08.027","article-title":"A survey of Hough Transform","volume":"48","author":"Mukhopadhyay","year":"2015","journal-title":"Pattern Recognit."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Wang, Y., Wang, W., Joty, S., and Hoi, S.C. (2021). Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. arXiv.","DOI":"10.18653\/v1\/2021.emnlp-main.685"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Baek, Y., Lee, B., Han, D., Yun, S., and Lee, H. (2019, January 15\u201320). Character region awareness for text detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00959"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Darda, A., and Jain, R. (2024). Code Generation from Flowchart using Optical Character Recognition & Large Language Model. Authorea Prepr.","DOI":"10.36227\/techrxiv.171392799.96378624\/v1"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"27027","DOI":"10.1007\/s11042-023-14346-9","article-title":"Matching of hand-drawn flowchart, pseudocode, and english description using transfer learning","volume":"82","author":"Ghosh","year":"2023","journal-title":"Multimed. Tools Appl."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Ray, S., Herrera-C\u00e1mara, J.I., Runyon, M., and Hammond, T. (2019). Flow2code: Transforming hand-drawn flowcharts into executable code to enhance learning. Inspiring Students with Digital Ink: Impact of Pen and Touch on Education, Springer.","DOI":"10.1007\/978-3-030-17398-2_6"},{"key":"ref_57","unstructured":"Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., and Bhosale, S. (2023). Llama 2: Open foundation and fine-tuned chat models. 
arXiv."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Reimers, N., and Gurevych, I. (2019). Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv.","DOI":"10.18653\/v1\/D19-1410"},{"key":"ref_59","unstructured":"Sanh, V., Debut, L., Chaumond, J., and Wolf, T. (2019). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Miyao, H., and Maruyama, R. (2012, January 18\u201320). On-Line Handwritten flowchart Recognition, Beautification and Editing System. Proceedings of the 2012 International Conference on Frontiers in Handwriting Recognition, Bari, Italy.","DOI":"10.1109\/ICFHR.2012.250"},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"2580","DOI":"10.1109\/TMM.2021.3087000","article-title":"Instance GNN: A Learning Framework for Joint Symbol Segmentation and Recognition in Online Handwritten Diagrams","volume":"24","author":"Yun","year":"2022","journal-title":"IEEE Trans. Multimed."},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Ye, J., Dash, A., Yin, W., and Wang, G. (May, January 29). Beyond end-to-end vlms: Leveraging intermediate text representations for superior flowchart understanding. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), Albuquerque, NM, USA.","DOI":"10.18653\/v1\/2025.naacl-long.180"},{"key":"ref_63","unstructured":"Omasa, T., Koshihara, R., and Morishige, M. (2025). Arrow-Guided VLM: Enhancing Flowchart Understanding via Arrow Direction Encoding. arXiv."},{"key":"ref_64","unstructured":"Soman, S., Ranjani, H., Roychowdhury, S., Sastry, V.D.S.N., Jain, A., Gangrade, P., and Khan, A. (2025). A Graph-based Approach for Multi-Modal Question Answering from Flowcharts in Telecom Documents. 
arXiv."},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Pan, H., Zhang, Q., Caragea, C., Dragut, E., and Latecki, L.J. (2024). Flowlearn: Evaluating large vision-language models on flowchart understanding. arXiv.","DOI":"10.3233\/FAIA240473"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Singh, S., Chaurasia, P., Varun, Y., Pandya, P., Gupta, V., Gupta, V., and Roth, D. (2024). FlowVQA: Mapping multimodal logic in visual question answering with flowcharts. arXiv.","DOI":"10.18653\/v1\/2024.findings-acl.78"},{"key":"ref_67","unstructured":"Anthropic (2024, October 16). Introducing Claude 3.5 Sonnet. Available online: https:\/\/www.anthropic.com\/news\/claude-3-5-sonnet."},{"key":"ref_68","unstructured":"OpenAI (2024, October 16). Hello GPT-4o. Available online: https:\/\/openai.com\/index\/hello-gpt-4o\/."},{"key":"ref_69","unstructured":"Wang, P., Bai, S., Tan, S., Wang, S., Fan, Z., Bai, J., Chen, K., Liu, X., Wang, J., and Ge, W. (2024). Qwen2-vl: Enhancing vision-language model\u2019s perception of the world at any resolution. arXiv."},{"key":"ref_70","doi-asserted-by":"crossref","unstructured":"Arbaz, A., Fan, H., Ding, J., Qiu, M., and Feng, Y. (2024). GenFlowchart: Parsing and understanding flowchart using generative AI. Proceedings of the International Conference on Knowledge Science, Engineering and Management, Springer.","DOI":"10.1007\/978-981-97-5492-2_8"},{"key":"ref_71","first-page":"1S","article-title":"A code automatic generation algorithm based on structured flowchart","volume":"6","author":"Wu","year":"2012","journal-title":"Appl. Math."},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Raghu, D., Agarwal, S., Joshi, S. (2021). End-to-end learning of flowchart grounded task-oriented dialogs. 
arXiv.","DOI":"10.18653\/v1\/2021.emnlp-main.357"},{"key":"ref_73","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., and Lo, W.Y. (2023, January 1\u20136). Segment anything. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"ref_74","unstructured":"Burges, C.J., Bottou, L., Welling, M., Ghahramani, Z., and Weinberger, K.Q. (2013). Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"64292","DOI":"10.1109\/ACCESS.2022.3183068","article-title":"FR-DETR: End-to-End Flowchart Recognition With Precision and Robustness","volume":"10","author":"Sun","year":"2022","journal-title":"IEEE Access"},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020). End-to-end object detection with transformers. Proceedings of the European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_77","doi-asserted-by":"crossref","unstructured":"Xu, Y., Xu, W., Cheung, D., and Tu, Z. (2021, January 20\u201325). Line segment detection using transformers without edges. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00424"},{"key":"ref_78","doi-asserted-by":"crossref","unstructured":"Bresler, M., Prua, D., and Hlav\u00e1c, V. (2013, January 25\u201328). Modeling flowchart structure recognition as a max-sum problem. 
Proceedings of the 2013 12th International Conference on Document Analysis and Recognition, Washington, DC, USA.","DOI":"10.1109\/ICDAR.2013.246"},{"key":"ref_79","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Awal, A.M., Feng, G., Mouchere, H., and Viard-Gaudin, C. (2011, January 23\u201327). First experiments on a new online handwritten flowchart database. Proceedings of the Document Recognition and Retrieval XVIII, San Francisco, CA, USA.","DOI":"10.1117\/12.876624"},{"key":"ref_81","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1007\/s10032-016-0269-z","article-title":"Online recognition of sketched arrow-connected diagrams","volume":"19","author":"Bresler","year":"2016","journal-title":"Int. J. Doc. Anal. Recognit. (IJDAR)"},{"key":"ref_82","doi-asserted-by":"crossref","unstructured":"Bresler, M., Van Phan, T., Prusa, D., and Nakagawa, M. (2014, January 1\u20134). Recognition system for on-line sketched diagrams. Proceedings of the 2014 14th International Conference on Frontiers in Handwriting Recognition, Crete, Greece.","DOI":"10.1109\/ICFHR.2014.100"},{"key":"ref_83","doi-asserted-by":"crossref","first-page":"487","DOI":"10.1007\/s10032-024-00506-6","article-title":"A method for analyzing handwritten program flowchart based on detection transformer and logic rules","volume":"28","author":"Wang","year":"2025","journal-title":"Int. J. Doc. Anal. Recognit. (IJDAR)"},{"key":"ref_84","doi-asserted-by":"crossref","unstructured":"Bresler, M., Pr\u016f\u0161a, D., and Hlav\u00e1\u010d, V. (2016, January 23\u201326). Recognizing Off-Line Flowcharts by Reconstructing Strokes and Using On-Line Recognition Techniques. 
Proceedings of the 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), Shenzhen, China.","DOI":"10.1109\/ICFHR.2016.0022"},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Ye, M., Zhang, J., Zhao, S., Liu, J., Du, B., and Tao, D. (2023, January 7\u201314). Dptext-detr: Towards better scene text detection with dynamic points in transformer. Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA.","DOI":"10.1609\/aaai.v37i3.25430"},{"key":"ref_86","doi-asserted-by":"crossref","unstructured":"Johri, A., Sharma, S., Chaudhary, V., Raj, G., and Khurana, S. (2025, January 29\u201331). Analysis & Modeling of Deep Learning Techniques for Flowchart to Code Generation. Proceedings of the 2025 International Conference on Networks and Cryptology (NETCRYPT), New Delhi, India.","DOI":"10.1109\/NETCRYPT65877.2025.11102763"},{"key":"ref_87","doi-asserted-by":"crossref","unstructured":"Otto, B., Aidarkhan, A., Ristin, M., Braunisch, N., Diedrich, C., van de Venn, H.W., and Wollschlaeger, M. (2025, January 10\u201313). Code and Test Generation for I4.0 State Machines with LLM-based Diagram Recognition. Proceedings of the 2025 IEEE 21st International Conference on Factory Communication Systems (WFCS), Rostock, Germany.","DOI":"10.1109\/WFCS63373.2025.11077624"},{"key":"ref_88","doi-asserted-by":"crossref","unstructured":"Carton, C., Lemaitre, A., and Co\u00fcasnon, B. (2013, January 25\u201328). Fusion of statistical and structural information for flowchart recognition. Proceedings of the 2013 12th International Conference on Document Analysis and Recognition, Washington, DC, USA.","DOI":"10.1109\/ICDAR.2013.245"},{"key":"ref_89","unstructured":"Lemaitre, A., Mouchere, H., Camillerapp, J., and Co\u00fcasnon, B. (2011). Interest of syntactic knowledge for on-line flowchart recognition. 
International Workshop on Graphics Recognition, Springer."},{"key":"ref_90","doi-asserted-by":"crossref","unstructured":"Suri, M., Mathur, P., Lipka, N., Dernoncourt, F., Rossi, R.A., Gupta, V., and Manocha, D. (2025). Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents. arXiv.","DOI":"10.18653\/v1\/2025.emnlp-main.1144"},{"key":"ref_91","unstructured":"Guyon, I., Von Luxburg, U., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (2017). Attention is all you need. Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_92","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10032-020-00361-1","article-title":"Arrow R-CNN for handwritten diagram recognition","volume":"24","author":"Keuper","year":"2021","journal-title":"Int. J. Doc. Anal. Recognit."},{"key":"ref_93","unstructured":"Xu, X., Jiang, Y., Chen, W., Huang, Y., Zhang, Y., and Sun, X. (2022). Damo-yolo: A report on real-time object detection design. arXiv."},{"key":"ref_94","doi-asserted-by":"crossref","unstructured":"Lewis, D., Agam, G., Argamon, S., Frieder, O., Grossman, D., and Heard, J. (2006, January 6\u201311). Building a test collection for complex document information processing. Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Seattle, WA, USA.","DOI":"10.1145\/1148170.1148307"},{"key":"ref_95","doi-asserted-by":"crossref","unstructured":"Abu-Aisheh, Z., Raveaux, R., Ramel, J.Y., and Martineau, P. (2015, January 10\u201312). An exact graph edit distance algorithm for solving pattern recognition problems. Proceedings of the 4th International Conference on Pattern Recognition Applications and Methods 2015, Lisbon, Portugal.","DOI":"10.5220\/0005209202710278"},{"key":"ref_96","unstructured":"Holm, H. (2021). 
Bidirectional Encoder Representations from Transformers (Bert) for Question Answering in the Telecom Domain: Adapting a Bert-like Language Model to the Telecom Domain Using the Electra Pre-Training Approach. [Master\u2019s Thesis, KTH Royal Institute of Technology]. Available online: https:\/\/urn.kb.se\/resolve?urn=urn:nbn:se:kth:diva-301313."},{"key":"ref_97","doi-asserted-by":"crossref","unstructured":"Xiao, S., Liu, Z., Zhang, P., Muennighoff, N., Lian, D., and Nie, J.Y. (2024, January 14\u201318). C-pack: Packed resources for general chinese embeddings. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington DC, USA.","DOI":"10.1145\/3626772.3657878"},{"key":"ref_98","unstructured":"Guernsey, G. (2025). Harnessing Large Language Models for Automated Software Diagram Generation. [Master\u2019s Thesis, University of Cincinnati]."},{"key":"ref_99","first-page":"1","article-title":"Conceptual modeling and large language models: Impressions from first experiments with ChatGPT","volume":"18","author":"Fill","year":"2023","journal-title":"Enterp. Model. Inf. Syst. Archit. (EMISAJ)"},{"key":"ref_100","unstructured":"Conrardy, A., and Cabot, J. (2024). From image to uml: First results of image based uml diagram generation using llms. arXiv."},{"key":"ref_101","unstructured":"Thomas, R., and Webb, B. (2023). Architectural views, 2023. Lecture Notes in Software Engineering, University of Queensland. Last updated 21 February 2025."},{"key":"ref_102","first-page":"123","article-title":"Challenges and advancements in recurrent neural networks for sequential data modeling","volume":"42","author":"Yan","year":"2023","journal-title":"J. Mach. Learn. Res."},{"key":"ref_103","unstructured":"Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., and Agarwal, S. (2020). Language models are few-shot learners. 
arXiv."},{"key":"ref_104","unstructured":"Pawar, S.S. (2025). Automated Code Generation from Flowcharts: A Multimodal Deep Learning Framework for Accurate Translation and Debugging. [Master\u2019s Thesis, Texas Tech University]."},{"key":"ref_105","doi-asserted-by":"crossref","unstructured":"Yuan, Z., Pan, H., and Zhang, L. (2008). A novel pen-based flowchart recognition system for programming teaching. Proceedings of the Workshop on Blended Learning, Springer.","DOI":"10.1007\/978-3-540-89962-4_6"},{"key":"ref_106","first-page":"167","article-title":"A Framework for Model-Based Code Generation from a Flowchart","volume":"2","author":"Hussein","year":"2013","journal-title":"Int. J. Comput. Acad. Res."},{"key":"ref_107","doi-asserted-by":"crossref","unstructured":"Bhushan, S., and Lee, M. (2022, January 20\u201323). Block diagram-to-text: Understanding block diagram images by generating natural language descriptors. Proceedings of the Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, Taipei, Taiwan.","DOI":"10.18653\/v1\/2022.findings-aacl.15"},{"key":"ref_108","unstructured":"Balaji, A., Ramanathan, T., and Sonathi, V. (2018). Chart-text: A fully automated chart image descriptor. arXiv."},{"key":"ref_109","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_110","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A. (2017, January 4\u20139). Inception-v4, inception-resnet and the impact of residual connections on learning. 
Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"ref_111","first-page":"1","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"21","author":"Raffel","year":"2020","journal-title":"J. Mach. Learn. Res."},{"key":"ref_112","doi-asserted-by":"crossref","unstructured":"Gardent, C., Shimorina, A., Narayan, S., and Perez-Beltrachini, L. (2017, January 4\u20137). The WebNLG challenge: Generating text from RDF data. Proceedings of the 10th International Conference on Natural Language Generation. ACL Anthology, Santiago de Compostela, Spain.","DOI":"10.18653\/v1\/W17-3518"},{"key":"ref_113","unstructured":"Ku, L.W., Martins, A., and Srikumar, V. (2024). Unveiling the Power of Integration: Block Diagram Summarization through Local-Global Fusion. Findings of the Association for Computational Linguistics ACL 2024, Association for Computational Linguistics."},{"key":"ref_114","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_115","doi-asserted-by":"crossref","first-page":"111431","DOI":"10.1016\/j.jss.2022.111431","article-title":"Automatically recognizing the semantic elements from UML class diagram images","volume":"193","author":"Chen","year":"2022","journal-title":"J. Syst. Softw."},{"key":"ref_116","doi-asserted-by":"crossref","unstructured":"Karasneh, B., and Chaudron, M.R. (2013, January 27\u201328). Extracting UML models from images. 
Proceedings of the 2013 5th International Conference on Computer Science and Information Technology, Amman, Jordan.","DOI":"10.1109\/CSIT.2013.6588776"},{"key":"ref_117","doi-asserted-by":"crossref","unstructured":"Ho-Quang, T., Chaudron, M.R., Sam\u00faelsson, I., Hjaltason, J., Karasneh, B., and Osman, H. (2014, January 1\u20134). Automatic classification of UML class diagrams from images. Proceedings of the 2014 21st Asia-Pacific Software Engineering Conference, Jeju, South Korea.","DOI":"10.1109\/APSEC.2014.65"},{"key":"ref_118","doi-asserted-by":"crossref","unstructured":"El-Salamony, M., and Guaily, A. (2020). Enhanced modified-polygon method for point-in-polygon problem. Proceedings of the Recent Advances in Engineering Mathematics and Physics: Proceedings of the International Conference RAEMP 2019, Springer.","DOI":"10.1007\/978-3-030-39847-7_4"},{"key":"ref_119","doi-asserted-by":"crossref","unstructured":"Weerasinghe, D., Thiwanka, K., Jayasith, H., Natalie, P.O., Rajapaksha, U.S., and Karunasena, A. (2022, January 9\u201310). Smart UML-Assignment Management Tool for UML Diagrams. Proceedings of the 2022 4th International Conference on Advancements in Computing (ICAC), Colombo, Sri Lanka.","DOI":"10.1109\/ICAC57685.2022.10025080"},{"key":"ref_120","doi-asserted-by":"crossref","unstructured":"Elallaoui, M., Nafil, K., and Touahni, R. (2015, January 20\u201321). Automatic generation of UML sequence diagrams from user stories in Scrum process. Proceedings of the 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA), Rabat, Morocco.","DOI":"10.1109\/SITA.2015.7358415"},{"key":"ref_121","doi-asserted-by":"crossref","unstructured":"Gulia, S., and Choudhury, T. (2016, January 14\u201315). An efficient automated design to generate UML diagram from Natural Language Specifications. 
Proceedings of the 2016 6th International Conference-Cloud System and Big Data Engineering (Confluence), Noida, India.","DOI":"10.1109\/CONFLUENCE.2016.7508197"},{"key":"ref_122","doi-asserted-by":"crossref","unstructured":"Fauzan, R., Siahaan, D., Rochimah, S., and Triandini, E. (2018, January 13\u201314). Class diagram similarity measurement: A different approach. Proceedings of the 2018 3rd International Conference on Information Technology, Information System and Electrical Engineering (ICITISEE), Yogyakarta, Indonesia.","DOI":"10.1109\/ICITISEE.2018.8721021"},{"key":"ref_123","doi-asserted-by":"crossref","unstructured":"Qiu, D., Li, H., and Sun, J. (2013, January 19\u201321). Measuring software similarity based on structure and property of class diagram. Proceedings of the 2013 Sixth International Conference on Advanced Computational Intelligence (ICACI), Hangzhou, China.","DOI":"10.1109\/ICACI.2013.6748477"},{"key":"ref_124","doi-asserted-by":"crossref","unstructured":"Wang, K., Liu, W., Mu, Y., and Gao, S. (2023, January 15\u201317). Automatic extraction of sequence diagram semantic information. Proceedings of the 2023 5th International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), Hangzhou, China.","DOI":"10.1109\/MLBDBI60823.2023.10481986"},{"key":"ref_125","first-page":"100660","article-title":"Unified modeling language code generation from diagram images using multimodal large language models","volume":"20","author":"Bates","year":"2025","journal-title":"Mach. Learn. Appl."},{"key":"ref_126","first-page":"34892","article-title":"Visual instruction tuning","volume":"36","author":"Liu","year":"2023","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"ref_127","first-page":"3","article-title":"Lora: Low-rank adaptation of large language models","volume":"1","author":"Hu","year":"2022","journal-title":"ICLR"},{"key":"ref_128","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TIP.2003.819861","article-title":"Image quality assessment: From error visibility to structural similarity","volume":"13","author":"Wang","year":"2004","journal-title":"IEEE Trans. Image Process."},{"key":"ref_129","doi-asserted-by":"crossref","unstructured":"Bhattacharya, A., Roy, S., Sarkar, N., Malakar, S., and Sarkar, R. (2020, January 28\u201329). Circuit component detection in offline handdrawn electrical\/electronic circuit diagram. Proceedings of the 2020 IEEE Calcutta Conference (CALCON), Kolkata, India.","DOI":"10.1109\/CALCON49167.2020.9106527"},{"key":"ref_130","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1109\/34.3898","article-title":"An automatic circuit diagram reader with loop-structure-based symbol recognition","volume":"10","author":"Okazaki","year":"1988","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_131","unstructured":"De Jesus, E.O., and Lotufo, R.D.A. (1998, January 20\u201323). ECIR-an electronic circuit diagram image recognizer. Proceedings of the SIBGRAPI\u201998 International Symposium on Computer Graphics, Image Processing, and Vision (Cat. No. 98EX237), Rio de Janeiro, Brazil."},{"key":"ref_132","doi-asserted-by":"crossref","unstructured":"Malakar, S., Halder, S., Sarkar, R., Das, N., Basu, S., and Nasipuri, M. (2012, January 28\u201329). Text line extraction from handwritten document pages using spiral run length smearing algorithm.
Proceedings of the 2012 International Conference on Communications, Devices and Intelligent Systems (CODIS), Kolkata, West Bengal, India.","DOI":"10.1109\/CODIS.2012.6422278"},{"key":"ref_133","doi-asserted-by":"crossref","unstructured":"Sertdemir, A.E., Besenk, M., Dalyan, T., Gokdel, Y.D., and Afacan, E. (2022, January 12\u201315). From Image to Simulation: An ANN-based Automatic Circuit Netlist Generator (Img2Sim). Proceedings of the 2022 18th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD), Villasimius, Italy.","DOI":"10.1109\/SMACD55068.2022.9816254"},{"key":"ref_134","first-page":"327","article-title":"Introduction to artificial neural network (ANN) methods: What they are and how to use them","volume":"41","author":"Zupan","year":"1994","journal-title":"Acta Chim. Slov."},{"key":"ref_135","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1504\/IJCAT.2020.103905","article-title":"Hand-drawn electronic component recognition using deep learning algorithm","volume":"62","author":"Wang","year":"2020","journal-title":"Int. J. Comput. Appl. Technol."},{"key":"ref_136","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1016\/j.procs.2016.04.064","article-title":"Hand drawn optical circuit recognition","volume":"84","author":"Rabbani","year":"2016","journal-title":"Procedia Comput. Sci."},{"key":"ref_137","doi-asserted-by":"crossref","unstructured":"G\u00fcnay, M., K\u00f6seo\u011flu, M., and Y\u0131ld\u0131r\u0131m, \u00d6. (2020, January 25\u201327). Classification of hand-drawn basic circuit components using convolutional neural networks. 
Proceedings of the 2020 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Ankara, Turkey.","DOI":"10.1109\/HORA49412.2020.9152866"},{"key":"ref_138","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1007\/s42979-022-01159-0","article-title":"Hand-drawn electrical circuit recognition using object detection and node recognition","volume":"3","author":"Rachala","year":"2022","journal-title":"SN Comput. Sci."},{"key":"ref_139","doi-asserted-by":"crossref","first-page":"31353","DOI":"10.1007\/s11042-020-09570-6","article-title":"Offline hand-drawn circuit component recognition using texture and shape-based features","volume":"79","author":"Roy","year":"2020","journal-title":"Multimed. Tools Appl."},{"key":"ref_140","doi-asserted-by":"crossref","first-page":"13367","DOI":"10.1007\/s00521-021-05964-1","article-title":"A two-stage CNN-based hand-drawn electrical and electronic circuit component recognition system","volume":"33","author":"Dey","year":"2021","journal-title":"Neural Comput. Appl."},{"key":"ref_141","doi-asserted-by":"crossref","first-page":"374","DOI":"10.18178\/ijmlc.2019.9.3.813","article-title":"Handwritten electric circuit diagram recognition: An approach based on finite state machine","volume":"9","author":"Dinesh","year":"2019","journal-title":"Int. J. Mach. Learn. Comput."},{"key":"ref_142","doi-asserted-by":"crossref","unstructured":"Alhalabi, M., Ghazal, M., Haneefa, F., Yousaf, J., and El-Baz, A. (2021). Smartphone handwritten circuits solver using augmented reality and capsule deep networks for engineering education. Educ. Sci., 11.","DOI":"10.3390\/educsci11110661"},{"key":"ref_143","unstructured":"Liu, Y., and Xiao, Y. (2013). 
Circuit Sketch Recognition, Department of Electrical Engineering Stanford University."},{"key":"ref_144","first-page":"24","article-title":"Hand-drawn digital logic circuit component recognition using svm","volume":"143","author":"Patare","year":"2016","journal-title":"Int. J. Comput. Appl."},{"key":"ref_145","unstructured":"Guyon, I., Von Luxburg, U., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (2017). Dynamic routing between capsules. Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_146","doi-asserted-by":"crossref","unstructured":"Bohara, B., and Krishnamoorthy, H.S. (2022, January 14\u201317). Computer Vision based Framework for Power Converter Identification and Analysis. Proceedings of the 2022 IEEE International Conference on Power Electronics, Drives and Energy Systems (PEDES), Jaipur, India.","DOI":"10.1109\/PEDES56012.2022.10080752"},{"key":"ref_147","doi-asserted-by":"crossref","unstructured":"Bayer, J., Turabi, S.H., and Dengel, A. (2023). Text extraction for handwritten circuit diagram images. Proceedings of the International Conference on Document Analysis and Recognition, Springer.","DOI":"10.1007\/978-3-031-41498-5_14"},{"key":"ref_148","doi-asserted-by":"crossref","unstructured":"Bayer, J., Roy, A.K., and Dengel, A. (2023). Instance segmentation based graph extraction for handwritten circuit diagram images. arXiv.","DOI":"10.5220\/0011752600003411"},{"key":"ref_149","unstructured":"Gao, M., Qiu, R., Chang, Z.H., Zhang, K., Wei, H., and Chen, H.C. (2025). Circuit Diagram Retrieval Based on Hierarchical Circuit Graph Representation. arXiv."},{"key":"ref_150","doi-asserted-by":"crossref","unstructured":"Mohan, A., Mohan, A., Indushree, B., Malavikaa, M., and Narendra, C. (2022, January 12\u201314). Generation of Netlist from a Hand drawn Circuit through Image Processing and Machine Learning. 
Proceedings of the 2022 2nd International Conference on Artificial Intelligence and Signal Processing (AISP), Vijayawada, India.","DOI":"10.1109\/AISP53593.2022.9760577"},{"key":"ref_151","unstructured":"Wang, C.Y., Yeh, I.H., and Liao, H.Y.M. (2021). You only learn one representation: Unified network for multiple tasks. arXiv."},{"key":"ref_152","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1016\/S0031-3203(02)00060-2","article-title":"The global k-means clustering algorithm","volume":"36","author":"Likas","year":"2003","journal-title":"Pattern Recognit."},{"key":"ref_153","doi-asserted-by":"crossref","unstructured":"Komoot, N., and Mruetusatorn, S. (2024, January 23\u201324). An Analysis of Electronic Circuits Diagram Using Computer Vision Techniques. Proceedings of the 2024 9th International Conference on Business and Industrial Research (ICBIR), Bangkok, Thailand.","DOI":"10.1109\/ICBIR61386.2024.10875940"},{"key":"ref_154","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1016\/j.procs.2023.01.032","article-title":"Hand-drawn electronic component recognition using orb","volume":"218","author":"Pavithra","year":"2023","journal-title":"Procedia Comput. Sci."},{"key":"ref_155","doi-asserted-by":"crossref","first-page":"61","DOI":"10.5194\/ars-22-61-2024","article-title":"From Schematics to Netlists\u2013Electrical Circuit Analysis Using Deep-Learning Methods","volume":"22","author":"Hemker","year":"2024","journal-title":"Adv. Radio Sci."},{"key":"ref_156","doi-asserted-by":"crossref","first-page":"016109","DOI":"10.1063\/5.0177755","article-title":"Digitizing images of electrical-circuit schematics","volume":"2","author":"Kelly","year":"2024","journal-title":"APL Mach. 
Learn."},{"key":"ref_157","first-page":"es44389-025-03676-4","article-title":"Hand-Drawn Electronic Component Detection and Simulation Using Deep Learning and Flask","volume":"2","author":"Arun","year":"2025","journal-title":"Cureus J."},{"key":"ref_158","doi-asserted-by":"crossref","unstructured":"Cao, W., Chen, Z., Wu, C., and Li, T. (2025). A Layered Framework for Universal Extraction and Recognition of Electrical Diagrams. Electronics, 14.","DOI":"10.3390\/electronics14050833"},{"key":"ref_159","first-page":"94539","article-title":"Hypothesis testing the circuit hypothesis in LLMs","volume":"37","author":"Shi","year":"2024","journal-title":"Advances in Neural Information Processing Systems"},{"key":"ref_160","doi-asserted-by":"crossref","first-page":"012029","DOI":"10.1088\/1757-899X\/1074\/1\/012029","article-title":"Design of implantable cardioverter defibrillator using low power subthreshold digital circuit","volume":"1074","author":"Nareshkumar","year":"2021","journal-title":"IOP Conference Series: Materials Science and Engineering"},{"key":"ref_161","doi-asserted-by":"crossref","unstructured":"Dieste-Velasco, M. (2021). Application of a Pattern-Recognition Neural Network for Detecting Analog Electronic Circuit Faults. Mathematics, 9.","DOI":"10.3390\/math9243247"},{"key":"ref_162","doi-asserted-by":"crossref","unstructured":"Zhou, Y., Qi, H., and Ma, Y. (2019, January 27\u201328). End-to-end wireframe parsing. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00105"},{"key":"ref_163","doi-asserted-by":"crossref","unstructured":"Mishra, D., and Vinayak, C. (2013, January 16\u201318). Image based Circuit Simulation. Proceedings of the 2013 International Conference on Control, Automation, Robotics and Embedded Systems (CARE), Jabalpur, India.","DOI":"10.1109\/CARE.2013.6733773"},{"key":"ref_164","unstructured":"Matas, J., Galambos, C., and Kittler, J.
(1998, January 14\u201317). Progressive probabilistic hough transform. Proceedings of the British Machine Vision Conference, Southampton, UK."},{"key":"ref_165","doi-asserted-by":"crossref","unstructured":"Neuner, M., Abel, I., and Graeb, H. (2021, January 1\u20135). Library-free Structure Recognition for Analog Circuits. Proceedings of the 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), Grenoble, France.","DOI":"10.23919\/DATE51398.2021.9474102"},{"key":"ref_166","first-page":"342","article-title":"A novel mobile application for circuit component identification and recognition through machine learning and image processing techniques","volume":"16","author":"Vasudevan","year":"2017","journal-title":"Int. J. Intell. Syst. Technol. Appl."},{"key":"ref_167","doi-asserted-by":"crossref","unstructured":"Pan, L., Xue, Z., and Zhang, K. (2025). GAML-YOLO: A Precise Detection Algorithm for Extracting Key Features from Complex Environments. Electronics, 14.","DOI":"10.3390\/electronics14132523"},{"key":"ref_168","first-page":"353","article-title":"A distance measure between attributed relational graphs for pattern recognition","volume":"3","author":"Sanfeliu","year":"2012","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_169","unstructured":"Gody, M. (2022, October 14). Hand-Drawn Electric Circuit Schematic Components. Available online: https:\/\/www.kaggle.com\/datasets\/moodrammer\/handdrawn-circuit-schematic-components."},{"key":"ref_170","doi-asserted-by":"crossref","unstructured":"Uzair, W., Chai, D., and Rassau, A. (2023). ElectroNet: An Enhanced Model for Small-Scale Object Detection in Electrical Schematic Diagrams. Res. Sq.","DOI":"10.21203\/rs.3.rs-3137489\/v1"},{"key":"ref_171","unstructured":"Cui, C., Sun, T., Lin, M., Gao, T., Zhang, Y., Liu, J., Wang, X., Zhang, Z., Zhou, C., and Liu, H. (2025). Paddleocr 3.0 technical report. 
arXiv."},{"key":"ref_172","doi-asserted-by":"crossref","unstructured":"Bi, R., Xu, T., Xu, M., and Chen, E. (2022, January 18\u201320). Paddlepaddle: A production-oriented deep learning platform facilitating the competency of enterprises. Proceedings of the 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC\/DSS\/SmartCity\/DependSys), Chengdu, China.","DOI":"10.1109\/HPCC-DSS-SmartCity-DependSys57074.2022.00046"},{"key":"ref_173","doi-asserted-by":"crossref","unstructured":"He, J., Ni\u010dkovi\u0107, D., Bartocci, E., and Grosu, R. (2023, January 9\u201313). TD-Magic: From Pictures of Timing Diagrams To Formal Specifications. Proceedings of the 2023 60th ACM\/IEEE Design Automation Conference (DAC), San Francisco, CA, USA.","DOI":"10.1109\/DAC56929.2023.10247685"},{"key":"ref_174","doi-asserted-by":"crossref","first-page":"104077","DOI":"10.1016\/j.compbiomed.2020.104077","article-title":"Deep learning for digitizing highly noisy paper-based ECG records","volume":"127","author":"Li","year":"2020","journal-title":"Comput. Biol. Med."},{"key":"ref_175","unstructured":"He, J., Kenbeek, V.T.W., Yang, Z., Qu, M., Bartocci, E., Ni\u010dkovi\u0107, D., and Grosu, R. (2025). TD-Interpreter: Enhancing the Understanding of Timing Diagrams with Visual-Language Learning. arXiv."},{"key":"ref_176","doi-asserted-by":"crossref","first-page":"27641","DOI":"10.1109\/ACCESS.2023.3258399","article-title":"Design and analysis of convolutional neural layers: A signal processing perspective","volume":"11","author":"Farag","year":"2023","journal-title":"IEEE Access"},{"key":"ref_177","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1016\/j.ins.2021.04.053","article-title":"RTFN: A robust temporal feature network for time series classification","volume":"571","author":"Xiao","year":"2021","journal-title":"Inf.
Sci."},{"key":"ref_178","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.ins.2020.03.039","article-title":"Knowledge extraction from deep convolutional neural networks applied to cyclo-stationary time-series classification","volume":"524","author":"Cabrera","year":"2020","journal-title":"Inf. Sci."},{"key":"ref_179","doi-asserted-by":"crossref","first-page":"112","DOI":"10.1109\/TSMC.2020.2968516","article-title":"Anomaly detection based on convolutional recurrent autoencoder for IoT time series","volume":"52","author":"Yin","year":"2020","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_180","doi-asserted-by":"crossref","unstructured":"Tuli, S., Casale, G., and Jennings, N.R. (2022). Tranad: Deep transformer networks for anomaly detection in multivariate time series data. arXiv.","DOI":"10.14778\/3514061.3514067"},{"key":"ref_181","doi-asserted-by":"crossref","first-page":"129024","DOI":"10.1016\/j.neucom.2024.129024","article-title":"LGAT: A novel model for multivariate time series anomaly detection with improved anomaly transformer and learning graph 
structures","volume":"617","author":"Wen","year":"2025","journal-title":"Neurocomputing"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/17\/2\/165\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T11:30:52Z","timestamp":1770377452000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/17\/2\/165"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,6]]},"references-count":181,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2026,2]]}},"alternative-id":["info17020165"],"URL":"https:\/\/doi.org\/10.3390\/info17020165","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,6]]}}}