{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T20:20:03Z","timestamp":1770495603551,"version":"3.49.0"},"reference-count":60,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2025,11,25]],"date-time":"2025-11-25T00:00:00Z","timestamp":1764028800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>The Aho-Corasick (AC) algorithm remains one of the most influential developments in deterministic multi-pattern matching due to its ability to recognize multiple strings in linear time within a single data stream. Originally conceived for bibliographic text retrieval, the structure of the algorithm is based on a trie augmented with failure links and output functions, which has proven to be remarkably adaptable across computational domains. This review presents a comprehensive synthesis of the AC algorithm, with details on its theoretical foundations, formal automaton structure, and operational principles, as well as tracing its historical evolution from text-search systems to large-scale malware detection. This work further explores the integration of Aho-Corasick automata within modern antivirus architectures, describing mechanisms of signature compilation, real-time scanning pipelines, and large-scale deployment in contemporary cybersecurity systems. The deterministic structure of the Aho-Corasick automaton provides linear-time pattern recognition relative to input size, while practical performance characteristics reflect memory and architecture constraints in large signature sets. This linear-time property enables predictable and efficient malware detection, where each byte of input induces a constant computational cost. Such deterministic efficiency makes the algorithm ideally suited for real-time antivirus scanning and signature-based threat identification. Thus, nearly fifty years after its inception, AC continues to bridge formal automata theory and modern cybersecurity practice.<\/jats:p>","DOI":"10.3390\/a18120742","type":"journal-article","created":{"date-parts":[[2025,11,25]],"date-time":"2025-11-25T16:31:54Z","timestamp":1764088314000},"page":"742","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["The Aho-Corasick Paradigm in Modern Antivirus Engines: A Cornerstone of Signature-Based Malware Detection"],"prefix":"10.3390","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9350-1530","authenticated-orcid":false,"given":"Paul A.","family":"Gagniuc","sequence":"first","affiliation":[{"name":"Faculty of Engineering in Foreign Languages, National University of Science and Technology Politehnica Bucharest, RO-060042 Bucharest, Romania"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2651-7975","authenticated-orcid":false,"given":"Ionel-Bujorel","family":"P\u0103v\u0103loiu","sequence":"additional","affiliation":[{"name":"Faculty of Engineering in Foreign Languages, National University of Science and Technology Politehnica Bucharest, RO-060042 Bucharest, Romania"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8824-5704","authenticated-orcid":false,"given":"Maria-Iuliana","family":"Dasc\u0103lu","sequence":"additional","affiliation":[{"name":"Faculty of Engineering in Foreign Languages, National University of Science and Technology Politehnica Bucharest, RO-060042 Bucharest, Romania"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1145\/360825.360855","article-title":"Efficient string matching: An aid to bibliographic search","volume":"18","author":"Aho","year":"1975","journal-title":"Commun. ACM"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1137\/0206024","article-title":"Fast pattern matching in strings","volume":"6","author":"Knuth","year":"1977","journal-title":"SIAM J. Comput."},{"key":"ref_3","unstructured":"Kernighan, B.W., and Pike, R. (1984). The UNIX Programming Environment, Prentice-Hall."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Navarro, G., and Raffinot, M. (2002). Flexible Pattern Matching in Strings, Cambridge University Press.","DOI":"10.1017\/CBO9781316135228"},{"key":"ref_5","unstructured":"Charras, C., and Lecroq, T. (2004). Handbook of Exact String Matching Algorithms, King\u2019s College Publications."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Erdogan, T., Cao, J., Mazi\u00e8res, D., and Boneh, D. (2007). Hash-AV: Fast Virus Signature Matching by Cache-Resident Hashing, Stanford Applied Cryptography Group, Stanford University.","DOI":"10.1504\/IJSN.2007.012824"},{"key":"ref_7","unstructured":"Szor, P. (2005). The Art of Computer Virus Research and Defense, Addison-Wesley."},{"key":"ref_8","unstructured":"Tuck, N., Sherwood, T., Calder, B., and Varghese, G. (2004). Deterministic Memory-Efficient String Matching Algorithms for Intrusion Detection, IEEE INFOCOM."},{"key":"ref_9","unstructured":"Roesch, M. (1999, January 7\u201312). Snort: Lightweight intrusion detection for networks. Proceedings of the 13th USENIX Conference on System Administration, Seattle, WA, USA."},{"key":"ref_10","unstructured":"Vasiliadis, G., Antonatos, S., Polychronakis, M., Markatos, E.P., and Ioannidis, S. (2008, January 15\u201317). Gnort: High Performance Network Intrusion Detection Using Graphics Processors. Proceedings of the 11th International Symposium on Recent Advances in Intrusion Detection (RAID 2008), Cambridge, MA, USA."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Dimopoulos, V., Papaefstathiou, I., and Pnevmatikatos, D. (2007, January 16\u201319). A memory-efficient reconfigurable Aho-Corasick FSM implementation for intrusion detection systems. Proceedings of the 2007 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (ICSAMOS), Samos, Greece.","DOI":"10.1109\/ICSAMOS.2007.4285750"},{"key":"ref_12","first-page":"1","article-title":"Learning from examples using the Aho-Corasick algorithm","volume":"80","author":"Blumer","year":"1989","journal-title":"Inf. Comput."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Lin, C.-H., Tsai, S.-Y., Liu, C.-H., Chang, S.-C., and Shyu, J.-M. (2010). Accelerating string matching using multi-threaded algorithm on GPU. IEEE GLOBECOM Workshops, IEEE Communications Society.","DOI":"10.1109\/GLOCOM.2010.5683320"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Kim, H., and Choi, K.-I. (2016). A Pipelined Non-Deterministic Finite Automaton-Based String Matching Scheme Using Merged State Transitions in an FPGA. PLoS ONE, 11.","DOI":"10.1371\/journal.pone.0163535"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Gagniuc, P.A. (2024). Antivirus Engines: From Methods to Innovations, Design, and Applications, Elsevier Syngress.","DOI":"10.1016\/B978-0-443-32952-4.00018-8"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1016\/0020-0190(85)90088-2","article-title":"Incremental string matching","volume":"21","author":"Meyer","year":"1985","journal-title":"Inf. Process. Lett."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/0304-3975(94)90176-7","article-title":"Dynamic dictionary matching with failure functions","volume":"131","author":"Idury","year":"1994","journal-title":"Theor. Comput. Sci."},{"key":"ref_18","unstructured":"Amir, A., Farach, M., Idury, R.M., La Poutr\u00e9, J.A., and Sch\u00e4ffer, A.A. (1993, January 25\u201327). Improved dynamic dictionary matching. Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA \u201893), Society for Industrial and Applied Mathematics USA, Austin, TX, USA."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1006\/inco.1995.1090","article-title":"Improved dynamic dictionary matching","volume":"119","author":"Amir","year":"1995","journal-title":"Inf. Comput."},{"key":"ref_20","unstructured":"Zha, X., and Sahni, S. (2008, January 6\u20139). Highly compressed Aho-Corasick automata for efficient intrusion detection. Proceedings of the 2008 IEEE Symposium on Computers and Communications, Marrakech, Morocco."},{"key":"ref_21","first-page":"10","article-title":"Compressed Multiple Pattern Matching","volume":"151","author":"Kosolobov","year":"2019","journal-title":"Leibniz Int. Proc. Inform. (LIPIcs. CPM 2019)"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1145\/1132462.1132464","article-title":"Bit-split string-matching engines for intrusion detection and prevention","volume":"3","author":"Tan","year":"2006","journal-title":"ACM Trans. Arch. Code Optim."},{"key":"ref_23","first-page":"10","article-title":"A memory-efficient pipelined implementation of the Aho-Corasick string-matching algorithm","volume":"7","author":"Pao","year":"2010","journal-title":"ACM Trans. Arch. Code Optim."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1906","DOI":"10.1109\/TC.2012.254","article-title":"Accelerating Pattern Matching Using a Novel Parallel Algorithm on GPUs","volume":"62","author":"Lin","year":"2012","journal-title":"IEEE Trans. Comput."},{"key":"ref_25","first-page":"110","article-title":"AC-Automaton Update Algorithm for Semi-dynamic Dictionary Matching","volume":"9954","author":"Diptarama","year":"2016","journal-title":"Int. Symp. String Process. Inf. Retr."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Song, T., Zhang, W., Wang, D.S., and Xue, Y.B. (2008, January 13\u201318). A Memory Efficient Multiple Pattern Matching Architecture for Network Security. Proceedings of the IEEE INFOCOM, Phoenix, AZ, USA.","DOI":"10.1109\/INFOCOM.2007.42"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Kanda, K., Akabe, K., and Oda, Y. (2022). Engineering faster double-array Aho-Corasick automata. arXiv.","DOI":"10.1002\/spe.3190"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Herrero, \u00c1., Sn\u00e1\u0161el, V., Abraham, A., Zelinka, I., Baruque, B., Quinti\u00e1n, H., Calvo, J.L., Sedano, J., and Corchado, E. (2013). Hybrid Compression of the Aho-Corasick Automaton for Static Analysis in Intrusion Detection Systems. International Joint Conference CISIS\u201912-ICEUTE\u201912-SOCO\u201912 Special Sessions. Advances in Intelligent Systems and Computing, Springer.","DOI":"10.1007\/978-3-642-33018-6"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1016\/j.ipm.2006.04.004","article-title":"A Compact Static Double-Array Keeping Character Codes","volume":"43","author":"Yata","year":"2007","journal-title":"Inf. Process. Manag."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1007\/s10115-015-0873-0","article-title":"A compression method of double-array structures using linear functions","volume":"48","author":"Kanda","year":"2015","journal-title":"Knowl. Inf. Syst."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Herrero, \u00c1., Sn\u00e1\u0161el, V., Abraham, A., Zelinka, I., Baruque, B., Quinti\u00e1n, H., Calvo, J.L., Sedano, J., and Corchado, E. (2014). Real-Time Polymorphic Aho-Corasick Automata for Heterogeneous Malicious Code Detection. International Joint Conference SOCO\u201913-CISIS\u201913-ICEUTE\u201913. Advances in Intelligent Systems and Computing, Springer.","DOI":"10.1007\/978-3-319-01854-6"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"24188","DOI":"10.3390\/s141224188","article-title":"A Malicious Pattern Detection Engine for Embedded Systems","volume":"14","author":"Oh","year":"2014","journal-title":"Sensors"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Gazzan, M., Alobaywi, B., Almutairi, M., and Sheldon, F.T. (2025). A Deep Learning Framework for Enhanced Detection of Polymorphic Ransomware. Future Internet, 17.","DOI":"10.3390\/fi17070311"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"5371","DOI":"10.1109\/ACCESS.2020.3048319","article-title":"Tight arms race: Overview of current malware threats and trends in their detection","volume":"9","author":"Caviglione","year":"2020","journal-title":"IEEE Access"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Belazzougui, D. (2010). Succinct Dictionary Matching with No Slowdown. arXiv.","DOI":"10.1007\/978-3-642-13509-5_9"},{"key":"ref_36","unstructured":"Bellekens, X., Seeam, A., and Tachtatzis, C. (2017). Atkinson, Trie Compression for GPU-Accelerated Multi-Pattern Matching. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Al-Asli, M., and Ghaleb, T.A. (2019, January 3\u20134). Review of Signature-based Techniques in Antivirus Products. Proceedings of the 2019 International Conference on Computer and Information Sciences (ICCIS), Sakaka, Saudi Arabia.","DOI":"10.1109\/ICCISci.2019.8716381"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Kirda, E., Jha, S., and Balzarotti, D. (2009). Automatic Generation of String Signatures for Malware Detection. Recent Advances in Intrusion Detection. RAID 2009, Springer. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-642-04342-0"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.tcs.2018.04.016","article-title":"Efficient Dynamic Dictionary Matching with DAWGs and AC-automata","volume":"792","author":"Hendrian","year":"2019","journal-title":"Theor. Comput. Sci."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1016\/j.future.2017.02.002","article-title":"Eagle+: A fast incremental approach to automaton and table online up-dates for cloud services","volume":"80","author":"Li","year":"2018","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Tumeo, A. (2012). Hardware Architectures for Data-Intensive Computing Problems: A Case Study for String Matching. Data-Intensive Computing: Architectures, Algorithms and Applications, Cambridge University Press.","DOI":"10.1017\/CBO9780511844409.003"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"290","DOI":"10.3844\/jcssp.2017.290.300","article-title":"On Improving Antivirus Scanning Engines: Memory On-Access Scanner","volume":"13","year":"2017","journal-title":"J. Comput. Sci."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"\u00c7elebi, M. (2023). Accelerating Pattern Matching Using a Novel Multi-Pattern-Matching Algorithm for Deep Packet Inspection. Appl. Sci., 13.","DOI":"10.3390\/app13148104"},{"key":"ref_44","first-page":"185","article-title":"Multi-pattern matching algorithm for embedded computer network intrusion detection systems","volume":"18","author":"Cai","year":"2024","journal-title":"Intell. Decis. Technol."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Huang, N.F., Chu, Y.M., Hsieh, C.Y., Tsai, C.H., and Tzang, Y.J. (2007, January 24\u201328). A deterministic cost-effective string matching algorithm for network intrusion detection system. Proceedings of the 2007 IEEE International Conference on Communications (ICC), Glasgow, UK.","DOI":"10.1109\/ICC.2007.218"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Hsu, F.H., Lee, C.H., Luo, T., Chang, T.C., and Wu, M.H. (2019). A Cloud-Based Real-Time Mechanism to Protect End Hosts against Malware. Appl. Sci., 9.","DOI":"10.3390\/app9183748"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Jha, S., Sommer, R., and Kreibich, C. (2010). GrAVity: A Massively Parallel Antivirus Engine. Recent Advances in Intrusion Detection. RAID 2010, Springer. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-642-15512-3"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Terawi, N., Ashqar, H.I., Darwish, O., Alsobeh, A., Zahariev, P., and Tashtoush, Y. (2025). Enhanced Detection of Intrusion Detection System in Cloud Networks Using Time-Aware and Deep Learning Techniques. Computers, 14.","DOI":"10.3390\/computers14070282"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Song, H.H., Di Pietro, R., Alrabaee, S., Tubishat, M., Al-kfairy, M., and Alfandi, O. (2025). Identifying the Origins of Business Data Breaches Through CTC Detection. Network and System Security. NSS 2024, Springer. Lecture Notes in Computer Science.","DOI":"10.1007\/978-981-96-3531-3"},{"key":"ref_50","unstructured":"Chiril\u0103, A.I., S\u0103racin, C.G., Deaconu, I.D., Nicolescu, D.\u0218., and Radulian, A. (2024). Remote Monitoring and Control System with Increased Operational Technology Cybersecurity Resilience. U.P.B. Sci. Bull. Ser. C, 86."},{"key":"ref_51","unstructured":"Ghi\u021b\u0103, A.A., Chiroiu, M.D., and \u021aurcanu, D. (2025). Evaluating Students\u2019 Performance in Cybersecurity Scenarios Using Binary Trees. U.P.B. Sci. Bull. Ser. C, 87."},{"key":"ref_52","unstructured":"Hilgurt, S.Y. (2021, January 16\u201318). A Survey on Hardware Solutions for Signature-Based Security Systems. Proceedings of the 1st International Workshop on Information Technologies: Theoretical and Applied Problems (ITTAP 2021), Ternopil, Ukraine."},{"key":"ref_53","first-page":"563","article-title":"Multiple pattern matching: Survey and experimental results","volume":"22","author":"Kouzinopoulos","year":"2014","journal-title":"Neural Parallel Sci. Comput."},{"key":"ref_54","first-page":"104660","article-title":"A survey on security applications with SmartNICs: Taxonomy, challenges and future directions","volume":"207","author":"Elizaldea","year":"2025","journal-title":"J. Netw. Comput. Appl."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"563","DOI":"10.12694\/scpe.v20i3.1572","article-title":"SIMD Implementation of the Aho-Corasick Algorithm Using Intel\u00ae AVX2","volume":"20","author":"Ourlis","year":"2019","journal-title":"Scalable Comput. Pract. Exp."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Scarpazza, D.P., Villa, O., and Petrini, F. (2007, January 26\u201330). Peak-Performance DFA-Based String Matching on the Cell Processor. Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), Long Beach, CA, USA.","DOI":"10.1109\/IPDPS.2007.370634"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Kouzinopoulos, C.S., and Margaritis, K.G. (2009, January 10\u201312). String Matching on a Multicore GPU Using CUDA. Proceedings of the PCI \u201909, 2009 13th Panhellenic Conference on Informatics, Corfu, Greece.","DOI":"10.1109\/PCI.2009.47"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Lee, C.-L., Lin, Y.-S., and Chen, Y.-C. (2015). A hybrid CPU\/GPU pattern-matching algorithm for deep packet inspection. PLoS ONE, 10.","DOI":"10.1371\/journal.pone.0139301"},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1016\/j.jpdc.2015.11.001","article-title":"A high-throughput DPI engine on GPU via algorithm\/implementation co-optimization","volume":"88","author":"Hsieh","year":"2016","journal-title":"J. Parallel Distrib. Comput."},{"key":"ref_60","unstructured":"Ce\u0161ka, M., Havlena, V., Hol\u00edk, L., Korenek, J., Leng\u00e1l, O., Matou\u0161ek, D., Matou\u0161ek, J., Semric, J., and Vojnar, T. (May, January 28). Deep packet inspection in FPGAs via approximate nondeterministic automata. Proceedings of the IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), San Diego, CA, USA."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/12\/742\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,25]],"date-time":"2025-11-25T16:35:57Z","timestamp":1764088557000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/12\/742"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,25]]},"references-count":60,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["a18120742"],"URL":"https:\/\/doi.org\/10.3390\/a18120742","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,25]]}}}