{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,19]],"date-time":"2025-12-19T10:10:10Z","timestamp":1766139010401,"version":"3.41.0"},"reference-count":57,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2025,4,28]],"date-time":"2025-04-28T00:00:00Z","timestamp":1745798400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union\u2019s Horizon 2020","award":["819141"],"award-info":[{"award-number":["819141"]}]},{"DOI":"10.13039\/501100000266","name":"UK Engineering and Physical Sciences Research Council","doi-asserted-by":"crossref","award":["EP\/R006865\/1"],"award-info":[{"award-number":["EP\/R006865\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2025,5,31]]},"abstract":"<jats:p>\n            Grammar-based fuzzing is an effective method for testing programs that consume structured inputs, particularly input parsers. However, if the available grammar does not accurately represent the input format, or if the system under test (SUT) does not conform strictly to the grammar, there may be an impedance mismatch between inputs generated via grammars and inputs accepted by the SUT. Even if the SUT\n            <jats:italic>has<\/jats:italic>\n            been designed to strictly conform to the grammar, the SUT parser may exhibit vulnerabilities that would only be triggered by slightly invalid inputs. Grammar-based generation, by construction, will not yield such edge case inputs. To overcome these limitations, we present two mutational-based approaches:\n            <jats:sc>Gmutator<\/jats:sc>\n            and\n            <jats:sc>G+M<\/jats:sc>\n            . Both approaches are built upon\n            <jats:sc>Grammarinator<\/jats:sc>\n            , a grammar-based generator.\n            <jats:sc>Gmutator<\/jats:sc>\n            applies mutations to the grammar input of\n            <jats:sc>Grammarinator<\/jats:sc>\n            , while\n            <jats:sc>G+M<\/jats:sc>\n            directly applies byte-level mutations to\n            <jats:sc>Grammarinator<\/jats:sc>\n            -generated inputs. To evaluate the effectiveness of these techniques (\n            <jats:sc>Grammarinator<\/jats:sc>\n            ,\n            <jats:sc>Gmutator<\/jats:sc>\n            ,\n            <jats:sc>G+M<\/jats:sc>\n            ) in testing programs that parse various input formats, we conducted an experimental evaluation over four different input formats and twelve SUTs (three per input format). Our findings suggest that both\n            <jats:sc>Gmutator<\/jats:sc>\n            and\n            <jats:sc>G+M<\/jats:sc>\n            excel in generating edge case inputs, facilitating the detection of disparities between input specifications and parser implementations.\n          <\/jats:p>","DOI":"10.1145\/3708517","type":"journal-article","created":{"date-parts":[[2024,12,20]],"date-time":"2024-12-20T13:58:06Z","timestamp":1734703086000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Grammar Mutation for Testing Input Parsers"],"prefix":"10.1145","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2864-1892","authenticated-orcid":false,"given":"Bachir","family":"Bendrissou","sequence":"first","affiliation":[{"name":"Imperial College London, London, United Kingdom of Great Britain and Northern Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3599-7264","authenticated-orcid":false,"given":"Cristian","family":"Cadar","sequence":"additional","affiliation":[{"name":"Imperial College London, London, United Kingdom of Great Britain and Northern Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7448-7961","authenticated-orcid":false,"given":"Alastair F.","family":"Donaldson","sequence":"additional","affiliation":[{"name":"Imperial College London, London, United Kingdom of Great Britain and Northern Ireland"}]}],"member":"320","published-online":{"date-parts":[[2025,4,28]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"Antlr. 2020. ANTLR V4 Grammars. Retrieved May 7 2023 from https:\/\/github.com\/antlr\/grammars-v4"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.14722\/ndss.2019.23412"},{"key":"e_1_3_2_4_2","volume-title":"Proceedings of the 32nd USENIX Security Symposium (USENIX Security\u201923)","author":"Bars Nils","year":"2023","unstructured":"Nils Bars, Moritz Schloegel, Tobias Scharnowski, Schiller Nico, and Thorsten Holz. 2023. Fuzztruction: Using fault injection-based fuzzing to leverage implicit domain knowledge. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security\u201923)."},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3062341.3062349"},{"key":"e_1_3_2_6_2","unstructured":"Bachir Bendrissou. 2023. Accepting Invalid Array. Retrieved from https:\/\/github.com\/kgabis\/parson\/issues\/194"},{"key":"e_1_3_2_7_2","unstructured":"Bachir Bendrissou. 2023. Accepting Invalid Integers. Retrieved from https:\/\/github.com\/DaveGamble\/cJSON\/issues\/718"},{"key":"e_1_3_2_8_2","unstructured":"Bachir Bendrissou. 2023. Ambiguity in URL Grammar. Retrieved from https:\/\/github.com\/antlr\/grammars-v4\/pull\/3718"},{"key":"e_1_3_2_9_2","unstructured":"Bachir Bendrissou. 2023. EOF Not Enforced. Retrieved from https:\/\/github.com\/kgabis\/parson\/issues\/195"},{"key":"e_1_3_2_10_2","unstructured":"Bachir Bendrissou. 2023e. Fail to reject an invalid escape sequence. Retrieved from https:\/\/github.com\/boolangery\/py-lua-parser\/issues\/30"},{"key":"e_1_3_2_11_2","unstructured":"Bachir Bendrissou. 2023. Fail to Reject Invalid Escape Character. Retrieved from https:\/\/github.com\/DaveGamble\/cJSON\/issues\/736"},{"key":"e_1_3_2_12_2","unstructured":"Bachir Bendrissou. 2023. Fail to Reject Multiple Root Nodes. Retrieved from https:\/\/github.com\/NaturalIntelligence\/fast-xml-parser\/issues\/542"},{"key":"e_1_3_2_13_2","unstructured":"Bachir Bendrissou. 2023. Fail to reject unassigned global variable declaration. Retrieved from https:\/\/github.com\/boolangery\/py-lua-parser\/issues\/29"},{"key":"e_1_3_2_14_2","unstructured":"Bachir Bendrissou. 2023. Failure to Parse Chained Comparisons. Retrieved from https:\/\/github.com\/boolangery\/py-lua-parser\/issues\/56"},{"key":"e_1_3_2_15_2","unstructured":"Bachir Bendrissou. 2023. Failure to Reject Incorrect Inputs: Missing Function Call Arguments. Retrieved from https:\/\/github.com\/boolangery\/py-lua-parser\/issues\/50"},{"key":"e_1_3_2_16_2","unstructured":"Bachir Bendrissou. 2023. Failure to Reject Incorrect Inputs: Name Token. Retrieved from https:\/\/github.com\/boolangery\/py-lua-parser\/issues\/49"},{"key":"e_1_3_2_17_2","unstructured":"Bachir Bendrissou. 2023. Literal String Gets Parsed as Lua Code. Retrieved from https:\/\/github.com\/boolangery\/py-lua-parser\/issues\/51"},{"key":"e_1_3_2_18_2","unstructured":"Bachir Bendrissou. 2023. Parsing an Invalid XML Element. Retrieved from https:\/\/github.com\/NaturalIntelligence\/fast-xml-parser\/issues\/618"},{"key":"e_1_3_2_19_2","unstructured":"Bachir Bendrissou. 2023. Parsing Lua Long Comment as Short Comment. Retrieved from https:\/\/github.com\/antlr\/grammars-v4\/issues\/3741"},{"key":"e_1_3_2_20_2","unstructured":"Bachir Bendrissou. 2023. Underscore Symbol Not Allowed in XML NameStartChar. Retrieved from https:\/\/github.com\/antlr\/grammars-v4\/issues\/3758"},{"key":"e_1_3_2_21_2","unstructured":"Bachir Bendrissou. 2023. Validation of Invalid XML Declarations. Retrieved from https:\/\/github.com\/NaturalIntelligence\/fast-xml-parser\/issues\/616"},{"key":"e_1_3_2_22_2","unstructured":"Bachir Bendrissou. 2024. Semicolon Not Allowed in Userinfo. Retrieved from https:\/\/lists.gnu.org\/archive\/html\/bug-wget\/2024-06\/msg00005.html"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3605157.3605170"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3519939.3523716"},{"key":"e_1_3_2_25_2","unstructured":"CLOC - Count Lines of Code. 2024. CLOC - Count Lines of Code. Retrieved from http:\/\/cloc.sourceforge.net\/"},{"key":"e_1_3_2_26_2","unstructured":"Douglas Crockford. 2017. cjson. https:\/\/json.org"},{"key":"e_1_3_2_27_2","unstructured":"Eliott Dumeix. 2023. A Lua Parser and AST Builder Written in Python. Retrieved from https:\/\/github.com\/boolangery\/py-lua-parser"},{"key":"e_1_3_2_28_2","unstructured":"Krzysztof Gabis. 2023. Lightweight JSON library Written in C. Retrieved from https:\/\/github.com\/kgabis\/parson"},{"key":"e_1_3_2_29_2","unstructured":"Dave Gamble. 2023. Ultralightweight JSON Parser in ANSI C. Retrieved from https:\/\/github.com\/DaveGamble\/cJSON"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3409679"},{"key":"e_1_3_2_31_2","unstructured":"Amit Kumar Gupta. 2023. Fast XML Parser. Retrieved from https:\/\/github.com\/NaturalIntelligence\/fast-xml-parser"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3278186.3278193"},{"key":"e_1_3_2_33_2","volume-title":"Proceedings of the 21st USENIX Security Symposium (USENIX Security\u2019 12)","author":"Holler Christian","year":"2012","unstructured":"Christian Holler, Kim Herzig, and Andreas Zeller. 2012. Fuzzing with Code Fragments. In Proceedings of the 21st USENIX Security Symposium (USENIX Security\u2019 12)."},{"key":"e_1_3_2_34_2","unstructured":"Roberto Ierusalimschy Waldemar Celes and Luiz Henrique de Figueiredo. 2023. Lua. Retrieved from https:\/\/www.lua.org\/manual\/5.3\/manual.html"},{"key":"e_1_3_2_35_2","unstructured":"Arseny Kapoulkine. 2022. Light-Weight C++ XML Processing Library. Retrieved from https:\/\/pugixml.org"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE51524.2021.9678879"},{"key":"e_1_3_2_37_2","unstructured":"Daniel Lemire Geoff Langdale and John Keiser. 2023. Fast Parser for Large JSON Files. Retrieved from https:\/\/simdjson.org"},{"key":"e_1_3_2_38_2","unstructured":"LibFuzzer 2022. LibFuzzer Website. Retrieved from http:\/\/llvm.org\/docs\/LibFuzzer.html"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3314221.3314651"},{"key":"e_1_3_2_40_2","unstructured":"Jake Miller. 2021. An Exploration of JSON Interoperability Vulnerabilities. Retrieved May 12 2021 from https:\/\/labs.bishopfox.com\/tech-blog\/an-exploration-of-json-interoperability-vulnerabilities"},{"key":"e_1_3_2_41_2","unstructured":"Hrvoje Nik\u0161i\u0107. 2023. Network Utility to Retrieve Files from the World Wide Web. Retrieved from https:\/\/www.gnu.org\/software\/wget\/"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3293882.3330576"},{"key":"e_1_3_2_43_2","unstructured":"Mike Pall. 2022. Just-in-Time (JIT) Compiler for the Lua Programming Language. Retrieved from http:\/\/luajit.org"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.5555\/2501720"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2019.2941681"},{"issue":"3","key":"e_1_3_2_46_2","article-title":"A review on grammar-based fuzzing techniques","volume":"13","author":"Al Salem Hamad Ali","year":"2019","unstructured":"Hamad Ali Al Salem and Jia Song. 2019. A review on grammar-based fuzzing techniques. International Journal of Computer Science and Security 13, 3 (June 2019).","journal-title":"International Journal of Computer Science and Security"},{"key":"e_1_3_2_47_2","volume-title":"Proceedings of the 2012 USENIX Annual Technical Conference (USENIX ATC \u201912)","author":"Serebryany Konstantin","year":"2012","unstructured":"Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitry Vyukov. 2012. AddressSanitizer: A fast address sanity checker. In Proceedings of the 2012 USENIX Annual Technical Conference (USENIX ATC \u201912)."},{"key":"e_1_3_2_48_2","unstructured":"Nicolas Seriot. 2016. Parsing JSON Is a Minefield. Retrieved September 15 2020 from https:\/\/seriot.ch\/parsing_json.php"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/3460319.3464814"},{"key":"e_1_3_2_50_2","unstructured":"Daniel Stenberg. 2023. Command line tool and library for transferring data with URLs. Retrieved from https:\/\/curl.se"},{"key":"e_1_3_2_51_2","unstructured":"The GNOME Project. 2023. XML Toolkit Implemented in C. Retrieved from https:\/\/gitlab.gnome.org\/GNOME\/libxml2"},{"key":"e_1_3_2_52_2","unstructured":"Tatsuhiro Tsujikawa. 2021. Utility for Downloading Files. Retrieved from https:\/\/aria2.github.io"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/TR.2022.3171220"},{"key":"e_1_3_2_54_2","volume-title":"Extensible Markup Language (XML) 1.0","author":"W3C","year":"2008","unstructured":"W3C. 2008. Extensible Markup Language (XML) 1.0 (th ed.). Retrieved from https:\/\/www.w3.org\/TR\/xml\/"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE.2019.00081"},{"key":"e_1_3_2_56_2","unstructured":"Wikipedia. 2023. Robustness Principle. Retrieved from https:\/\/en.wikipedia.org\/wiki\/Robustness_principle"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1145\/1993498.1993532"},{"key":"e_1_3_2_58_2","unstructured":"Michal Zalewski. 2024. Technical \u201cWhitepaper\u201d for Afl-Fuzz. Retrieved from http:\/\/lcamtuf.coredump.cx\/afl\/technical_details.txt"}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3708517","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3708517","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:17:45Z","timestamp":1750295865000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3708517"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,28]]},"references-count":57,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,5,31]]}},"alternative-id":["10.1145\/3708517"],"URL":"https:\/\/doi.org\/10.1145\/3708517","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"type":"print","value":"1049-331X"},{"type":"electronic","value":"1557-7392"}],"subject":[],"published":{"date-parts":[[2025,4,28]]},"assertion":[{"value":"2023-08-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-12-21","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-28","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}