{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T17:57:40Z","timestamp":1770832660724,"version":"3.50.1"},"reference-count":36,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T00:00:00Z","timestamp":1675123200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Softw. Eng. Methodol."],"published-print":{"date-parts":[[2023,1,31]]},"abstract":"<jats:p>The Bourne-again shell (Bash) is a prevalent scripting language for orchestrating shell commands and managing resources in Unix-like environments. It is one of the mainstream shell dialects that is available on most GNU Linux systems. However, the unique syntax and semantics of Bash could easily lead to unintended behaviors if carelessly used. Prior studies primarily focused on improving the reliability of Bash scripts or facilitating writing Bash scripts; there is yet no empirical study on the characteristics of Bash programs written in reality, e.g., frequently used language features, common code smells, and bugs.<\/jats:p>\n          <jats:p\/>\n          <jats:p>In this article, we perform a large-scale empirical study of Bash usage, based on analyses over one million open source Bash scripts found in Github repositories. We identify and discuss which features and utilities of Bash are most often used. Using static analysis, we find that Bash scripts are often error-prone, and the error-proneness has a moderately positive correlation with the size of the scripts. We also find that the most common problem areas concern quoting, resource management, command options, permissions, and error handling. We envision that these findings can be beneficial for learning Bash and future research that aims to improve shell and command-line productivity and reliability.<\/jats:p>","DOI":"10.1145\/3517193","type":"journal-article","created":{"date-parts":[[2022,4,23]],"date-time":"2022-04-23T11:22:32Z","timestamp":1650712952000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Bash in the Wild: Language Usage, Code Smells, and Bugs"],"prefix":"10.1145","volume":"32","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3205-9010","authenticated-orcid":false,"given":"Yiwen","family":"Dong","sequence":"first","affiliation":[{"name":"University of Waterloo, Waterloo, ON, Canada"}]},{"given":"Zheyang","family":"Li","sequence":"additional","affiliation":[{"name":"University of Waterloo, Waterloo, ON, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1644-2965","authenticated-orcid":false,"given":"Yongqiang","family":"Tian","sequence":"additional","affiliation":[{"name":"University of Waterloo, Waterloo, ON, Canada"}]},{"given":"Chengnian","family":"Sun","sequence":"additional","affiliation":[{"name":"University of Waterloo, Waterloo, ON, Canada"}]},{"given":"Michael W.","family":"Godfrey","sequence":"additional","affiliation":[{"name":"University of Waterloo, Waterloo, ON, Canada"}]},{"given":"Meiyappan","family":"Nagappan","sequence":"additional","affiliation":[{"name":"University of Waterloo, Waterloo, ON, Canada"}]}],"member":"320","published-online":{"date-parts":[[2023,2,13]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"[n.d.]. Advanced Bash-Scripting Guide . Retrieved June 2 2021 from https:\/\/tldp.org\/LDP\/abs\/html\/internalvariables.html."},{"key":"e_1_3_2_3_2","unstructured":"Mayank Agarwal Jorge J. Barroso Tathagata Chakraborti Eli M. Dow Kshitij Fadnis Borja Godoy Madhavan Pallan and Kartik Talamadupula. 2020. Project CLAI: Instrumenting the Command Line as a New Environment for AI Agents. arxiv:2002.00762 [cs.HC]. 10.48550\/arXiv.2002.00762"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3126905"},{"key":"e_1_3_2_5_2","first-page":"171","volume-title":"Proceedings of the 2011 33rd International Conference on Software Engineering (ICSE\u201911)","author":"Bhattacharya Pamela","year":"2011","unstructured":"Pamela Bhattacharya and Iulian Neamtiu. 2011. Assessing programming language impact on development and maintenance: A study on C and C++. In Proceedings of the 2011 33rd International Conference on Software Engineering (ICSE\u201911). 171\u2013180. 10.1145\/1985793.1985817"},{"key":"e_1_3_2_6_2","volume-title":"An Introduction to the UNIX Shell","author":"Bourne Stephen R.","year":"1978","unstructured":"Stephen R. Bourne. 1978. An Introduction to the UNIX Shell. Bell Laboratories. Computing Science."},{"issue":"398","key":"e_1_3_2_7_2","first-page":"424","article-title":"Scatterplot matrix techniques for large N","volume":"82","author":"Carr Daniel B.","year":"1987","unstructured":"Daniel B. Carr, Richard J. Littlefield, W. L. Nicholson, and J. S. Littlefield. 1987. Scatterplot matrix techniques for large N. J. Amer. Statist. Assoc. 82, 398 (1987), 424\u2013436.","journal-title":"J. Amer. Statist. Assoc."},{"key":"e_1_3_2_8_2","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1145\/502034.502042","volume-title":"Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP\u201901)","author":"Chou Andy","year":"2001","unstructured":"Andy Chou, Junfeng Yang, Benjamin Chelf, Seth Hallem, and Dawson Engler. 2001. An empirical study of operating systems errors. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP\u201901). Association for Computing Machinery, New York, NY, 73\u201388. 10.1145\/502034.502042"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1002\/spe.776"},{"key":"e_1_3_2_10_2","doi-asserted-by":"crossref","first-page":"582","DOI":"10.1145\/3106237.3106241","volume-title":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC\/FSE\u201917)","author":"D\u2019Antoni Loris","year":"2017","unstructured":"Loris D\u2019Antoni, Rishabh Singh, and Michael Vaughn. 2017. NoFAQ: Synthesizing command repairs from examples. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC\/FSE\u201917). Association for Computing Machinery, New York, NY, 582\u2013592. 10.1145\/3106237.3106241"},{"key":"e_1_3_2_11_2","first-page":"508","volume-title":"Proceedings of the 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER\u201915)","author":"Davis Ian J.","year":"2015","unstructured":"Ian J. Davis, Mike Wexler, Cheng Zhang, Richard. C. Holt, and Theresa Weber. 2015. Bash2py: A bash to Python translator. In Proceedings of the 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER\u201915). 508\u2013511. 10.1109\/SANER.2015.7081866"},{"key":"e_1_3_2_12_2","first-page":"574","volume-title":"Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE\u201918)","author":"Dutta Saikat","year":"2018","unstructured":"Saikat Dutta, Owolabi Legunsen, Zixin Huang, and Sasa Misailovic. 2018. Testing probabilistic programming systems. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC\/FSE\u201918). Association for Computing Machinery, New York, NY, 574\u2013586. 10.1145\/3236024.3236057"},{"key":"e_1_3_2_13_2","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1145\/2568225.2568295","volume-title":"Proceedings of the 36th International Conference on Software Engineering (ICSE\u201914)","author":"Dyer Robert","year":"2014","unstructured":"Robert Dyer, Hridesh Rajan, Hoan Anh Nguyen, and Tien N. Nguyen. 2014. Mining billions of AST nodes to study actual and potential usage of Java language features. In Proceedings of the 36th International Conference on Software Engineering (ICSE\u201914). Association for Computing Machinery, New York, NY, 779\u2013790. 10.1145\/2568225.2568295"},{"key":"e_1_3_2_14_2","unstructured":"Free Software Foundation. 2020. Bash. Retrieved February 2 2021 from https:\/\/www.gnu.org\/software\/bash\/."},{"key":"e_1_3_2_15_2","unstructured":"Free Software Foundation. 2020. GNU Bash Manual. Retrieved February 15 2021 from https:\/\/www.gnu.org\/software\/bash\/manual\/."},{"key":"e_1_3_2_16_2","unstructured":"Free Software Foundation. 2020. GNU Core Utilities. Retrieved February 15 2021 from https:\/\/www.gnu.org\/software\/coreutils\/."},{"key":"e_1_3_2_17_2","unstructured":"Github. 2020. The 2020 State of the Octoverse. Retrieved February 2 2021 from https:\/\/octoverse.github.com\/."},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3371111"},{"key":"e_1_3_2_19_2","unstructured":"Greg. 2021. Bash Pitfalls. Retrieved February 23 2021 from https:\/\/mywiki.wooledge.org\/BashPitfalls\/."},{"key":"e_1_3_2_20_2","first-page":"426","volume-title":"Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering (ESEC\/FSE\u201915)","author":"Gu Rui","year":"2015","unstructured":"Rui Gu, Guoliang Jin, Linhai Song, Linjie Zhu, and Shan Lu. 2015. What change history tells us about thread synchronization. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering (ESEC\/FSE\u201915). Association for Computing Machinery, New York, NY, 426\u2013438. 10.1145\/2786805.2786815"},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","first-page":"325","DOI":"10.1145\/2483760.2483786","volume-title":"Proceedings of the 2013 International Symposium on Software Testing and Analysis (ISSTA\u201913)","author":"Hills Mark","year":"2013","unstructured":"Mark Hills, Paul Klint, and Jurgen Vinju. 2013. An empirical study of PHP feature usage: A static analysis perspective. In Proceedings of the 2013 International Symposium on Software Testing and Analysis (ISSTA\u201913). Association for Computing Machinery, New York, NY, 325\u2013335. 10.1145\/2483760.2483786"},{"key":"e_1_3_2_22_2","unstructured":"Vidar Holen. 2021. ShellCheck. Retrieved February 2 2021 from https:\/\/www.shellcheck.net\/."},{"key":"e_1_3_2_23_2","first-page":"77","volume-title":"Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI\u201912)","author":"Jin Guoliang","year":"2012","unstructured":"Guoliang Jin, Linhai Song, Xiaoming Shi, Joel Scherpelz, and Shan Lu. 2012. Understanding and detecting real-world performance bugs. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI\u201912). Association for Computing Machinery, New York, NY, 77\u201388. 10.1145\/2254064.2254075"},{"key":"e_1_3_2_24_2","article-title":"Evolution of Shells in Linux","author":"Jones M.","year":"2011","unstructured":"M. Jones. 2011. Evolution of Shells in Linux. Retrieved April 11, 2021 from https:\/\/web.archive.org\/web\/20210411144653\/https:\/\/developer.ibm.com\/technologies\/linux\/tutorials\/l-linux-shells\/.","journal-title":"https:\/\/web.archive.org\/web\/20210411144653\/https:\/\/developer.ibm.com\/technologies\/linux\/tutorials\/l-linux-shells\/"},{"key":"e_1_3_2_25_2","doi-asserted-by":"crossref","first-page":"1317","DOI":"10.1145\/1982185.1982471","volume-title":"Proceedings of the 2011 ACM Symposium on Applied Computing (SAC\u201911)","author":"L\u00e4mmel Ralf","year":"2011","unstructured":"Ralf L\u00e4mmel, Ekaterina Pek, and J\u00fcrgen Starek. 2011. Large-scale, AST-based API-usage analysis of open-source Java projects. In Proceedings of the 2011 ACM Symposium on Applied Computing (SAC\u201911). Association for Computing Machinery, New York, NY, 1317\u20131324. 10.1145\/1982185.1982471"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.5732299"},{"key":"e_1_3_2_27_2","volume-title":"LREC: Language Resources and Evaluation Conference","author":"Lin Xi Victoria","year":"2018","unstructured":"Xi Victoria Lin, Chenglong Wang, Luke Zettlemoyer, and Michael D. Ernst. 2018. NL2Bash: A corpus and semantic parser for natural language interface to the Linux operating system. In LREC: Language Resources and Evaluation Conference."},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/2560012"},{"key":"e_1_3_2_29_2","first-page":"329","volume-title":"Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XIII)","author":"Lu Shan","year":"2008","unstructured":"Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. 2008. Learning from mistakes: A comprehensive study on real world concurrency bug characteristics. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XIII). Association for Computing Machinery, New York, NY, 329\u2013339. 10.1145\/1346281.1346323"},{"key":"e_1_3_2_30_2","first-page":"169","volume-title":"Proceedings of the 2nd International Conference on Software Engineering (ICSE\u201976)","author":"Mashey John R.","year":"1976","unstructured":"John R. Mashey. 1976. Using a command language as a high-level programming language. In Proceedings of the 2nd International Conference on Software Engineering (ICSE\u201976). IEEE Computer Society Press, Washington, DC, 169\u2013176."},{"key":"e_1_3_2_31_2","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1145\/1255329.1255347","volume-title":"Proceedings of the 2007 Workshop on Programming Languages and Analysis for Security (PLAS\u201907)","author":"Mazurak Karl","year":"2007","unstructured":"Karl Mazurak and Steve Zdancewic. 2007. ABASH: Finding bugs in bash scripts. In Proceedings of the 2007 Workshop on Programming Languages and Analysis for Security (PLAS\u201907). Association for Computing Machinery, New York, NY, 105\u2013114. 10.1145\/1255329.1255347"},{"key":"e_1_3_2_32_2","doi-asserted-by":"crossref","first-page":"763","DOI":"10.1145\/3385412.3386036","volume-title":"Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI\u201920)","author":"Qin Boqin","year":"2020","unstructured":"Boqin Qin, Yilun Chen, Zeming Yu, Linhai Song, and Yiying Zhang. 2020. Understanding memory and thread safety practices and issues in real-world rust programs. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI\u201920). Association for Computing Machinery, New York, NY, 763\u2013779. 10.1145\/3385412.3386036"},{"key":"e_1_3_2_33_2","first-page":"294","volume-title":"Proceedings of the 25th International Symposium on Software Testing and Analysis (ISSTA\u201916)","author":"Sun Chengnian","year":"2016","unstructured":"Chengnian Sun, Vu Le, Qirun Zhang, and Zhendong Su. 2016. Toward understanding compiler bugs in GCC and LLVM. In Proceedings of the 25th International Symposium on Software Testing and Analysis (ISSTA\u201916). Association for Computing Machinery, New York, NY, 294\u2013305. 10.1145\/2931037.2931074"},{"key":"e_1_3_2_34_2","unstructured":"Ubuntu. 2019. Bash-Builtins. Retrieved February 15 2021 from http:\/\/manpages.ubuntu.com\/manpages\/bionic\/man7\/bash-builtins.7.html."},{"key":"e_1_3_2_35_2","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1145\/3213846.3213866","volume-title":"Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA\u201918)","author":"Zhang Yuhao","year":"2018","unstructured":"Yuhao Zhang, Yifan Chen, Shing-Chi Cheung, Yingfei Xiong, and Lu Zhang. 2018. An empirical study on TensorFlow program bugs. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA\u201918). ACM, New York, NY, 129\u2013140. 10.1145\/3213846.3213866"},{"key":"e_1_3_2_36_2","doi-asserted-by":"crossref","first-page":"913","DOI":"10.1109\/ICSE.2015.101","volume-title":"2015 IEEE\/ACM 37th IEEE International Conference on Software Engineering","volume":"1","author":"Zhong Hao","year":"2015","unstructured":"Hao Zhong and Zhendong Su. 2015. An empirical study on real bug fixes. In 2015 IEEE\/ACM 37th IEEE International Conference on Software Engineering, Vol. 1. 913\u2013923. 10.1109\/ICSE.2015.101"},{"key":"e_1_3_2_37_2","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1016\/B978-0-12-804206-9.00027-1","volume-title":"Perspectives on Data Science for Software Engineering","author":"Zimmermann Thomas","year":"2016","unstructured":"Thomas Zimmermann. 2016. Card-sorting: From text to themes. In Perspectives on Data Science for Software Engineering. Elsevier, 137\u2013141."}],"container-title":["ACM Transactions on Software Engineering and Methodology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3517193","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3517193","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:31:29Z","timestamp":1750188689000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3517193"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1,31]]},"references-count":36,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,1,31]]}},"alternative-id":["10.1145\/3517193"],"URL":"https:\/\/doi.org\/10.1145\/3517193","relation":{},"ISSN":["1049-331X","1557-7392"],"issn-type":[{"value":"1049-331X","type":"print"},{"value":"1557-7392","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1,31]]},"assertion":[{"value":"2021-07-08","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-02-07","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-02-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}