{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T14:32:25Z","timestamp":1754145145207,"version":"3.41.2"},"reference-count":54,"publisher":"Association for Computing Machinery (ACM)","issue":"ISSTA","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2025,6,22]]},"abstract":"<jats:p>Containerization has revolutionized software deployment, with Docker leading the way due to its ease of use and consistent runtime environment. As Docker usage grows, optimizing Dockerfile performance, particularly by reducing rebuild time, has become essential for maintaining efficient CI\/CD pipelines. However, existing optimization approaches primarily address single builds without considering the recurring rebuild costs associated with modifications and evolution, limiting long-term efficiency gains. To bridge this gap, we present Doctor, a method for improving Dockerfile build efficiency through instruction re-ordering that addresses key challenges: identifying instruction dependencies, predicting future modifications, ensuring behavioral equivalence, and managing the optimization\u2019s computational complexity. We developed a comprehensive dependency taxonomy based on Dockerfile syntax and a historical modification analysis to prioritize frequently modified instructions. Using a weighted topological sorting algorithm, Doctor optimizes instruction order to minimize future rebuild time while maintaining functionality. Experiments on 2,000 GitHub repositories show that Doctor improves 92.75% of Dockerfiles, reducing rebuild time by an average of 26.5%, with 12.82% of files achieving over a 50% reduction. Notably, 86.2% of cases preserve functional similarity. These findings highlight best practices for Dockerfile management, enabling developers to enhance Docker efficiency through informed optimization strategies.<\/jats:p>","DOI":"10.1145\/3728870","type":"journal-article","created":{"date-parts":[[2025,6,22]],"date-time":"2025-06-22T10:52:56Z","timestamp":1750589576000},"page":"1-23","source":"Crossref","is-referenced-by-count":0,"title":["Doctor: Optimizing Container Rebuild Efficiency by Instruction Re-orchestration"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-0859-0129","authenticated-orcid":false,"given":"Zhiling","family":"Zhu","sequence":"first","affiliation":[{"name":"Zhejiang University of Technology, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4664-3311","authenticated-orcid":false,"given":"Tieming","family":"Chen","sequence":"additional","affiliation":[{"name":"Zhejiang University of Technology, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1175-2753","authenticated-orcid":false,"given":"Chengwei","family":"Liu","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-8384-7933","authenticated-orcid":false,"given":"Han","family":"Liu","sequence":"additional","affiliation":[{"name":"Hong Kong University of Science and Technology, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6137-0767","authenticated-orcid":false,"given":"Qijie","family":"Song","sequence":"additional","affiliation":[{"name":"Zhejiang University of Technology, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8390-7518","authenticated-orcid":false,"given":"Zhengzi","family":"Xu","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7300-9215","authenticated-orcid":false,"given":"Yang","family":"Liu","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore, Singapore"}]}],"member":"320","published-online":{"date-parts":[[2025,6,22]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2023. docker system df. https:\/\/docs.docker.com\/engine\/reference\/commandline\/system_df\/ (Accessed on 12\/14\/2023)"},{"key":"e_1_2_1_2_1","unstructured":"2023. docker system prune. https:\/\/docs.docker.com\/engine\/reference\/commandline\/system_prune\/ (Accessed on 12\/14\/2023)"},{"key":"e_1_2_1_3_1","unstructured":"2023. Dockerfile reference. https:\/\/docs.docker.com\/engine\/reference\/builder\/####dockerfile-reference (Accessed on 12\/14\/2023)"},{"key":"e_1_2_1_4_1","unstructured":"2024. Application Container Market Size and Share Analysis - Growth Trends and Forecasts (2023 - 2028) Source: https:\/\/www.mordorintelligence.com\/industry-reports\/application-container-market.. https:\/\/www.mordorintelligence.com\/industry-reports\/application-container-market (Accessed on 10\/30\/2023)"},{"key":"e_1_2_1_5_1","unstructured":"2024. Cache | Docker Docs.. https:\/\/docs.docker.com\/build\/cache\/ (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_6_1","unstructured":"2024. The dash shell as a linkable library.. https:\/\/github.com\/binpash\/libdash (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_7_1","unstructured":"2024. Docker should include a formal grammar for Dockerfile \u00b7 Issue #12221 \u00b7 moby\/moby. https:\/\/github.com\/moby\/moby\/issues\/12221 (Accessed on 10\/29\/2024)"},{"key":"e_1_2_1_8_1","unstructured":"2024. Dockerfile reference | Docker Docs.. https:\/\/docs.docker.com\/reference\/dockerfile\/ (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_9_1","unstructured":"2024. Dockerfile tips and tricks.. https:\/\/medium.com\/@andreajrubino\/dockerfile-tips-and-tricks-58e61d69e41b (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_10_1","unstructured":"2024. Doctor Home.. https:\/\/sites.google.com\/view\/doctor-dockerfile-optimization\/home (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_11_1","unstructured":"2024. hadolint: Dockerfile linter validate inline bash written in Haskell.. https:\/\/github.com\/hadolint\/hadolint (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_12_1","unstructured":"2024. Optimize cache usage in builds.. https:\/\/docs.docker.com\/build\/cache\/optimize\/ (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_13_1","unstructured":"2024. Optimizing Your Dockerfile.. https:\/\/medium.com\/@esotericmeans\/optimizing-your-dockerfile-dc4b7b527756 (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_14_1","unstructured":"2024. Realm is a mobile database: a replacement for SQLite & ORMs. https:\/\/github.com\/realm\/realm-java\/ (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_15_1","unstructured":"2024. This repository contains the source code for the paper First Order Motion Model for Image Animation.. https:\/\/github.com\/AliaksandrSiarohin\/first-order-model\/ (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_16_1","unstructured":"2024. Ubuntu Manpage | apt - command-line interface. https:\/\/manpages.ubuntu.com\/manpages\/xenial\/man8\/apt.8.html\/ (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_17_1","unstructured":"2024. What is an image? | Docker Docs.. https:\/\/docs.docker.com\/get-started\/docker-concepts\/the-basics\/what-is-an-image\/ (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_18_1","unstructured":"2024. Writing a Dockerfile | Docker Docs.. https:\/\/docs.docker.com\/get-started\/docker-concepts\/building-images\/writing-a-dockerfile\/ (Accessed on 09\/09\/2024)"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/5.4.349"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-021-03914-1"},{"key":"e_1_2_1_21_1","volume-title":"2023 IEEE International Conference on Software Maintenance and Evolution (ICSME). 160\u2013170","author":"Bui Quang-Cuong","year":"2023","unstructured":"Quang-Cuong Bui, Malte Lauk\u00f6tter, and Riccardo Scandariato. 2023. Dockercleaner: Automatic repair of security smells in dockerfiles. In 2023 IEEE International Conference on Software Maintenance and Evolution (ICSME). 160\u2013170."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-020-03266-2"},{"key":"e_1_2_1_23_1","volume-title":"2017 IEEE\/ACM 14th International Conference on Mining Software Repositories (MSR). 323\u2013333","author":"Cito J\u00fcrgen","year":"2017","unstructured":"J\u00fcrgen Cito, Gerald Schermann, John Erik Wittern, Philipp Leitner, Sali Zumberi, and Harald C Gall. 2017. An empirical analysis of the docker container ecosystem on github. In 2017 IEEE\/ACM 14th International Conference on Mining Software Repositories (MSR). 323\u2013333."},{"volume-title":"2009 IEEE 25th international conference on data engineering. IEEE, 138\u2013149","author":"Cormode Graham","key":"e_1_2_1_24_1","unstructured":"Graham Cormode, Vladislav Shkapenyuk, Divesh Srivastava, and Bojian Xu. [n. d.]. Forward decay: A practical time decay model for streaming systems. In 2009 IEEE 25th international conference on data engineering. IEEE, 138\u2013149. isbn:142443422X"},{"key":"e_1_2_1_25_1","volume-title":"Parfum: Detection and automatic repair of dockerfile smells. arXiv preprint arXiv:2302.01707.","author":"Durieux Thomas","year":"2023","unstructured":"Thomas Durieux. 2023. Parfum: Detection and automatic repair of dockerfile smells. arXiv preprint arXiv:2302.01707."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering. 1\u201312","author":"Durieux Thomas","year":"2024","unstructured":"Thomas Durieux. 2024. Empirical Study of the Docker Smells Impact on the Image Size. In Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering. 1\u201312."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2714064.2660239"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 40th international conference on software engineering. 1078\u20131089","author":"Hassan Foyzul","year":"2018","unstructured":"Foyzul Hassan and Xiaoyin Wang. 2018. Hirebuild: An automatic approach to history-driven repair of build scripts. In Proceedings of the 40th international conference on software engineering. 1078\u20131089."},{"key":"e_1_2_1_29_1","volume-title":"2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). 1148\u20131160","author":"Henkel Jordan","year":"2021","unstructured":"Jordan Henkel, Denini Silva, Leopoldo Teixeira, Marcelo d\u2019Amorim, and Thomas Reps. 2021. Shipwright: A human-in-the-loop system for dockerfile repair. In 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). 1148\u20131160."},{"key":"e_1_2_1_30_1","volume-title":"2019 35th Symposium on Mass Storage Systems and Technologies (MSST). 28\u201337","author":"Huang Zhuo","year":"2019","unstructured":"Zhuo Huang, Song Wu, Song Jiang, and Hai Jin. 2019. Fastbuild: Accelerating docker image building for efficient development and deployment of container. In 2019 35th Symposium on Mass Storage Systems and Technologies (MSST). 28\u201337."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2019.04.055"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0218194022500218"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the 21st International Conference on Mining Software Repositories. 584\u2013594","author":"Ksontini Emna","year":"2024","unstructured":"Emna Ksontini, Aycha Abid, Rania Khalsi, and Marouane Kessentini. 2024. DRMiner: A Tool For Identifying And Analyzing Refactorings In Dockerfile. In Proceedings of the 21st International Conference on Mining Software Repositories. 584\u2013594."},{"key":"e_1_2_1_34_1","volume-title":"2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE). 781\u2013791","author":"Ksontini Emna","year":"2021","unstructured":"Emna Ksontini, Marouane Kessentini, Thiago do N Ferreira, and Foyzul Hassan. 2021. Refactorings and technical debt in docker projects: An empirical study. In 2021 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE). 781\u2013791."},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Emna Ksontini Meriem Mastouri Rania Khalsi and Wael Kessentini. 2025. Refactoring for Dockerfile Quality: A Dive into Developer Practices and Automation Potential. arXiv preprint arXiv:2501.14131.","DOI":"10.1109\/MSR66628.2025.00116"},{"key":"e_1_2_1_36_1","volume-title":"2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). 371\u2013381","author":"Lin Changyuan","year":"2020","unstructured":"Changyuan Lin, Sarah Nadi, and Hamzeh Khazaei. 2020. A large-scale data set and an empirical study of docker images hosted on docker hub. In 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). 371\u2013381."},{"key":"e_1_2_1_37_1","first-page":"1","article-title":"The nature of build changes: An empirical study of Maven-based build systems","volume":"26","author":"Macho Christian","year":"2021","unstructured":"Christian Macho, Stefanie Beyer, Shane McIntosh, and Martin Pinzger. 2021. The nature of build changes: An empirical study of Maven-based build systems. Empirical Software Engineering, 26 (2021), 1\u201353.","journal-title":"Empirical Software Engineering"},{"key":"e_1_2_1_38_1","unstructured":"Daniel D McCracken and Edwin D Reilly. 2003. Backus-naur form (bnf). 129\u2013131."},{"volume-title":"d.]. GitHub - pixiu-io\/kubez-ansible: To provide quick deployment tools for kubernetes cluster and cloud native application by ansible. https:\/\/github.com\/pixiu-io\/kubez-ansible\/ [Online","year":"2025","key":"e_1_2_1_39_1","unstructured":"pixiu io. [n. d.]. GitHub - pixiu-io\/kubez-ansible: To provide quick deployment tools for kubernetes cluster and cloud native application by ansible. https:\/\/github.com\/pixiu-io\/kubez-ansible\/ [Online; accessed 2025-02-28]"},{"key":"e_1_2_1_40_1","first-page":"228","article-title":"An introduction to docker and analysis of its performance","volume":"17","author":"Rad Babak Bashari","year":"2017","unstructured":"Babak Bashari Rad, Harrison John Bhatti, and Mohammad Ahmadi. 2017. An introduction to docker and analysis of its performance. International Journal of Computer Science and Network Security (IJCSNS), 17, 3 (2017), 228.","journal-title":"International Journal of Computer Science and Network Security (IJCSNS)"},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 21st International Conference on Mining Software Repositories. 231\u2013241","author":"Rosa Giovanni","year":"2024","unstructured":"Giovanni Rosa, Simone Scalabrino, Gregorio Robles, and Rocco Oliveto. 2024. Not all dockerfile smells are the same: An empirical evaluation of hadolint writing practices by experts. In Proceedings of the 21st International Conference on Mining Software Repositories. 231\u2013241."},{"key":"e_1_2_1_42_1","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1007\/s10664-024-10471-7","article-title":"Fixing dockerfile smells: An empirical study","volume":"29","author":"Rosa Giovanni","year":"2024","unstructured":"Giovanni Rosa, Federico Zappone, Simone Scalabrino, and Rocco Oliveto. 2024. Fixing dockerfile smells: An empirical study. Empirical Software Engineering, 29, 5 (2024), 108.","journal-title":"Empirical Software Engineering"},{"volume-title":"Dockerfile flakiness: characterization and repair. Ph. D. Dissertation","author":"ShabaniMirzaei Taha","key":"e_1_2_1_43_1","unstructured":"Taha ShabaniMirzaei. 2024. Dockerfile flakiness: characterization and repair. Ph. D. Dissertation. University of British Columbia."},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the 27th IEEE\/ACM International Conference on Automated Software Engineering. 366\u2013369","author":"Tamrawi Ahmed","year":"2012","unstructured":"Ahmed Tamrawi, Hoan Anh Nguyen, Hung Viet Nguyen, and Tien N Nguyen. 2012. SYMake: a build code analysis and refactoring tool for makefiles. In Proceedings of the 27th IEEE\/ACM International Conference on Automated Software Engineering. 366\u2013369."},{"volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 617\u2013626","author":"Tan Yinyan","key":"e_1_2_1_45_1","unstructured":"Yinyan Tan, Zhe Fan, Guilin Li, Fangshan Wang, Zhengbing Li, Shikai Liu, Qiuling Pan, Eric P Xing, and Qirong Ho. [n. d.]. Scalable time-decaying adaptive prediction algorithm. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 617\u2013626."},{"key":"e_1_2_1_46_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3485136","article-title":"How software refactoring impacts execution time","volume":"31","author":"Traini Luca","year":"2021","unstructured":"Luca Traini, Daniele Di Pompeo, Michele Tucci, Bin Lin, Simone Scalabrino, Gabriele Bavota, Michele Lanza, Rocco Oliveto, and Vittorio Cortellessa. 2021. How software refactoring impacts execution time. ACM Transactions on Software Engineering and Methodology (TOSEM), 31, 2 (2021), 1\u201323.","journal-title":"ACM Transactions on Software Engineering and Methodology (TOSEM)"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2945930"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.2973750"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3551349.3556940"},{"key":"e_1_2_1_50_1","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1007\/s10664-020-09908-6","article-title":"A multi-dimensional analysis of technical lag in Debian-based Docker images","volume":"26","author":"Zerouali Ahmed","year":"2021","unstructured":"Ahmed Zerouali, Tom Mens, Alexandre Decan, Jesus Gonzalez-Barahona, and Gregorio Robles. 2021. A multi-dimensional analysis of technical lag in Debian-based Docker images. Empirical Software Engineering, 26, 2 (2021), 19.","journal-title":"Empirical Software Engineering"},{"volume-title":"On the relation between outdated docker containers, severity vulnerabilities, and bugs. In 2019 ieee 26th international conference on software analysis, evolution and reengineering (saner). 491\u2013501","author":"Zerouali Ahmed","key":"e_1_2_1_51_1","unstructured":"Ahmed Zerouali, Tom Mens, Gregorio Robles, and Jesus M Gonzalez-Barahona. 2019. On the relation between outdated docker containers, severity vulnerabilities, and bugs. In 2019 ieee 26th international conference on software analysis, evolution and reengineering (saner). 491\u2013501."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2020.3034517"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2019.8891000"},{"key":"e_1_2_1_54_1","volume-title":"International Conference on Intelligent Computing. 392\u2013404","author":"Zhu Zhiling","year":"2024","unstructured":"Zhiling Zhu, Tieming Chen, Haobin Kong, Yunjin Zhong, and Qijie Song. 2024. DocSecKG: A Systematic Approach for Building Knowledge Graph to Understand the Relationship Between Docker Image and Vulnerability. In International Conference on Intelligent Computing. 392\u2013404."}],"container-title":["Proceedings of the ACM on Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3728870","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,16]],"date-time":"2025-07-16T16:54:51Z","timestamp":1752684891000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3728870"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,22]]},"references-count":54,"journal-issue":{"issue":"ISSTA","published-print":{"date-parts":[[2025,6,22]]}},"alternative-id":["10.1145\/3728870"],"URL":"https:\/\/doi.org\/10.1145\/3728870","relation":{},"ISSN":["2994-970X"],"issn-type":[{"type":"electronic","value":"2994-970X"}],"subject":[],"published":{"date-parts":[[2025,6,22]]}}}