{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T02:13:16Z","timestamp":1775873596489,"version":"3.50.1"},"reference-count":76,"publisher":"Association for Computing Machinery (ACM)","issue":"FSE","license":[{"start":{"date-parts":[[2024,7,12]],"date-time":"2024-07-12T00:00:00Z","timestamp":1720742400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union's Horizon 2020 research and innovation programme","award":["825328"],"award-info":[{"award-number":["825328"]}]},{"name":"European Union's Horizon 2021 research and innovation programme","award":["101070599"],"award-info":[{"award-number":["101070599"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2024,7,12]]},"abstract":"<jats:p>\n                    Modern programming languages promote software reuse via package managers that facilitate the integration of inter-dependent software libraries. Software reuse comes with the challenge of\n                    <jats:italic toggle=\"yes\">dependency bloat<\/jats:italic>\n                    , which refers to unneeded and excessive code incorporated into a project through reused libraries. Such bloat exhibits security risks and maintenance costs, increases storage requirements, and slows down load times. In this work, we conduct a large-scale, fine-grained analysis to understand bloated dependency code in the PyPI ecosystem. Our analysis is the first to focus on different granularity levels, including bloated dependencies, bloated files, and bloated methods. This allows us to identify the specific parts of a library that contribute to the bloat. To do so, we analyze the source code of 1,302 popular Python projects and their 3,232 transitive dependencies. For each project, we employ a state-of-the-art static analyzer and incrementally construct the\n                    <jats:italic toggle=\"yes\">fine-grained project dependency graph (FPDG)<\/jats:italic>\n                    , a representation that captures all inter-project dependencies at method-level.\n                  <\/jats:p>\n                  <jats:p>Our reachability analysis on the FPDG enables the assessment of bloated dependency code in terms of several aspects, including its prevalence in the PyPI ecosystem, its relation to software vulnerabilities, its root causes, and developer perception. Our key finding suggests that PyPI exhibits significant resource underutilization: more than 50% of dependencies are bloated. This rate gets worse when considering bloated dependency code at a more subtle level, such as bloated files and bloated methods. Our fine-grained analysis also indicates that there are numerous vulnerabilities that reside in bloated areas of utilized packages (15% of the defects existing in PyPI). Other major observations suggest that bloated code primarily stems from omissions during code refactoring processes and that developers are willing to debloat their code: Out of the 36 submitted pull requests, developers accepted and merged 30, removing a total of 35 bloated dependencies. We believe that our findings can help researchers and practitioners come up with new debloating techniques and development practices to detect and avoid bloated code, ensuring that dependency resources are utilized efficiently.<\/jats:p>","DOI":"10.1145\/3660821","type":"journal-article","created":{"date-parts":[[2024,7,12]],"date-time":"2024-07-12T10:22:09Z","timestamp":1720779729000},"page":"2584-2607","source":"Crossref","is-referenced-by-count":7,"title":["Bloat beneath Python\u2019s Scales: A Fine-Grained Inter-Project Dependency Analysis"],"prefix":"10.1145","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-2457-1421","authenticated-orcid":false,"given":"Georgios-Petros","family":"Drosos","sequence":"first","affiliation":[{"name":"Athens University of Economics and Business, Athens, Greece"},{"name":"ETH Zurich, Zurich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9906-3073","authenticated-orcid":false,"given":"Thodoris","family":"Sotiropoulos","sequence":"additional","affiliation":[{"name":"ETH Zurich, Zurich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4231-1897","authenticated-orcid":false,"given":"Diomidis","family":"Spinellis","sequence":"additional","affiliation":[{"name":"Athens University of Economics and Business, Athens, Greece"},{"name":"Delft University of Technology, Delft, Netherlands"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5061-9018","authenticated-orcid":false,"given":"Dimitris","family":"Mitropoulos","sequence":"additional","affiliation":[{"name":"University of Athens, Athens, Greece"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,7,12]]},"reference":[{"key":"e_1_3_1_2_1","unstructured":"2023. GitHub Advisory Database. https:\/\/github.com\/advisories [Online; accessed 11-September-2023]."},{"key":"e_1_3_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-019-09792-9"},{"key":"e_1_3_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3414997"},{"key":"e_1_3_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-022-10278-4"},{"key":"e_1_3_1_6_1","doi-asserted-by":"publisher","unstructured":"Shihab E Alfadel M Costa DE. 2020. Empirical Analysis of Security Vulnerabilities in Python Packages. https:\/\/doi.org\/10.5281\/zenodo.5645517 10.5281\/zenodo.5645517","DOI":"10.5281\/zenodo.5645517"},{"key":"e_1_3_1_7_1","first-page":"5575","article-title":"AnimateDead: Debloating Web Applications Using Concolic Execution","author":"Azad Babak Amin","year":"2023","unstructured":"Babak Amin Azad, Rasoul Jahanshahi, Chris Tsoukaladelis, Manuel Egele, and Nick Nikiforakis. 2023. AnimateDead: Debloating Web Applications Using Concolic Execution. In 32nd USENIX Security Symposium (USENIX Security 23). USENIX Association, Anaheim, CA, 5575\u20135591. https:\/\/www.usenix.org\/conference\/usenixsecurity23\/presentation\/azad","journal-title":"32nd USENIX Security Symposium (USENIX Security 23)"},{"key":"e_1_3_1_8_1","first-page":"1697","article-title":"Less is More: Quantifying the Security Benefits of Debloating Web Applications","author":"Azad Babak Amin","year":"2019","unstructured":"Babak Amin Azad, Pierre Laperdrix, and Nick Nikiforakis. 2019. Less is More: Quantifying the Security Benefits of Debloating Web Applications. In 28th USENIX Security Symposium (USENIX Security 19). USENIX Association, Santa Clara, CA, 1697\u20131714. https:\/\/www.usenix.org\/conference\/usenixsecurity19\/presentation\/azad","journal-title":"28th USENIX Security Symposium (USENIX Security 19)"},{"key":"e_1_3_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3418209"},{"key":"e_1_3_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2006.07.009"},{"key":"e_1_3_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3409738"},{"key":"e_1_3_1_12_1","unstructured":"Brett Cannon Nathaniel Smith and Donald Stufft. 2016. PEP 518 \u2013 Specifying Minimum Build System Requirements for Python Projects. PEP 518. Python Software Foundation. https:\/\/www.python.org\/dev\/peps\/pep-0518\/"},{"key":"e_1_3_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2022.3191353"},{"key":"e_1_3_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3485500"},{"key":"e_1_3_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2017.2682323"},{"key":"e_1_3_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3347446"},{"key":"e_1_3_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSR52588.2021.00074"},{"key":"e_1_3_1_18_1","doi-asserted-by":"publisher","unstructured":"Georgios-Petros Drosos Thodoris Sotiropoulos Diomidis Spinellis and Dimitris Mitropoulos. 2024. Artifact for \"Bloat beneath Python\u2019s Scales: A Fine-Grained Inter-Project Dependency Analysis\". https:\/\/doi.org\/10.5281\/zenodo.11095274 10.5281\/zenodo.11095274","DOI":"10.5281\/zenodo.11095274"},{"key":"e_1_3_1_19_1","article-title":"Towards Measuring Supply Chain Attacks on Package Managers for Interpreted Languages","author":"Duan Ruian","year":"2020","unstructured":"Ruian Duan, Omar Alrawi, Ranjita Pai Kasturi, Ryan Elder, Brendan Saltaformaggio, and Wenke Lee. 2020. Towards Measuring Supply Chain Attacks on Package Managers for Interpreted Languages. Proceedings 2021 Network and Distributed System Security Symposium (2020). https:\/\/api.semanticscholar.org\/CorpusID:227247756","journal-title":"Proceedings 2021 Network and Distributed System Security Symposium"},{"key":"e_1_3_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPC.2010.48"},{"key":"e_1_3_1_21_1","unstructured":"GitHub. 2023. The State of the Octoverse: Top Programming Languages 2023. https:\/\/github.blog\/2023-11-08-the-state-of-open-source-and-ai\/. Online: accessed 29 February 2023."},{"key":"e_1_3_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-22888-0_13"},{"key":"e_1_3_1_23_1","unstructured":"J\u00fcrgen Gmach. 2021. Remove unused Sphinx dependency. https:\/\/github.com\/zopefoundation\/Zope\/pull\/968 [Online; accessed 26-September-2023]."},{"key":"e_1_3_1_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-021-10071-9"},{"key":"e_1_3_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183399.3183417"},{"key":"e_1_3_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243734.3243838"},{"key":"e_1_3_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2021.3106247"},{"key":"e_1_3_1_28_1","first-page":"5557","article-title":"Minimalist: Semi-automated Debloating of PHP Web Applications through Static Analysis","author":"Jahanshahi Rasoul","year":"2023","unstructured":"Rasoul Jahanshahi, Babak Amin Azad, Nick Nikiforakis, and Manuel Egele. 2023. Minimalist: Semi-automated Debloating of PHP Web Applications through Static Analysis. In 32nd USENIX Security Symposium (USENIX Security 23). USENIX Association, Anaheim, CA, 5557\u20135573. https:\/\/www.usenix.org\/conference\/usenixsecurity23\/presentation\/jahanshahi","journal-title":"32nd USENIX Security Symposium (USENIX Security 23)"},{"key":"e_1_3_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSRE.2018.00029"},{"key":"e_1_3_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/COMPSAC.2016.146"},{"key":"e_1_3_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2597073.2597074"},{"key":"e_1_3_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE-Companion52605.2021.00046"},{"key":"e_1_3_1_33_1","first-page":"121","article-title":"Mininode: Reducing the Attack Surface of Node.js Applications","author":"Koishybayev Igibek","year":"2020","unstructured":"Igibek Koishybayev and Alexandros Kapravelos. 2020. Mininode: Reducing the Attack Surface of Node.js Applications. In 23rd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2020). USENIX Association, San Sebastian, 121\u2013134. https:\/\/www.usenix.org\/conference\/raid2020\/presentation\/koishybayev","journal-title":"23rd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2020)"},{"key":"e_1_3_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3572905"},{"key":"e_1_3_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/130844.130856"},{"key":"e_1_3_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3417934"},{"key":"e_1_3_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360584"},{"key":"e_1_3_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-006-9033-1"},{"key":"e_1_3_1_39_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.ECOOP.2018.7"},{"key":"e_1_3_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/SANER56733.2023.00028"},{"key":"e_1_3_1_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-007-9040-x"},{"key":"e_1_3_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3428255"},{"key":"e_1_3_1_43_1","unstructured":"Peter Naur and Brian Randell. 1969. Software engineering: Report of a conference sponsored by the nato science committee garmisch germany 7th-11th october 1968. (1969)."},{"key":"e_1_3_1_44_1","first-page":"3439","article-title":"Beyond Typosquatting: An In-depth Look at Package Confusion","author":"Neupane Shradha","year":"2023","unstructured":"Shradha Neupane, Grant Holmes, Elizabeth Wyss, Drew Davidson, and Lorenzo De Carli. 2023. Beyond Typosquatting: An In-depth Look at Package Confusion. In 32nd USENIX Security Symposium (USENIX Security 23). USENIX Association, Anaheim, CA, 3439\u20133456. https:\/\/www.usenix.org\/conference\/usenixsecurity23\/presentation\/neupane","journal-title":"32nd USENIX Security Symposium (USENIX Security 23)"},{"key":"e_1_3_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3460319.3464836"},{"key":"e_1_3_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/SANER.2015.7081834"},{"key":"e_1_3_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3180155.3180184"},{"key":"e_1_3_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSME52107.2021.00056"},{"key":"e_1_3_1_49_1","unstructured":"Python Packaging Authority. 2023. Pip v23.1.2 Documentation: Build System Interface. https:\/\/pip.pypa.io\/en\/stable\/reference\/build-system\/# Accessed: July 9 2023."},{"key":"e_1_3_1_50_1","unstructured":"Python Packaging Authority. 2024. top_level.txt \u2013 Conflict Management Metadata. https:\/\/setuptools.pypa.io\/en\/latest\/deprecated\/python_eggs.html#top-level-txt-conflict-management-metadata [Online; accessed 21-February-2024]."},{"key":"e_1_3_1_51_1","first-page":"1733","article-title":"RAZOR: A Framework for Post-deployment Software Debloating","author":"Qian Chenxiong","year":"2019","unstructured":"Chenxiong Qian, Hong Hu, Mansour Alharthi, Pak Ho Chung, Taesoo Kim, and Wenke Lee. 2019. RAZOR: A Framework for Post-deployment Software Debloating. In 28th USENIX Security Symposium (USENIX Security 19). USENIX Association, Santa Clara, CA, 1733\u20131750. https:\/\/www.usenix.org\/conference\/usenixsecurity19\/presentation\/qian","journal-title":"28th USENIX Security Symposium (USENIX Security 19)"},{"key":"e_1_3_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3372297.3417866"},{"key":"e_1_3_1_53_1","first-page":"869","article-title":"Debloating Software through Piece-Wise Compilation and Loading","author":"Quach Anh","year":"2018","unstructured":"Anh Quach, Aravind Prakash, and Lok Yan. 2018. Debloating Software through Piece-Wise Compilation and Loading. In 27th USENIX Security Symposium (USENIX Security 18). USENIX Association, Baltimore, MD, 869\u2013886. https:\/\/www.usenix.org\/conference\/usenixsecurity18\/presentation\/quach","journal-title":"27th USENIX Security Symposium (USENIX Security 18)"},{"key":"e_1_3_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3576915.3623140"},{"key":"e_1_3_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3106237.3106271"},{"key":"e_1_3_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/2345156.2254104"},{"key":"e_1_3_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE43902.2021.00146"},{"key":"e_1_3_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3238147.3238160"},{"key":"e_1_3_1_59_1","first-page":"5521","article-title":"Silent Spring: Prototype Pollution Leads to Remote Code Execution in Node.js","author":"Shcherbakov Mikhail","year":"2023","unstructured":"Mikhail Shcherbakov, Musard Balliu, and Cristian-Alexandru Staicu. 2023. Silent Spring: Prototype Pollution Leads to Remote Code Execution in Node.js. In 32nd USENIX Security Symposium (USENIX Security 23). USENIX Association, Anaheim, CA, 5521\u20135538. https:\/\/www.usenix.org\/conference\/usenixsecurity23\/presentation\/shcherbakov","journal-title":"32nd USENIX Security Symposium (USENIX Security 23)"},{"key":"e_1_3_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/CAIN58948.2023.00016"},{"key":"e_1_3_1_61_1","doi-asserted-by":"publisher","unstructured":"C\u00e9sar Soto-Valero Thomas Durieux and Benoit Baudry. 2021a. A Longitudinal Analysis of Bloated Java Dependencies (ESEC\/FSE 2021). Association for Computing Machinery New York NY USA 1021\u20131031. https:\/\/doi.org\/10.1145\/3468264.3468589 10.1145\/3468264.3468589","DOI":"10.1145\/3468264.3468589"},{"key":"e_1_3_1_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/3546948"},{"key":"e_1_3_1_63_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-020-09914-8"},{"key":"e_1_3_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/MS.2012.38"},{"key":"e_1_3_1_65_1","doi-asserted-by":"crossref","DOI":"10.14722\/ndss.2018.23071","article-title":"SYNODE: Understanding and Automatically Preventing Injection Attacks on NODE.JS","author":"Staicu Cristian-Alexandru","year":"2018","unstructured":"Cristian-Alexandru Staicu, Michael Pradel, and Benjamin Livshits. 2018. SYNODE: Understanding and Automatically Preventing Injection Attacks on NODE.JS. In Network and Distributed System Security Symposium. https:\/\/api.semanticscholar.org\/CorpusID:51951699","journal-title":"Network and Distributed System Security Symposium"},{"key":"e_1_3_1_66_1","first-page":"6133","article-title":"Bilingual Problems: Studying the Security Risks Incurred by Native Extensions in Scripting Languages","author":"Staicu Cristian-Alexandru","year":"2023","unstructured":"Cristian-Alexandru Staicu, Sazzadur Rahaman, \u00c1gnes Kiss, and Michael Backes. 2023. Bilingual Problems: Studying the Security Risks Incurred by Native Extensions in Scripting Languages. In 32nd USENIX Security Symposium (USENIX Security 23). USENIX Association, Anaheim, CA, 6133\u20136150. https:\/\/www.usenix.org\/conference\/usenixsecurity23\/presentation\/staicu","journal-title":"32nd USENIX Security Symposium (USENIX Security 23)"},{"key":"e_1_3_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380390"},{"key":"e_1_3_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/3180155.3180236"},{"key":"e_1_3_1_69_1","doi-asserted-by":"publisher","DOI":"10.5555\/2755629"},{"key":"e_1_3_1_70_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-022-10195-6"},{"key":"e_1_3_1_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/2642937.2643013"},{"key":"e_1_3_1_72_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2018.10.009"},{"key":"e_1_3_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/SANER56733.2023.00044"},{"key":"e_1_3_1_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE43902.2021.00144"},{"key":"e_1_3_1_75_1","doi-asserted-by":"publisher","DOI":"10.1145\/3377811.3380426"},{"key":"e_1_3_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/3510457.3513044"},{"key":"e_1_3_1_77_1","first-page":"995","article-title":"Small World with High Risks: A Study of Security Threats in the npm Ecosystem","author":"Zimmermann Markus","year":"2019","unstructured":"Markus Zimmermann, Cristian-Alexandru Staicu, Cam Tenny, and Michael Pradel. 2019. Small World with High Risks: A Study of Security Threats in the npm Ecosystem. In 28th USENIX Security Symposium (USENIX Security 19). USENIX Association, Santa Clara, CA, 995\u20131010. https:\/\/www.usenix.org\/conference\/usenixsecurity19\/presentation\/zimmerman","journal-title":"28th USENIX Security Symposium (USENIX Security 19)"}],"container-title":["Proceedings of the ACM on Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3660821","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3660821","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,4]],"date-time":"2026-02-04T07:57:59Z","timestamp":1770191879000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3660821"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,12]]},"references-count":76,"journal-issue":{"issue":"FSE","published-print":{"date-parts":[[2024,7,12]]}},"alternative-id":["10.1145\/3660821"],"URL":"https:\/\/doi.org\/10.1145\/3660821","relation":{},"ISSN":["2994-970X"],"issn-type":[{"value":"2994-970X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,12]]}}}