{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T20:06:54Z","timestamp":1776110814174,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":43,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,5,23]],"date-time":"2022-05-23T00:00:00Z","timestamp":1653264000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,5,23]]},"DOI":"10.1145\/3524842.3528447","type":"proceedings-article","created":{"date-parts":[[2022,10,18]],"date-time":"2022-10-18T00:08:36Z","timestamp":1666051716000},"page":"353-364","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":38,"title":["A large-scale comparison of Python code in Jupyter notebooks and scripts"],"prefix":"10.1145","author":[{"given":"Konstantin","family":"Grotov","sequence":"first","affiliation":[{"name":"ITMO University"}]},{"given":"Sergey","family":"Titov","sequence":"additional","affiliation":[{"name":"JetBrains Research"}]},{"given":"Vladimir","family":"Sotnikov","sequence":"additional","affiliation":[{"name":"JetBrains Research"}]},{"given":"Yaroslav","family":"Golubev","sequence":"additional","affiliation":[{"name":"JetBrains Research"}]},{"given":"Timofey","family":"Bryksin","sequence":"additional","affiliation":[{"name":"JetBrains Research"}]}],"member":"320","published-online":{"date-parts":[[2022,10,17]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"[n.d.]. GitHub licenses. 
https:\/\/docs.github.com\/en\/repositories\/managing-your-repositorys-settings-and-features\/customizing-your-repository\/licensing-a-repository. [Online. Accessed 25-March-2022]."},{"key":"e_1_3_2_1_2_1","unstructured":"[n.d.]. IntelliJ IDEA. https:\/\/www.jetbrains.com\/idea\/. [Online. Accessed 25-March-2022]."},{"key":"e_1_3_2_1_3_1","unstructured":"[n.d.]. Matroskin: a library for the large scale analysis of Jupyter notebooks. https:\/\/github.com\/JetBrains-Research\/Matroskin. [Online. Accessed 25-March-2022]."},{"key":"e_1_3_2_1_4_1","unstructured":"[n.d.]. VS Code. https:\/\/code.visualstudio.com\/. [Online. Accessed 25-March-2022]."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3478431.3499294"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2018.09.016"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/SEAA51224.2020.00075"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3313831.3376729"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.303623"},{"key":"e_1_3_2_1_10_1","volume-title":"Sampling Projects in GitHub for MSR Studies. In 18th IEEE\/ACM International Conference on Mining Software Repositories, MSR","author":"Dabic Ozren","year":"2021","unstructured":"Ozren Dabic, Emad Aghajani, and Gabriele Bavota. 2021. Sampling Projects in GitHub for MSR Studies. 
In 18th IEEE\/ACM International Conference on Mining Software Repositories, MSR 2021. IEEE, 560--564."},{"key":"e_1_3_2_1_11_1","unstructured":"Robert L Glass. 2002. Software engineering: facts and fallacies."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3379597.3387455"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.6383115"},{"key":"e_1_3_2_1_14_1","unstructured":"Nick Coghlan Guido van Rossum Barry Warsaw. 2022. PEP8 standard. https:\/\/www.python.org\/dev\/peps\/pep-0008\/ [Online. Accessed 25-March-2022]."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0218194013500484"},{"key":"e_1_3_2_1_16_1","unstructured":"JetBrains. 2020. Jetbrains Python developers survey. https:\/\/www.jetbrains.com\/lp\/python-developers-survey-2020\/ [Online. Accessed 25-March-2022]."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/3417639.3417673"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3059009.3059061"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3304221.3319780"},{"key":"e_1_3_2_1_20_1","volume-title":"Literate Programming. The computer journal 27, 2","author":"Knuth Donald Ervin","year":"1984","unstructured":"Donald Ervin Knuth. 1984. Literate Programming. The computer journal 27, 2 (1984), 97--111."},{"key":"e_1_3_2_1_21_1","volume-title":"Code Duplication and Reuse in Jupyter Notebooks. In 2020 IEEE Symposium on Visual Languages and Human-Centric Computing (VL\/HCC). 
IEEE, 1--9.","author":"Koenzen Andreas P","year":"2020","unstructured":"Andreas P Koenzen, Neil A Ernst, and Margaret-Anne D Storey. 2020. Code Duplication and Reuse in Jupyter Notebooks. In 2020 IEEE Symposium on Visual Languages and Human-Centric Computing (VL\/HCC). IEEE, 1--9."},{"key":"e_1_3_2_1_22_1","unstructured":"Robert C Martin. 2009. Clean Code: a Handbook of Agile Software Craftsmanship. Pearson Education."},{"key":"e_1_3_2_1_23_1","volume-title":"Ray: A Distributed Framework for Emerging {AI} Applications. In 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18). 561--577.","author":"Moritz Philipp","year":"2018","unstructured":"Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I Jordan, et al. 2018. Ray: A Distributed Framework for Emerging {AI} Applications. In 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18). 561--577."},{"key":"e_1_3_2_1_24_1","volume-title":"Code Complexity Analyser and Visualiser for Novice Programmer. In 2020 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE). IEEE, 1--6.","author":"Mullanu Siripond","year":"2020","unstructured":"Siripond Mullanu, Sunwit Petchoo, and Caslon Chua. 2020. 
Code Complexity Analyser and Visualiser for Novice Programmer. In 2020 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE). IEEE, 1--6."},{"key":"e_1_3_2_1_25_1","volume-title":"Proceedings of the World Congress on Engineering and Computer Science","volume":"1","author":"Nundhapana Ruchuta","year":"2018","unstructured":"Ruchuta Nundhapana and Twittie Senivongse. 2018. Enhancing Understandability of Objective C Programs Using Naming Convention Checking Framework. In Proceedings of the World Congress on Engineering and Computer Science, Vol. 1."},{"key":"e_1_3_2_1_26_1","unstructured":"Serge Sans Paille. 2022. Beniget tool. https:\/\/github.com\/serge-sans-paille\/beniget [Online. Accessed 25-March-2022]."},{"key":"e_1_3_2_1_27_1","volume-title":"An Empirical Study for Common Language Features Used in Python Projects. In 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 24--35","author":"Peng Yun","year":"2021","unstructured":"Yun Peng, Yu Zhang, and Mingzhe Hu. 2021. An Empirical Study for Common Language Features Used in Python Projects. 
In 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 24--35."},{"key":"e_1_3_2_1_28_1","volume-title":"Why Jupyter is Data Scientists' Computational Notebook of Choice. Nature 563, 7732","author":"Perkel Jeffrey M","year":"2018","unstructured":"Jeffrey M Perkel. 2018. Why Jupyter is Data Scientists' Computational Notebook of Choice. Nature 563, 7732 (2018), 145--147."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSR.2019.00077"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-021-09961-9"},{"key":"e_1_3_2_1_31_1","unstructured":"Pylint. 2022. Pylint tool. https:\/\/pylint.org\/ [Online. Accessed 25-March-2022]."},{"key":"e_1_3_2_1_32_1","volume-title":"Software Fault Prediction Metrics: A Systematic Literature Review. Information and software technology 55, 8","author":"Radjenovi\u0107 Danijel","year":"2013","unstructured":"Danijel Radjenovi\u0107, Marjan Heri\u010dko, Richard Torkar, and Ale\u0161 \u017divkovi\u010d. 2013. Software Fault Prediction Metrics: A Systematic Literature Review. Information and software technology 55, 8 (2013), 1397--1418."},{"key":"e_1_3_2_1_33_1","unstructured":"Radon. 2022. Radon tool. https:\/\/radon.readthedocs.io\/en\/latest\/ [Online. 
Accessed 25-March-2022]."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3173574.3173606"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3382494.3410680"},{"key":"e_1_3_2_1_36_1","unstructured":"Ian Cordasco Tarek Ziad\u00e9. 2022. Flake8 tool. https:\/\/github.com\/pycqa\/flake8 [Online. Accessed 25-March-2022]."},{"key":"e_1_3_2_1_37_1","volume-title":"ReSplit: Improving the Structure of Jupyter Notebooks by Re-Splitting Their Cells. arXiv preprint arXiv:2112.14825","author":"Titov Sergey","year":"2021","unstructured":"Sergey Titov, Yaroslav Golubev, and Timofey Bryksin. 2021. ReSplit: Improving the Structure of Jupyter Notebooks by Re-Splitting Their Cells. arXiv preprint arXiv:2112.14825 (2021)."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.21105\/joss.01026"},{"key":"e_1_3_2_1_39_1","volume-title":"How Does Machine Learning Change Software Development Practices? IEEE Transactions on Software Engineering","author":"Wan Zhiyuan","year":"2019","unstructured":"Zhiyuan Wan, Xin Xia, David Lo, and Gail C Murphy. 2019. How Does Machine Learning Change Software Development Practices? IEEE Transactions on Software Engineering (2019)."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3377816.3381724"},{"key":"e_1_3_2_1_41_1","volume-title":"Restoring Execution Environments of Jupyter Notebooks. In 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). 
IEEE, 1622--1633","author":"Wang Jiawei","year":"2021","unstructured":"Jiawei Wang, Li Li, and Andreas Zeller. 2021. Restoring Execution Environments of Jupyter Notebooks. In 2021 IEEE\/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 1622--1633."},{"key":"e_1_3_2_1_42_1","unstructured":"WPS. 2022. Wemake Python Styleguide. https:\/\/wemake-python-stylegui.de\/en\/latest\/ [Online. Accessed 25-March-2022]."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1140\/epjds\/s13688-022-00327-9"}],"event":{"name":"MSR '22: 19th International Conference on Mining Software Repositories","location":"Pittsburgh Pennsylvania","acronym":"MSR '22","sponsor":["SIGSOFT ACM Special Interest Group on Software Engineering","IEEE CS"]},"container-title":["Proceedings of the 19th International Conference on Mining Software 
Repositories"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3524842.3528447","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3524842.3528447","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:35Z","timestamp":1750183775000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3524842.3528447"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,23]]},"references-count":43,"alternative-id":["10.1145\/3524842.3528447","10.1145\/3524842"],"URL":"https:\/\/doi.org\/10.1145\/3524842.3528447","relation":{},"subject":[],"published":{"date-parts":[[2022,5,23]]},"assertion":[{"value":"2022-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}