{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T22:56:18Z","timestamp":1773442578814,"version":"3.50.1"},"reference-count":65,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2022,7,2]],"date-time":"2022-07-02T00:00:00Z","timestamp":1656720000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,7,2]],"date-time":"2022-07-02T00:00:00Z","timestamp":1656720000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100018933","name":"Technische Universit\u00e4t Clausthal","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100018933","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Empir Software Eng"],"published-print":{"date-parts":[[2022,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Context<\/jats:title>\n                <jats:p>Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Objective<\/jats:title>\n                <jats:p>We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Methods<\/jats:title>\n                <jats:p>We use a crowd sourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. 
If at least three participants agree on the same label, we have consensus.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1007\/s10664-021-10083-5","type":"journal-article","created":{"date-parts":[[2022,7,2]],"date-time":"2022-07-02T06:03:03Z","timestamp":1656741783000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":36,"title":["A fine-grained data set and analysis of tangling in bug fixing commits"],"prefix":"10.1007","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9765-2803","authenticated-orcid":false,"given":"Steffen","family":"Herbold","sequence":"first","affiliation":[]},{"given":"Alexander","family":"Trautsch","sequence":"additional","affiliation":[]},{"given":"Benjamin","family":"Ledel","sequence":"additional","affiliation":[]},{"given":"Alireza","family":"Aghamohammadi","sequence":"additional","affiliation":[]},{"given":"Taher 
A.","family":"Ghaleb","sequence":"additional","affiliation":[]},{"given":"Kuljit Kaur","family":"Chahal","sequence":"additional","affiliation":[]},{"given":"Tim","family":"Bossenmaier","sequence":"additional","affiliation":[]},{"given":"Bhaveet","family":"Nagaria","sequence":"additional","affiliation":[]},{"given":"Philip","family":"Makedonski","sequence":"additional","affiliation":[]},{"given":"Matin Nili","family":"Ahmadabadi","sequence":"additional","affiliation":[]},{"given":"Kristof","family":"Szabados","sequence":"additional","affiliation":[]},{"given":"Helge","family":"Spieker","sequence":"additional","affiliation":[]},{"given":"Matej","family":"Madeja","sequence":"additional","affiliation":[]},{"given":"Nathaniel","family":"Hoy","sequence":"additional","affiliation":[]},{"given":"Valentina","family":"Lenarduzzi","sequence":"additional","affiliation":[]},{"given":"Shangwen","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Gema","family":"Rodr\u00edguez-P\u00e9rez","sequence":"additional","affiliation":[]},{"given":"Ricardo","family":"Colomo-Palacios","sequence":"additional","affiliation":[]},{"given":"Roberto","family":"Verdecchia","sequence":"additional","affiliation":[]},{"given":"Paramvir","family":"Singh","sequence":"additional","affiliation":[]},{"given":"Yihao","family":"Qin","sequence":"additional","affiliation":[]},{"given":"Debasish","family":"Chakroborti","sequence":"additional","affiliation":[]},{"given":"Willard","family":"Davis","sequence":"additional","affiliation":[]},{"given":"Vijay","family":"Walunj","sequence":"additional","affiliation":[]},{"given":"Hongjun","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Diego","family":"Marcilio","sequence":"additional","affiliation":[]},{"given":"Omar","family":"Alam","sequence":"additional","affiliation":[]},{"given":"Abdullah","family":"Aldaeej","sequence":"additional","affiliation":[]},{"given":"Idan","family":"Amit","sequence":"additional","affiliation":[]},{"give
n":"Burak","family":"Turhan","sequence":"additional","affiliation":[]},{"given":"Simon","family":"Eismann","sequence":"additional","affiliation":[]},{"given":"Anna-Katharina","family":"Wickert","sequence":"additional","affiliation":[]},{"given":"Ivano","family":"Malavolta","sequence":"additional","affiliation":[]},{"given":"Mat\u00fa\u0161","family":"Sul\u00edr","sequence":"additional","affiliation":[]},{"given":"Fatemeh","family":"Fard","sequence":"additional","affiliation":[]},{"given":"Austin Z.","family":"Henley","sequence":"additional","affiliation":[]},{"given":"Stratos","family":"Kourtzanidis","sequence":"additional","affiliation":[]},{"given":"Eray","family":"Tuzun","sequence":"additional","affiliation":[]},{"given":"Christoph","family":"Treude","sequence":"additional","affiliation":[]},{"given":"Simin Maleki","family":"Shamasbi","sequence":"additional","affiliation":[]},{"given":"Ivan","family":"Pashchenko","sequence":"additional","affiliation":[]},{"given":"Marvin","family":"Wyrich","sequence":"additional","affiliation":[]},{"given":"James","family":"Davis","sequence":"additional","affiliation":[]},{"given":"Alexander","family":"Serebrenik","sequence":"additional","affiliation":[]},{"given":"Ella","family":"Albrecht","sequence":"additional","affiliation":[]},{"given":"Ethem Utku","family":"Aktas","sequence":"additional","affiliation":[]},{"given":"Daniel","family":"Str\u00fcber","sequence":"additional","affiliation":[]},{"given":"Johannes","family":"Erbel","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,7,2]]},"reference":[{"issue":"2","key":"10083_CR1","first-page":"119","volume":"52","author":"A Agresti","year":"1998","unstructured":"Agresti A, Coull B A (1998) Approximate is better than \u201dexact\u201d for interval estimation of binomial proportions. 
Amer Stat 52(2):119\u2013126","journal-title":"Amer Stat"},{"key":"10083_CR2","doi-asserted-by":"crossref","unstructured":"Anderson A, Huttenlocher D, Kleinberg J, Leskovec J (2013) Steering user behavior with badges. In: Proceedings of the 22nd International Conference on World Wide Web, WWW \u201913. Association for Computing Machinery, New York, pp 95\u2013106","DOI":"10.1145\/2488388.2488398"},{"key":"10083_CR3","doi-asserted-by":"crossref","unstructured":"Arima R, Higo Y, Kusumoto S (2018) A study on inappropriately partitioned commits: How much and what kinds of ip commits in java projects?. In: Proceedings of the 15th International Conference on Mining Software Repositories, MSR \u201918. Association for Computing Machinery, New York, pp 336\u2013340","DOI":"10.1145\/3196398.3196406"},{"key":"10083_CR4","unstructured":"Baltes S, Ralph P (2020) Sampling in software engineering research: A critical review and guidelines. arXiv:2002.07764"},{"key":"10083_CR5","doi-asserted-by":"crossref","unstructured":"Bissyand\u00e9 T F, Thung F, Wang S, Lo D, Jiang L, R\u00e9veill\u00e8re L (2013) Empirical evaluation of bug linking. In: 2013 17th European Conference on Software Maintenance and Reengineering, pp 89\u201398","DOI":"10.1109\/CSMR.2013.19"},{"issue":"2","key":"10083_CR6","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1214\/ss\/1009213286","volume":"16","author":"LD Brown","year":"2001","unstructured":"Brown L D, Cai T T, DasGupta A (2001) Interval estimation for a binomial proportion. Stat Sci 16(2):101\u2013133. https:\/\/doi.org\/10.1214\/ss\/1009213286","journal-title":"Stat Sci"},{"issue":"6634","key":"10083_CR7","doi-asserted-by":"publisher","first-page":"1454","DOI":"10.1136\/bmj.296.6634.1454","volume":"296","author":"MJ Campbell","year":"1988","unstructured":"Campbell M J, Gardner M J (1988) Calculating confidence intervals for some non-parametric analyses. 
Br Med J (Clin Res Ed) 296(6634):1454\u20131456","journal-title":"Br Med J (Clin Res Ed)"},{"key":"10083_CR8","unstructured":"Cook T D, Campbell D T, Day A (1979) Quasi-experimentation: Design & analysis issues for field settings, vol 351. Houghton Mifflin Boston"},{"key":"10083_CR9","doi-asserted-by":"crossref","unstructured":"Dias M, Bacchelli A, Gousios G, Cassou D, Ducasse S (2015) Untangling fine-grained code changes. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER), pp 341\u2013350","DOI":"10.1109\/SANER.2015.7081844"},{"issue":"4","key":"10083_CR10","doi-asserted-by":"publisher","first-page":"405","DOI":"10.1007\/s10664-005-3861-2","volume":"10","author":"H Do","year":"2005","unstructured":"Do H, Elbaum S, Rothermel G (2005) Supporting controlled experimentation with testing techniques: An infrastructure and its potential impact. Empir Softw Eng 10(4):405\u2013435","journal-title":"Empir Softw Eng"},{"issue":"272","key":"10083_CR11","doi-asserted-by":"publisher","first-page":"1096","DOI":"10.1080\/01621459.1955.10501294","volume":"50","author":"C Dunnett","year":"1955","unstructured":"Dunnett C (1955) A multiple comparison procedure for comparing several treatments with a control. J Am Stat Assoc 50(272):1096\u20131121","journal-title":"J Am Stat Assoc"},{"issue":"5","key":"10083_CR12","doi-asserted-by":"publisher","first-page":"378","DOI":"10.1037\/h0031619","volume":"76","author":"JL Fleiss","year":"1971","unstructured":"Fleiss J L (1971) Measuring nominal scale agreement among many raters. Psychol Bullet 76(5):378","journal-title":"Psychol Bullet"},{"issue":"1","key":"10083_CR13","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1109\/TSE.2017.2755013","volume":"45","author":"L Gazzola","year":"2019","unstructured":"Gazzola L, Micucci D, Mariani L (2019) Automatic software repair: A survey. 
IEEE Trans Softw Eng 45(1):34\u201367","journal-title":"IEEE Trans Softw Eng"},{"key":"10083_CR14","doi-asserted-by":"crossref","unstructured":"Gopstein D, Zhou H H, Frankl P, Cappos J (2018) Prevalence of confusing code in software projects: Atoms of confusion in the wild. In: 2018 IEEE\/ACM 15th International Conference on Mining Software Repositories (MSR), pp 281\u2013291","DOI":"10.1145\/3196398.3196432"},{"key":"10083_CR15","doi-asserted-by":"crossref","unstructured":"Grant S, Betts B (2013) Encouraging user behaviour with achievements: An empirical study. In: Proceedings of the 10th Working Conference on Mining Software Repositories (MSR). IEEE, pp 65\u201368","DOI":"10.1109\/MSR.2013.6624007"},{"key":"10083_CR16","doi-asserted-by":"crossref","unstructured":"Gyimesi P, Vancsics B, Stocco A, Mazinanian D, Beszedes A, Ferenc R, Mesbah A (2019) Bugsjs: a benchmark of javascript bugs. In: 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST), pp 90\u2013101","DOI":"10.1109\/ICST.2019.00019"},{"key":"10083_CR17","doi-asserted-by":"crossref","unstructured":"Herbold S (2020) With registered reports towards large scale data curation. In: Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering: New Ideas and Emerging Results, ICSE-NIER \u201920. Association for Computing Machinery, New York, pp 93\u201396","DOI":"10.1145\/3377816.3381721"},{"key":"10083_CR18","doi-asserted-by":"publisher","unstructured":"Herbold S, Trautsch A, Ledel B (2020) Large-scale manual validation of bugfixing changes. OSF. 
https:\/\/doi.org\/10.17605\/OSF.IO\/ACNWK, https:\/\/osf.io\/acnwk","DOI":"10.17605\/OSF.IO\/ACNWK"},{"key":"10083_CR19","unstructured":"Herbold S, Trautsch A, Trautsch F, Ledel B (2019) Issues with SZZ: An empirical assessment of the state of practice of defect prediction data collection"},{"key":"10083_CR20","doi-asserted-by":"crossref","unstructured":"Herzig K, Just S, Zeller A (2013) It\u2019s not a bug, it\u2019s a feature: How misclassification impacts bug prediction. In: Proceedings of the 2013 International Conference on Software Engineering (ICSE). IEEE","DOI":"10.1109\/ICSE.2013.6606585"},{"issue":"2","key":"10083_CR21","doi-asserted-by":"publisher","first-page":"303","DOI":"10.1007\/s10664-015-9376-6","volume":"21","author":"K Herzig","year":"2016","unstructured":"Herzig K, Just S, Zeller A (2016) The impact of tangled code changes on defect prediction models. Empir Softw Eng 21(2):303\u2013336","journal-title":"Empir Softw Eng"},{"key":"10083_CR22","doi-asserted-by":"crossref","unstructured":"Herzig K, Zeller A (2013) The impact of tangled code changes. In: Proceedings of the 10th Working Conference on Mining Software Repositories, MSR \u201913. IEEE Press, pp 121\u2013130","DOI":"10.1109\/MSR.2013.6624018"},{"key":"10083_CR23","doi-asserted-by":"crossref","unstructured":"Hindle A, German D M, Holt R (2008) What do large commits tell us? a taxonomical study of large commits. In: Proceedings of the 2008 International Working Conference on Mining Software Repositories, MSR \u201908. Association for Computing Machinery, New York, pp 99\u2013108","DOI":"10.1145\/1370750.1370773"},{"issue":"2","key":"10083_CR24","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1109\/TSE.2017.2770124","volume":"45","author":"S Hosseini","year":"2019","unstructured":"Hosseini S, Turhan B, Gunarathna D (2019) A systematic literature review and meta-analysis on cross project defect prediction. 
IEEE Trans Softw Eng 45(2):111\u2013147","journal-title":"IEEE Trans Softw Eng"},{"key":"10083_CR25","doi-asserted-by":"crossref","unstructured":"Hutchins M, Foster H, Goradia T, Ostrand T (1994) Experiments on the effectiveness of dataflow- and control-flow-based test adequacy criteria. In: Proceedings of 16th International Conference on Software Engineering, pp 191\u2013200","DOI":"10.1109\/ICSE.1994.296778"},{"issue":"5","key":"10083_CR26","doi-asserted-by":"publisher","first-page":"649","DOI":"10.1109\/TSE.2010.62","volume":"37","author":"Y Jia","year":"2011","unstructured":"Jia Y, Harman M (2011) An analysis and survey of the development of mutation testing. IEEE Trans Softw Eng 37(5):649\u2013678","journal-title":"IEEE Trans Softw Eng"},{"key":"10083_CR27","doi-asserted-by":"crossref","unstructured":"Just R, Jalali D, Ernst M D (2014) Defects4j: A database of existing faults to enable controlled testing studies for java programs. In: Proceedings of the 2014 International Symposium on Software Testing and Analysis, ISSTA 2014. Association for Computing Machinery, New York, pp 437\u2013440","DOI":"10.1145\/2610384.2628055"},{"issue":"6","key":"10083_CR28","doi-asserted-by":"publisher","first-page":"757","DOI":"10.1109\/TSE.2012.70","volume":"39","author":"Y Kamei","year":"2013","unstructured":"Kamei Y, Shihab E, Adams B, Hassan A E, Mockus A, Sinha A, Ubayashi N (2013) A large-scale empirical study of just-in-time quality assurance. IEEE Trans Softw Eng 39(6):757\u2013773","journal-title":"IEEE Trans Softw Eng"},{"key":"10083_CR29","doi-asserted-by":"crossref","unstructured":"Kawrykow D, Robillard M P (2011) Non-essential changes in version histories. 
In: Proceedings of the 2011 33rd International Conference on Software Engineering (ICSE), pp 351\u2013360","DOI":"10.1145\/1985793.1985842"},{"key":"10083_CR30","doi-asserted-by":"crossref","unstructured":"Kiczales G, Lamping J, Mendhekar A, Maeda C, Lopes C, Loingtier J-M, Irwin J (1997) Aspect-oriented programming. In: Proceedings of the European Conference on Object-Oriented Programming (ECOOP). Springer, pp 220\u2013242","DOI":"10.1007\/BFb0053381"},{"key":"10083_CR31","doi-asserted-by":"crossref","unstructured":"Kim S, Zimmermann T, Pan K, Whitehead Jr. E J (2006) Automatic identification of bug-introducing changes. In: Proceedings of the 21st IEEE\/ACM International Conference on Automated Software Engineering (ASE), pp 81\u201390","DOI":"10.1109\/ASE.2006.23"},{"key":"10083_CR32","unstructured":"(2008) Pearson\u2019s correlation coefficient. In: Kirch W (ed) Encyclopedia of Public Health. Springer Netherlands, Dordrecht, pp 1090\u20131091"},{"key":"10083_CR33","doi-asserted-by":"crossref","unstructured":"Kirinuki H, Higo Y, Hotta K, Kusumoto S (2016) Splitting commits via past code changes. In: Proceedings of the 2016 23rd Asia-Pacific Software Engineering Conference (APSEC), pp 129\u2013136","DOI":"10.1109\/APSEC.2016.028"},{"key":"10083_CR34","doi-asserted-by":"crossref","unstructured":"Kirinuki H, Higo Y, Hotta K, Kusumoto S (2014) Hey! are you committing tangled changes?. In: Proceedings of the 22nd International Conference on Program Comprehension, ICPC 2014. Association for Computing Machinery, New York, pp 262\u2013265","DOI":"10.1145\/2597008.2597798"},{"key":"10083_CR35","doi-asserted-by":"crossref","unstructured":"Kochhar P S, Tian Y, Lo D (2014) Potential biases in bug localization: Do they matter?. In: Proceedings of the 29th ACM\/IEEE International Conference on Automated Software Engineering, ASE \u201914. 
Association for Computing Machinery, New York, pp 803\u2013814","DOI":"10.1145\/2642937.2642997"},{"key":"10083_CR36","doi-asserted-by":"crossref","unstructured":"Kreutzer P, Dotzler G, Ring M, Eskofier B M, Philippsen M (2016) Automatic clustering of code changes. In: Proceedings of the 2016 IEEE\/ACM 13th Working Conference on Mining Software Repositories (MSR), pp 61\u201372","DOI":"10.1145\/2901739.2901749"},{"issue":"1","key":"10083_CR37","doi-asserted-by":"publisher","first-page":"159","DOI":"10.2307\/2529310","volume":"33","author":"JR Landis","year":"1977","unstructured":"Landis J R, Koch G G (1977) The measurement of observer agreement for categorical data. Biometrics 33(1):159\u2013174","journal-title":"Biometrics"},{"issue":"12","key":"10083_CR38","doi-asserted-by":"publisher","first-page":"1236","DOI":"10.1109\/TSE.2015.2454513","volume":"41","author":"C Le Goues","year":"2015","unstructured":"Le Goues C, Holtschulte N, Smith E K, Brun Y, Devanbu P, Forrest S, Weimer W (2015) The ManyBugs and IntroClass Benchmarks for Automated Repair of C Programs. IEEE Trans Softw Eng 41(12):1236\u20131256","journal-title":"IEEE Trans Softw Eng"},{"key":"10083_CR39","doi-asserted-by":"publisher","unstructured":"Li Y, Wang S, Nguyen T N (2020) Dlfix: Context-based code transformation learning for automated program repair. In: Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering, ICSE \u201920. https:\/\/doi.org\/10.1145\/3377811.3380345. Association for Computing Machinery, New York, pp 602\u2013614","DOI":"10.1145\/3377811.3380345"},{"issue":"4","key":"10083_CR40","doi-asserted-by":"publisher","first-page":"1936","DOI":"10.1007\/s10664-016-9470-4","volume":"22","author":"M Martinez","year":"2016","unstructured":"Martinez M, Durieux T, Sommerard R, Xuan J, Monperrus M (2016) Automatic repair of real bugs in java: a large-scale experiment on the defects4j dataset. Empir Softw Eng 22(4):1936\u20131964. 
https:\/\/doi.org\/10.1007\/s10664-016-9470-4","journal-title":"Empir Softw Eng"},{"key":"10083_CR41","doi-asserted-by":"crossref","unstructured":"Mills C, Pantiuchina J, Parra E, Bavota G, Haiduc S (2018) Are bug reports enough for text retrieval-based bug localization?. In: Proceedings of the IEEE International Conference on Software Maintenance and Evolution (ICSME), pp 381\u2013392","DOI":"10.1109\/ICSME.2018.00046"},{"key":"10083_CR42","doi-asserted-by":"crossref","unstructured":"Mills C, Parra E, Pantiuchina J, Bavota G, Haiduc S (2020) On the relationship between bug reports and queries for text retrieval-based bug localization. Empir Softw Eng:1\u201342","DOI":"10.1007\/s10664-020-09823-w"},{"key":"10083_CR43","doi-asserted-by":"crossref","unstructured":"Neto E C, da Costa D A, Kulesza U (2018) The impact of refactoring changes on the SZZ algorithm: An empirical study. In: Proceedings of the 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp 380\u2013390","DOI":"10.1109\/SANER.2018.8330225"},{"key":"10083_CR44","doi-asserted-by":"crossref","unstructured":"Nguyen H A, Nguyen A T, Nguyen T N (2013) Filtering noise in mixed-purpose fixing commits to improve defect prediction and localization. In: Proceedings of the 2013 IEEE 24th International Symposium on Software Reliability Engineering (ISSRE). IEEE, pp 138\u2013147","DOI":"10.1109\/ISSRE.2013.6698913"},{"key":"10083_CR45","doi-asserted-by":"crossref","unstructured":"P\u00e2rtachi PP, Dash SK, Allamanis M, Barr ET (2020) Flexeme: Untangling commits using lexical flows. In: Proceedings of the 2020 ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC\/FSE 2020. ACM","DOI":"10.1145\/3368089.3409693"},{"key":"10083_CR46","unstructured":"Patton M Q (2014) Qualitative research & evaluation methods: Integrating theory and practice. 
Sage publications"},{"issue":"5","key":"10083_CR47","doi-asserted-by":"publisher","first-page":"879","DOI":"10.1037\/0021-9010.88.5.879","volume":"88","author":"PM Podsakoff","year":"2003","unstructured":"Podsakoff P M, MacKenzie S B, Lee J-Y, Podsakoff N P (2003) Common method biases in behavioral research: a critical review of the literature and recommended remedies. J Appl Psychol 88(5):879","journal-title":"J Appl Psychol"},{"issue":"2","key":"10083_CR48","doi-asserted-by":"publisher","first-page":"1294","DOI":"10.1007\/s10664-019-09781-y","volume":"25","author":"G Rodr\u00edguez-P\u00e9rez","year":"2020","unstructured":"Rodr\u00edguez-P\u00e9rez G, Robles G, Serebrenik A, Zaidman A, Germ\u00e1n D M, Gonzalez-Barahona J M (2020) How bugs are born: a model to identify how bugs are introduced in software components. Empir Softw Eng 25(2):1294\u20131340. https:\/\/doi.org\/10.1007\/s10664-019-09781-y","journal-title":"Empir Softw Eng"},{"issue":"2","key":"10083_CR49","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1007\/s10664-008-9102-8","volume":"14","author":"P Runeson","year":"2009","unstructured":"Runeson P, H\u00f6st M (2009) Guidelines for conducting and reporting case study research in software engineering. Empir Softw Eng 14(2):131\u2013164","journal-title":"Empir Softw Eng"},{"key":"10083_CR50","doi-asserted-by":"crossref","unstructured":"Saha R, Lyu Y, Lam W, Yoshida H, Prasad M (2018) Bugs.jar: A large-scale, diverse dataset of real-world java bugs. In: Proceedings of the 2018 IEEE\/ACM 15th International Conference on Mining Software Repositories (MSR), pp 10\u201313","DOI":"10.1145\/3196398.3196473"},{"issue":"3\/4","key":"10083_CR51","doi-asserted-by":"publisher","first-page":"591","DOI":"10.2307\/2333709","volume":"52","author":"SS Shapiro","year":"1965","unstructured":"Shapiro S S, Wilk M B (1965) An analysis of variance test for normality (complete samples). 
Biometrika 52(3\/4):591\u2013611","journal-title":"Biometrika"},{"key":"10083_CR52","doi-asserted-by":"crossref","unstructured":"\u015aliwerski J, Zimmermann T, Zeller A (2005) When do changes induce fixes?. In: Proceedings of the 2005 Int. Workshop on Mining Software Repositories (MSR). ACM, pp 1\u20135","DOI":"10.1145\/1082983.1083147"},{"key":"10083_CR53","doi-asserted-by":"crossref","unstructured":"Str\u00fcder S, Mukelabai M, Str\u00fcber D, Berger T (2020) Feature-oriented defect prediction. In: Proceedings of the International Systems and Software Product Line Conference (SPLC). ACM, pp 21:1\u201321:12","DOI":"10.1145\/3382025.3414960"},{"key":"10083_CR54","doi-asserted-by":"crossref","unstructured":"Tao Y, Kim S (2015) Partitioning composite code changes to facilitate code review. In: Proceedings of the 2015 IEEE\/ACM 12th Working Conference on Mining Software Repositories, pp 180\u2013190","DOI":"10.1109\/MSR.2015.24"},{"key":"10083_CR55","doi-asserted-by":"publisher","unstructured":"Trautsch A, Herbold S, Grabowski J (2020) A longitudinal study of static analysis warning evolution and the effects of PMD on software quality in apache open source projects. Empir Softw Eng. https:\/\/doi.org\/10.1007\/s10664-020-09880-1","DOI":"10.1007\/s10664-020-09880-1"},{"key":"10083_CR56","doi-asserted-by":"crossref","unstructured":"Trautsch A, Trautsch F, Herbold S, Ledel B, Grabowski J (2020) The smartshark ecosystem for software repository mining. In: Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering: Companion Proceedings, ICSE \u201920. 
Association for Computing Machinery, New York, pp 25\u201328","DOI":"10.1145\/3377812.3382139"},{"issue":"2","key":"10083_CR57","doi-asserted-by":"publisher","first-page":"1036","DOI":"10.1007\/s10664-017-9537-x","volume":"23","author":"F Trautsch","year":"2018","unstructured":"Trautsch F, Herbold S, Makedonski P, Grabowski J (2018) Addressing problems with replicability and validity of repository mining studies through a smart data platform. Empir Softw Eng 23(2):1036\u20131083","journal-title":"Empir Softw Eng"},{"key":"10083_CR58","doi-asserted-by":"publisher","unstructured":"Tsantalis N, Ketkar A, Dig D (2020) Refactoringminer 2.0. IEEE Trans Softw Eng. https:\/\/doi.org\/10.1109\/TSE.2020.3007722","DOI":"10.1109\/TSE.2020.3007722"},{"key":"10083_CR59","doi-asserted-by":"crossref","unstructured":"Tsantalis N, Mansouri M, Eshkevari L M, Mazinanian D, Dig D (2018) Accurate and efficient refactoring detection in commit history. In: Proceedings of the 40th International Conference on Software Engineering (ICSE). ACM, pp 483\u2013494","DOI":"10.1145\/3180155.3180206"},{"key":"10083_CR60","doi-asserted-by":"publisher","unstructured":"Tufano M, Pantiuchina J, Watson C, Bavota G, Poshyvanyk D (2019) On learning meaningful code changes via neural machine translation. In: Proceedings of the 41st International Conference on Software Engineering, ICSE \u201919. https:\/\/doi.org\/10.1109\/ICSE.2019.00021. IEEE Press, pp 25\u201336","DOI":"10.1109\/ICSE.2019.00021"},{"key":"10083_CR61","doi-asserted-by":"crossref","unstructured":"Wang M, Lin Z, Zou Y, Xie B (2019) Cora: Decomposing and describing tangled code changes for reviewer. In: Proceedings of the 2019 34th IEEE\/ACM International Conference on Automated Software Engineering (ASE), pp 1050\u20131061","DOI":"10.1109\/ASE.2019.00101"},{"key":"10083_CR62","unstructured":"Werbach K, Hunter D (2012) For the win: How game thinking can revolutionize your business. 
Wharton Digital Press"},{"key":"10083_CR63","doi-asserted-by":"crossref","unstructured":"Wohlin C, Runeson P, H\u00f6st M, Ohlsson M C, Regnell B, Wesslen A (2012) Experimentation in Software Engineering. Springer Publishing Company, Incorporated","DOI":"10.1007\/978-3-642-29044-2"},{"key":"10083_CR64","doi-asserted-by":"crossref","unstructured":"Yamashita S, Hayashi S, Saeki M (2020) Changebeadsthreader: An interactive environment for tailoring automatically untangled changes. In: Proceedings of the 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp 657\u2013661","DOI":"10.1109\/SANER48275.2020.9054861"},{"issue":"7","key":"10083_CR65","doi-asserted-by":"publisher","first-page":"739","DOI":"10.1002\/spe.4380210706","volume":"21","author":"W Yang","year":"1991","unstructured":"Yang W (1991) Identifying syntactic differences between two programs. Softw Practice Exper 21 (7):739\u2013755. https:\/\/doi.org\/10.1002\/spe.4380210706","journal-title":"Softw Practice Exper"}],"container-title":["Empirical Software 
Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-021-10083-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10664-021-10083-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-021-10083-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,9,26]],"date-time":"2022-09-26T08:15:16Z","timestamp":1664180116000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10664-021-10083-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,2]]},"references-count":65,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2022,11]]}},"alternative-id":["10083"],"URL":"https:\/\/doi.org\/10.1007\/s10664-021-10083-5","relation":{},"ISSN":["1382-3256","1573-7616"],"issn-type":[{"value":"1382-3256","type":"print"},{"value":"1573-7616","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,2]]},"assertion":[{"value":"11 October 2021","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 July 2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"125"}}