{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T23:58:33Z","timestamp":1769731113728,"version":"3.49.0"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2019,9,11]],"date-time":"2019-09-11T00:00:00Z","timestamp":1568160000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,9,11]],"date-time":"2019-09-11T00:00:00Z","timestamp":1568160000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"crossref","award":["17H00731"],"award-info":[{"award-number":["17H00731"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"crossref","award":["16H05857"],"award-info":[{"award-number":["16H05857"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Empir Software Eng"],"published-print":{"date-parts":[[2020,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n              <jats:p>Automatic identification of the differences between two versions of a file is a common and basic task in several applications of mining code repositories. Git, a version control system, has a diff utility and users can select algorithms of diff from the default algorithm <jats:italic>Myers<\/jats:italic> to the advanced <jats:italic>Histogram<\/jats:italic> algorithm. From our systematic mapping, we identified three popular applications of diff in recent studies. On the impact on code churn metrics in 14 Java projects, we obtained different values in 1.7% to 8.2% commits based on the different diff algorithms. Regarding bug-introducing change identification, we found 6.0% and 13.3% in the identified bug-fix commits had different results of bug-introducing changes from 10 Java projects. For patch application, we found that the <jats:italic>Histogram<\/jats:italic> is more suitable than <jats:italic>Myers<\/jats:italic> for providing the changes of code, from our manual analysis. Thus, we strongly recommend using the <jats:italic>Histogram<\/jats:italic> algorithm when mining Git repositories to consider differences in source code.<\/jats:p>","DOI":"10.1007\/s10664-019-09772-z","type":"journal-article","created":{"date-parts":[[2019,9,11]],"date-time":"2019-09-11T06:52:45Z","timestamp":1568184765000},"page":"790-823","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":43,"title":["How different are different diff algorithms in Git?"],"prefix":"10.1007","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6391-0851","authenticated-orcid":false,"given":"Yusuf Sulistyo","family":"Nugroho","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hideaki","family":"Hata","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kenichi","family":"Matsumoto","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2019,9,11]]},"reference":[{"key":"9772_CR1","doi-asserted-by":"crossref","unstructured":"Barr ET, Brun Y, Devanbu P, Harman M, Sarro F (2014) The plastic surgery hypothesis. In: Proceedings of the 22nd ACM SIGSOFT international symposium on foundations of software engineering, FSE 2014. ACM, New York, pp 306\u2013317","DOI":"10.1145\/2635868.2635898"},{"key":"9772_CR2","unstructured":"Budgen D, Turner M, Brereton P, Kitchenham B (2008) Using mapping studies in software engineering. \nhttp:\/\/citeseerx.ist.psu.edu\/viewdoc\/summary?doi=10.1.1.222.9091"},{"issue":"7","key":"9772_CR3","doi-asserted-by":"publisher","first-page":"641","DOI":"10.1109\/TSE.2016.2616306","volume":"43","author":"DA da Costa","year":"2017","unstructured":"da Costa DA, McIntosh S, Shang W, Kulesza U, Coelho R, Hassan AE (2017) A framework for evaluating the results of the szz approach for identifying bug-introducing changes. IEEE Trans Softw Eng 43 (7):641\u2013657. \nhttps:\/\/doi.org\/10.1109\/TSE.2016.2616306","journal-title":"IEEE Trans Softw Eng"},{"key":"9772_CR4","doi-asserted-by":"crossref","unstructured":"Dotzler G, Philippsen M (2016) Move-optimized source code tree differencing. In: 2016 31st IEEE\/ACM international conference on automated software engineering (ASE), pp 660\u2013671","DOI":"10.1145\/2970276.2970315"},{"key":"9772_CR5","doi-asserted-by":"publisher","unstructured":"Duala-Ekoko E, Robillard MP (2007) Tracking code clones in evolving software. In: Proceedings of the 29th international conference on software engineering, ICSE \u201907. \nhttps:\/\/doi.org\/10.1109\/ICSE.2007.90\n\n. IEEE Computer Society, Washington, pp 158\u2013167","DOI":"10.1109\/ICSE.2007.90"},{"key":"9772_CR6","doi-asserted-by":"publisher","unstructured":"Falleri JR, Morandat F, Blanc X, Martinez M, Monperrus M (2014) Fine-grained and accurate source code differencing. In: Proceedings of the 29th ACM\/IEEE international conference on automated software engineering, ASE \u201914. ACM, New York, pp 313\u2013324. \nhttps:\/\/doi.org\/10.1145\/2642937.2642982\n\n. \nhttp:\/\/doi.acm.org\/10.1145\/2642937.2642982","DOI":"10.1145\/2642937.2642982"},{"issue":"11","key":"9772_CR7","doi-asserted-by":"publisher","first-page":"725","DOI":"10.1109\/TSE.2007.70731","volume":"33","author":"B Fluri","year":"2007","unstructured":"Fluri B, Wuersch M, PInzger M, Gall H (2007) Change distilling: tree differencing for fine-grained source code change extraction. IEEE Trans Softw Eng 33 (11):725\u2013743. \nhttps:\/\/doi.org\/10.1109\/TSE.2007.70731","journal-title":"IEEE Trans Softw Eng"},{"key":"9772_CR8","doi-asserted-by":"publisher","unstructured":"Gousios G, Kalliamvakou E, Spinellis D (2008) Measuring developer contribution from software repository data. In: Proceedings of the 2008 international working conference on mining software repositories, MSR \u201908. ACM, New York, pp 129\u2013132. \nhttps:\/\/doi.org\/10.1145\/1370750.1370781\n\n. \nhttp:\/\/doi.acm.org\/10.1145\/1370750.1370781","DOI":"10.1145\/1370750.1370781"},{"key":"9772_CR9","doi-asserted-by":"publisher","unstructured":"Hashimoto M, Mori A (2008) Diff\/ts: a tool for fine-grained structural change analysis. In: 2008 15th working conference on reverse engineering, pp 279\u2013288. \nhttps:\/\/doi.org\/10.1109\/WCRE.2008.44","DOI":"10.1109\/WCRE.2008.44"},{"key":"9772_CR10","doi-asserted-by":"crossref","unstructured":"Hata H, Mizuno O, Kikuno T (2012) Bug prediction based on fine-grained module histories. In: Proceedings of the 34th international conference on software engineering, ICSE \u201912. \nhttp:\/\/dl.acm.org\/citation.cfm?id=2337223.2337247\n\n. IEEE Press, Piscataway, pp 200\u2013210","DOI":"10.1109\/ICSE.2012.6227193"},{"key":"9772_CR11","doi-asserted-by":"crossref","unstructured":"Higo Y, Ohtani A, Kusumoto S (2017) Generating simpler ast edit scripts by considering copy-and-paste. In: Proceedings of the 32Nd IEEE\/ACM international conference on automated on software engineering, ASE 2017. IEEE Press, Piscataway, pp 532\u2014542","DOI":"10.1109\/ASE.2017.8115664"},{"key":"9772_CR12","doi-asserted-by":"publisher","unstructured":"Huang K, Chen B, Peng X, Zhou D, Wang Y, Liu Y, Zhao W (2018) Cldiff: generating concise linked code differences. In: Proceedings of the 33rd ACM\/IEEE international conference on automated software engineering, ASE 2018. \nhttps:\/\/doi.org\/10.1145\/3238147.3238219\n\n. \nhttp:\/\/doi.acm.org\/10.1145\/3238147.3238219\n\n. ACM, New York, pp 679\u2014690","DOI":"10.1145\/3238147.3238219"},{"key":"9772_CR13","unstructured":"Hunt J, MacIlroy M (1976) An algorithm for differential file comparison. Computing science technical report, Bell Laboratories, \nhttps:\/\/books.google.com\/books?id=zJ2LMwAACAAJ"},{"key":"9772_CR14","doi-asserted-by":"publisher","unstructured":"Kamei Y, Shihab E (2016) Defect prediction: accomplishments and future challenges. In: 2016 IEEE 23rd International conference on software analysis, evolution, and reengineering (SANER), vol 5, pp 33\u201345. \nhttps:\/\/doi.org\/10.1109\/SANER.2016.56","DOI":"10.1109\/SANER.2016.56"},{"key":"9772_CR15","unstructured":"Kavitha R (2009) Collection development in digital libraries: trends and problems. \nhttp:\/\/citeseerx.ist.psu.edu\/viewdoc\/summary?doi=10.1.1.924.7945&rank=1"},{"key":"9772_CR16","doi-asserted-by":"publisher","unstructured":"Kim M, Sazawal V, Notkin D, Murphy G (2005) An empirical study of code clone genealogies. In: Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on foundations of software engineering, ESEC\/FSE-13. ACM, New York, pp 187\u2013196. \nhttps:\/\/doi.org\/10.1145\/1081706.1081737\n\n. \nhttp:\/\/doi.acm.org\/10.1145\/1081706.1081737","DOI":"10.1145\/1081706.1081737"},{"issue":"1","key":"9772_CR17","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1109\/TSE.2012.16","volume":"39","author":"M Kim","year":"2013","unstructured":"Kim M, Notkin D, Grossman D, Wilson G (2013) Identifying and summarizing systematic code changes via rule inference. IEEE Trans Softw Eng 39 (1):45\u201362. \nhttps:\/\/doi.org\/10.1109\/TSE.2012.16","journal-title":"IEEE Trans Softw Eng"},{"issue":"6","key":"9772_CR18","doi-asserted-by":"publisher","first-page":"2852","DOI":"10.1007\/s10664-016-9492-y","volume":"22","author":"M Kuhrmann","year":"2017","unstructured":"Kuhrmann M, Fernandez MD, Daneva M (2017) On the pragmatic design of literature studies in software engineering: an experience-based guideline. Empir Softw Eng 22(6):2852\u20132891. \nhttps:\/\/doi.org\/10.1007\/s10664-016-9492-y","journal-title":"Empir Softw Eng"},{"issue":"3","key":"9772_CR19","doi-asserted-by":"publisher","first-page":"393","DOI":"10.1007\/s11219-014-9241-7","volume":"23","author":"L Madeyski","year":"2015","unstructured":"Madeyski L, Jureczko M (2015) Which process metrics can significantly improve defect prediction models? an empirical study. Softw Qual J 23(3):393\u2013422. \nhttps:\/\/doi.org\/10.1007\/s11219-014-9241-7","journal-title":"Softw Qual J"},{"key":"9772_CR20","doi-asserted-by":"publisher","unstructured":"Meng X, Miller BP, Williams WR, Bernat AR (2013) Mining software repositories for accurate authorship. In: 2013 IEEE international conference on software maintenance, pp 250\u2013259. \nhttps:\/\/doi.org\/10.1109\/ICSM.2013.36","DOI":"10.1109\/ICSM.2013.36"},{"key":"9772_CR21","doi-asserted-by":"publisher","first-page":"251\u2014266","DOI":"10.1007\/BF01840446","volume":"1","author":"EW Myers","year":"1986","unstructured":"Myers EW (1986) An o(nd) difference algorithm and its variations. Algorithmica 1:251\u2014266","journal-title":"Algorithmica"},{"key":"9772_CR22","doi-asserted-by":"publisher","unstructured":"Nagappan N, Ball T (2005) Use of relative code churn measures to predict system defect density. In: Proceedings of the 27th international conference on software engineering, ICSE \u201905. \nhttps:\/\/doi.org\/10.1145\/1062455.1062514\n\n. \nhttp:\/\/doi.acm.org\/10.1145\/1062455.1062514\n\n. ACM, New York, pp 284\u2013292","DOI":"10.1145\/1062455.1062514"},{"key":"9772_CR23","doi-asserted-by":"crossref","unstructured":"Petersen K, Feldt R, Mujtaba S, Mattsson M (2008) Systematic mapping studies in software engineering. In: Proceedings of the 12th international conference on evaluation and assessment in software engineering, BCS Learning & Development Ltd., Swindon, UK, EASE\u201908, pp 68\u201377. \nhttp:\/\/dl.acm.org\/citation.cfm?id=2227115.2227123","DOI":"10.14236\/ewic\/EASE2008.8"},{"key":"9772_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.infsof.2015.03.007","volume":"64","author":"K Petersen","year":"2015","unstructured":"Petersen K, Vakkalanka S, Kuzniarz L (2015) Guidelines for conducting systematic mapping studies in software engineering: an update. Inf Softw Technol 64:1\u201318. \nhttps:\/\/doi.org\/10.1016\/j.infsof.2015.03.007\n\n. \nhttp:\/\/www.sciencedirect.com\/science\/article\/pii\/S0950584915000646","journal-title":"Inf Softw Technol"},{"key":"9772_CR25","doi-asserted-by":"publisher","unstructured":"Rahman F, Devanbu P (2011) Ownership, experience and defects: a fine-grained study of authorship. In: Proceedings of the 33rd international conference on software engineering, ICSE \u201911. \nhttps:\/\/doi.org\/10.1145\/1985793.1985860\n\n. \nhttp:\/\/doi.acm.org\/10.1145\/1985793.1985860\n\n. ACM, New York, pp 491\u2013500","DOI":"10.1145\/1985793.1985860"},{"key":"9772_CR26","doi-asserted-by":"publisher","unstructured":"Rausch T, Hummer W, Leitner P, Schulte S (2017) An empirical analysis of build failures in the continuous integration workflows of java-based open-source software. In: Proceedings of the 14th international conference on mining software repositories, MSR \u201917. IEEE Press, Piscataway, pp 345\u2013355, DOI \nhttps:\/\/doi.org\/10.1109\/MSR.2017.54","DOI":"10.1109\/MSR.2017.54"},{"key":"9772_CR27","doi-asserted-by":"crossref","unstructured":"Ray B, Nagappan M, Bird C, Nagappan N, Zimmermann T (2015) The uniqueness of changes: characteristics and applications. In: Proceedings of the 12th working conference on mining software repositories, MSR\u201915. \nhttp:\/\/dl.acm.org\/citation.cfm?id=2820518.2820526\n\n. IEEE Press, Piscataway, pp 34\u201344","DOI":"10.1109\/MSR.2015.11"},{"key":"9772_CR28","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1016\/j.infsof.2018.03.009","volume":"99","author":"G Rodriguez-Perez","year":"2018","unstructured":"Rodriguez-Perez G, Robles G, Gonzalez-Barahona JM (2018) Reproducibility and credibility in empirical software engineering: a case study based on a systematic literature review of the use of the szz algorithm. Inf Softw Technol 99:164\u2013176","journal-title":"Inf Softw Technol"},{"issue":"6","key":"9772_CR29","doi-asserted-by":"publisher","first-page":"772","DOI":"10.1109\/TSE.2010.81","volume":"37","author":"Y Shin","year":"2011","unstructured":"Shin Y, Meneely A, Williams L, Osborne JA (2011) Evaluating complexity, code churn, and developer activity metrics as indicators of software vulnerabilities. IEEE Trans Softw Eng 37(6):772\u2013787. \nhttps:\/\/doi.org\/10.1109\/TSE.2010.81","journal-title":"IEEE Trans Softw Eng"},{"issue":"4","key":"9772_CR30","doi-asserted-by":"publisher","first-page":"1\u20145","DOI":"10.1145\/1082983.1083147","volume":"30","author":"J \u015aliwerski","year":"2005","unstructured":"\u015aliwerski J, Zimmermann T, Zeller A (2005) When do changes induce fixes SIGSOFT Software Engineering Notes 30(4):1\u20145. \nhttps:\/\/doi.org\/10.1145\/1082983.1083147\n\n. \nhttp:\/\/doi.acm.org\/10.1145\/1082983.1083147","journal-title":"SIGSOFT Software Engineering Notes"},{"issue":"5","key":"9772_CR31","first-page":"360\u2014363","volume":"37","author":"A Viera","year":"2005","unstructured":"Viera A, Garrett J (2005) Understanding interobserver agreement: the kappa statistic. Fam Med 37(5):360\u2014363","journal-title":"Fam Med"},{"issue":"10","key":"9772_CR32","doi-asserted-by":"publisher","first-page":"2594","DOI":"10.1016\/j.jss.2013.04.076","volume":"86","author":"C Wohlin","year":"2013","unstructured":"Wohlin C, Runeson P, da Mota Silveira Neto PA, Engstrom E, do Carmo Machado I, de Almeida ES (2013) On the reliability of mapping studies in software engineering. Journal of Systems and Software 86(10):2594\u20132610. \nhttps:\/\/doi.org\/10.1016\/j.jss.2013.04.076\n\n. \nhttp:\/\/www.sciencedirect.com\/science\/article\/pii\/S0164121213001234","journal-title":"Journal of Systems and Software"}],"container-title":["Empirical Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-019-09772-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10664-019-09772-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-019-09772-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,9,10]],"date-time":"2020-09-10T18:34:58Z","timestamp":1599762898000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10664-019-09772-z"}},"subtitle":["Use --histogram for code changes"],"short-title":[],"issued":{"date-parts":[[2019,9,11]]},"references-count":32,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,1]]}},"alternative-id":["9772"],"URL":"https:\/\/doi.org\/10.1007\/s10664-019-09772-z","relation":{},"ISSN":["1382-3256","1573-7616"],"issn-type":[{"value":"1382-3256","type":"print"},{"value":"1573-7616","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,9,11]]},"assertion":[{"value":"11 September 2019","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}