{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T02:11:32Z","timestamp":1775873492010,"version":"3.50.1"},"reference-count":26,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2020,9,21]],"date-time":"2020-09-21T00:00:00Z","timestamp":1600646400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100006109","name":"Vedeck\u00e1 Grantov\u00e1 Agent\u00fara M\u0160VVa\u0160 SR a SAV","doi-asserted-by":"publisher","award":["1\/0762\/19"],"award-info":[{"award-number":["1\/0762\/19"]}],"id":[{"id":"10.13039\/501100006109","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>When a person decides to inspect or modify a third-party software project, the first necessary step is its successful compilation from source code using a build system. However, such attempts often end in failure. In this data descriptor paper, we provide a dataset of build results of open source Java software systems. We tried to automatically build a large number of Java projects from GitHub using their Maven, Gradle, and Ant build scripts in a Docker container simulating a standard programmer\u2019s environment. The dataset consists of the output of two executions: 7264 build logs from a study executed in 2016 and 7233 logs from the 2020 execution. In addition to the logs, we collected exit codes, file counts, and various project metadata. The proportion of failed builds in our dataset is 38% in the 2016 execution and 59% in the 2020 execution. The published data can be helpful for multiple purposes, such as correlation analysis of factors affecting build success, build failure prediction, and research in the area of build breakage repair.<\/jats:p>","DOI":"10.3390\/data5030086","type":"journal-article","created":{"date-parts":[[2020,9,21]],"date-time":"2020-09-21T21:01:21Z","timestamp":1600722081000},"page":"86","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["Large-Scale Dataset of Local Java Software Build Results"],"prefix":"10.3390","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2221-9225","authenticated-orcid":false,"given":"Mat\u00fa\u0161","family":"Sul\u00edr","sequence":"first","affiliation":[{"name":"Department of Computers and Informatics, Faculty of Electrical Engineering and Informatics, Technical University of Ko\u0161ice, Letn\u00e1 9, 042 00 Ko\u0161ice, Slovakia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9436-8425","authenticated-orcid":false,"given":"Michaela","family":"Ba\u010d\u00edkov\u00e1","sequence":"additional","affiliation":[{"name":"Department of Computers and Informatics, Faculty of Electrical Engineering and Informatics, Technical University of Ko\u0161ice, Letn\u00e1 9, 042 00 Ko\u0161ice, Slovakia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8197-1962","authenticated-orcid":false,"given":"Matej","family":"Madeja","sequence":"additional","affiliation":[{"name":"Department of Computers and Informatics, Faculty of Electrical Engineering and Informatics, Technical University of Ko\u0161ice, Letn\u00e1 9, 042 00 Ko\u0161ice, Slovakia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9293-0859","authenticated-orcid":false,"given":"Sergej","family":"Chodarev","sequence":"additional","affiliation":[{"name":"Department of Computers and Informatics, Faculty of Electrical Engineering and Informatics, Technical University of Ko\u0161ice, Letn\u00e1 9, 042 00 Ko\u0161ice, Slovakia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8199-5112","authenticated-orcid":false,"given":"J\u00e1n","family":"Juh\u00e1r","sequence":"additional","affiliation":[{"name":"Department of Computers and Informatics, Faculty of Electrical Engineering and Informatics, Technical University of Ko\u0161ice, Letn\u00e1 9, 042 00 Ko\u0161ice, Slovakia"}]}],"member":"1968","published-online":{"date-parts":[[2020,9,21]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Neitsch, A., Wong, K., and Godfrey, M. (2012, January 23\u201328). Build system issues in multilanguage software. Proceedings of the 2012 28th IEEE International Conference on Software Maintenance (ICSM), Trento, Italy.","DOI":"10.1109\/ICSM.2012.6405265"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Kerzazi, N., Khomh, F., and Adams, B. (October, January 29). Why Do Automated Builds Break? An Empirical Study. Proceedings of the 2014 IEEE International Conference on Software Maintenance and Evolution (ICSME), Victoria, BC, Canada.","DOI":"10.1109\/ICSME.2014.26"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Seo, H., Sadowski, C., Elbaum, S., Aftandilian, E., and Bowdidge, R. (2014\u20137, January 31). Programmers\u2019 Build Errors: A Case Study (at Google). Proceedings of the 36th International Conference on Software Engineering, Hyderabad, India.","DOI":"10.1145\/2568225.2568255"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"e1838","DOI":"10.1002\/smr.1838","article-title":"There and back again: Can you compile that snapshot?","volume":"29","author":"Tufano","year":"2017","journal-title":"J. Softw. Evol. Process."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Rabbani, N., Harvey, M.S., Saquif, S., Gallaba, K., and McIntosh, S. (June, January 27). Revisiting \u201cProgrammers\u2019 Build Errors\u201d in the Visual Studio Context: A Replication Study Using IDE Interaction Traces. Proceedings of the 2018 IEEE\/ACM 15th International Conference on Mining Software Repositories (MSR), Gothenburg, Sweden.","DOI":"10.1145\/3196398.3196469"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Horton, E., and Parnin, C. (2018, January 23\u201329). Gistable: Evaluating the Executability of Python Code Snippets on GitHub. Proceedings of the 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME), Madrid, Spain.","DOI":"10.1109\/ICSME.2018.00031"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Rausch, T., Hummer, W., Leitner, P., and Schulte, S. (2017, January 20\u201328). An Empirical Analysis of Build Failures in the Continuous Integration Workflows of Java-Based Open-Source Software. Proceedings of the 14th International Conference on Mining Software Repositories, MSR \u201917, Buenos Aires, Argentina.","DOI":"10.1109\/MSR.2017.54"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"3933","DOI":"10.1007\/s10664-019-09709-6","article-title":"A study of build inflation in 30 million CPAN builds on 13 Perl versions and 10 operating systems","volume":"24","author":"Zolfagharinia","year":"2019","journal-title":"Empir. Softw. Eng."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"2102","DOI":"10.1007\/s10664-019-09695-9","article-title":"An Empirical Study of the Long Duration of Continuous Integration Builds","volume":"24","author":"Ghaleb","year":"2019","journal-title":"Empir. Softw. Eng."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhang, C., Chen, B., Chen, L., Peng, X., and Zhao, W. (2019, January 12). A large-scale empirical study of compiler errors in continuous integration. Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Tallinn, Estonia.","DOI":"10.1145\/3338906.3338917"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Beller, M., Gousios, G., and Zaidman, A. (2017, January 20\u201328). Oops, my tests broke the build: An explorative analysis of travis CI with GitHub. Proceedings of the 14th International Conference on Mining Software Repositories, MSR \u201917, Buenos Aires, Argentina.","DOI":"10.1109\/MSR.2017.62"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Madeyski, L., and Kawalerowicz, M. (2017, January 20\u201328). Continuous defect prediction: The idea and a related dataset. Proceedings of the 14th International Conference on Mining Software Repositories, MSR \u201917, Buenos Aires, Argentina.","DOI":"10.1109\/MSR.2017.46"},{"key":"ref_13","unstructured":"Brandt, C.E., Panichella, A., Zaidman, A., and Beller, M. (2017, January 20\u201328). LogChunks: A data set for build log analysis. Proceedings of the 17th International Conference on Mining Software Repositories, Seoul, Korea."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Sul\u00edr, M., and Porub\u00e4n, J. (2016, January 1). A quantitative study of java software buildability. Proceedings of the 7th International Workshop on Evaluation and Usability of Programming Languages and Tools, Amsterdam, The Netherlands.","DOI":"10.1145\/3001878.3001882"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Spolsky, J. (2004). The Joel Test: 12 Steps to Better Code. Joel on Software: And on Diverse and Occasionally Related Matters That Will Prove of Interest to Software Developers, Designers, and Managers, and to Those Who, Whether by Good Fortune or Ill Luck, Work with Them in Some Capacity, Apress.","DOI":"10.1007\/978-1-4302-0753-5_3"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"247","DOI":"10.2298\/CSIS1002247K","article-title":"Comparing general-purpose and domain-specific languages: An empirical study","volume":"7","author":"Kosar","year":"2010","journal-title":"Comput. Sci. Inf. Syst."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Gallaba, K., Macho, C., Pinzger, M., and McIntosh, S. (2018, January 3). Noise and heterogeneity in historical build data: An empirical study of travis CI. Proceedings of the 33rd ACM\/IEEE International Conference on Automated Software Engineering, Montpellier, France.","DOI":"10.1145\/3238147.3238171"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Ghaleb, T.A., da Costa, D.A., Zou, Y., and Hassan, A.E. (2019). Studying the Impact of Noises in Build Breakage Data. IEEE Trans. Softw. Eng., 1\u201314.","DOI":"10.1109\/TSE.2019.2941880"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Hassan, F., and Wang, X. (2017, January 9\u201310). Change-Aware Build Prediction Model for Stall Avoidance in Continuous Integration. Proceedings of the 11th ACM\/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM \u201917, Markham, ON, Canada.","DOI":"10.1109\/ESEM.2017.23"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ni, A., and Li, M. (2017, January 20\u201328). Cost-effective build outcome prediction using cascaded classifiers. Proceedings of the 14th International Conference on Mining Software Repositories, MSR \u201917, Buenos Aires, Argentina.","DOI":"10.1109\/MSR.2017.26"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"2218","DOI":"10.1007\/s10664-019-09765-y","article-title":"Every build you break: Developer-oriented assistance for build failure resolution","volume":"25","author":"Vassallo","year":"2020","journal-title":"Empir. Softw. Eng."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Hassan, F., Mostafa, S., Lam, E.S.L., and Wang, X. (2017, January 9\u201310). Automatic building of java projects in software repositories: A study on feasibility and challenges. Proceedings of the 2017 ACM\/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), Toronto, ON, Canada.","DOI":"10.1109\/ESEM.2017.11"},{"key":"ref_23","unstructured":"Hassan, F., and Wang, X. (June, January 27). HireBuild: An automatic approach to history-driven repair of build scripts. Proceedings of the 40th International Conference on Software Engineering, Gothenburg, Sweden."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Macho, C., McIntosh, S., and Pinzger, M. (2018, January 20\u201323). Automatically repairing dependency-related build breakage. Proceedings of the 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), Campobasso, Italy.","DOI":"10.1109\/SANER.2018.8330201"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Kostelansk\u00fd, J., and Dedera, \u013d. (2017, January 4\u20136). An evaluation of output from current Java bytecode decompilers: Is it Android which is responsible for such quality boost?. Proceedings of the 2017 Communication and Information Technologies (KIT), Vysoke Tatry, Slovakia.","DOI":"10.23919\/KIT.2017.8109451"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Lukovi\u0107, I. (2020). Issues and Lessons Learned in the Development of Academic Study Programs in Data Science. Data Analytics and Management in Data Intensive Domains, Springer International Publishing. DAMDID\/RCDL 2019.","DOI":"10.1007\/978-3-030-51913-1_15"}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/5\/3\/86\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:12:09Z","timestamp":1760177529000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/5\/3\/86"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,21]]},"references-count":26,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2020,9]]}},"alternative-id":["data5030086"],"URL":"https:\/\/doi.org\/10.3390\/data5030086","relation":{},"ISSN":["2306-5729"],"issn-type":[{"value":"2306-5729","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,21]]}}}