{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,14]],"date-time":"2026-05-14T12:16:27Z","timestamp":1778760987581,"version":"3.51.4"},"reference-count":34,"publisher":"Emerald","issue":"4","license":[{"start":{"date-parts":[[2020,6,29]],"date-time":"2020-06-29T00:00:00Z","timestamp":1593388800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJWIS"],"published-print":{"date-parts":[[2020,6,29]]},"abstract":"<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title>\n<jats:p>This study aims to highlight the challenges and opportunities of using GitHub as a data source in both research and programming education.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title>\n<jats:p>This study provides a general overview of the challenges and opportunities faced while conducting empirical research using GitHub as a data source. The challenges and opportunities are framed using the input\u2013process\u2013output model of open-source software.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Findings<\/jats:title>\n<jats:p>GitHub data accessed from the application programming interface (API) can have several limitations, which can be overcome by Web scraping and using external data repositories such as GHArchive and GHTorrent. There are also several idiosyncrasies about GitHub that researchers need to be aware of to be able to use the data effectively, which can represent an opportunity for research. 
The challenges and opportunities are summarized for the licenses, community, development process and product of free\/libra and open-source software communities hosted on GitHub.<\/jats:p>\n<\/jats:sec>\n<jats:sec>\n<jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title>\n<jats:p>This study provides a summary of GitHub-related challenges and opportunities that researchers can leverage to improve their empirical research. Furthermore, this summary can be a valuable resource for instructors that plan to use GitHub as a data source in their data-focused programming courses.<\/jats:p>\n<\/jats:sec>","DOI":"10.1108\/ijwis-03-2020-0016","type":"journal-article","created":{"date-parts":[[2020,7,3]],"date-time":"2020-07-03T10:37:30Z","timestamp":1593772650000},"page":"451-473","source":"Crossref","is-referenced-by-count":22,"title":["Mining GitHub for research and education: challenges and opportunities"],"prefix":"10.1108","volume":"16","author":[{"given":"Mohammad","family":"AlMarzouq","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Abdullatif","family":"AlZaidan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jehad","family":"AlDallal","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"140","reference":[{"issue":"1","key":"key2020100708375387600_ref001","doi-asserted-by":"publisher","DOI":"10.1108\/IJWIS-06-2019-0030","article-title":"A new algorithm for detecting communities in social networks based on content and structure information","volume":"16","year":"2019","journal-title":"International Journal of Web Information Systems"},{"issue":"2","key":"key2020100708375387600_ref002","first-page":"8","article-title":"A precise method-method interaction-based cohesion metric for object-oriented classes","volume":"21","year":"2012","journal-title":"ACM Transactions on Software Engineering and 
Methodology (TOSEM)"},{"key":"key2020100708375387600_ref003","unstructured":"AlMarzouq, M., AlZaidan, A., and AlDallal, J. (2020), \u201cAn exploration of free\/libra and open source data sources and their use in the field of information systems\u201d, Working Paper, Kuwait University, Kuwait."},{"key":"key2020100708375387600_ref004","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1016\/j.dss.2015.09.004","article-title":"Taxing the development structure of open source communities: an information processing view","volume":"80","year":"2015","journal-title":"Decision Support Systems"},{"issue":"16","key":"key2020100708375387600_ref005","first-page":"756","article-title":"Open source: concepts, benefits, and challenges","volume":"2005","year":"2005","journal-title":"Communications of AIS"},{"issue":"11","key":"key2020100708375387600_ref006","article-title":"Software licenses in context: the challenge of Heterogeneously-Licensed systems","volume":"11","year":"2010","journal-title":"Journal of the Association for Information Systems"},{"key":"key2020100708375387600_ref007","first-page":"150","article-title":"A data programming CS1 course","year":"2015"},{"key":"key2020100708375387600_ref008","first-page":"1","article-title":"The promises and perils of mining git","year":"2009"},{"key":"key2020100708375387600_ref009","first-page":"322","article-title":"How do centralized and distributed version control systems impact software changes?","year":"2014"},{"key":"key2020100708375387600_ref010","volume-title":"The Mythical Man-Month","year":"1975"},{"key":"key2020100708375387600_ref011","volume-title":"Pro Git","year":"2014","edition":"2nd ed."},{"issue":"2","key":"key2020100708375387600_ref012","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1002\/spip.259","article-title":"Information systems success in free and open source software development: theory and measures","volume":"11","year":"2006","journal-title":"Software Process: Improvement and 
Practice"},{"issue":"1","key":"key2020100708375387600_ref013","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1080\/10919392.2015.990775","article-title":"Investigating the interrelationships among success measures of open source software projects","volume":"25","year":"2015","journal-title":"Journal of Organizational Computing and Electronic Commerce"},{"key":"key2020100708375387600_ref014","first-page":"233","article-title":"The GHTorrent dataset and tool suite","volume-title":"Proceedings of the 10th Working Conference on Mining Software Repositories","year":"2013"},{"key":"key2020100708375387600_ref015","first-page":"501","article-title":"Mining software engineering data from GitHub","year":"2017"},{"issue":"7","key":"key2020100708375387600_ref016","doi-asserted-by":"crossref","first-page":"1043","DOI":"10.1287\/mnsc.1060.0550","article-title":"Location, location, location: how network embeddedness affects project success in open source systems","volume":"52","year":"2006","journal-title":"Management Science"},{"issue":"3","key":"key2020100708375387600_ref017","doi-asserted-by":"crossref","first-page":"369","DOI":"10.1287\/isre.1080.0192","article-title":"Emergence of new project teams from open source software developer networks: impact of prior collaboration ties","volume":"19","year":"2008","journal-title":"Information Systems Research"},{"issue":"3","key":"key2020100708375387600_ref018","doi-asserted-by":"crossref","first-page":"17","DOI":"10.4018\/jitwe.2006070102","article-title":"FLOSSmole: a collaborative repository for FLOSS research data and analyses","volume":"1","year":"2006","journal-title":"International Journal of Information Technology and Web Engineering (IJITWE)"},{"issue":"5","key":"key2020100708375387600_ref019","doi-asserted-by":"crossref","first-page":"2035","DOI":"10.1007\/s10664-015-9393-5","article-title":"An in-depth study of the promises and perils of mining GitHub","volume":"21","year":"2016","journal-title":"Empirical 
Software Engineering"},{"key":"key2020100708375387600_ref020","doi-asserted-by":"crossref","first-page":"110394","DOI":"10.1016\/j.jss.2019.110394","article-title":"How does object-oriented code refactoring influence software quality? Research landscape and challenges","volume":"157","year":"2019","journal-title":"Journal of Systems and Software"},{"issue":"4","key":"key2020100708375387600_ref021","first-page":"14:1","article-title":"Peripheral developer participation in open source projects: an empirical analysis","volume":"6","year":"2016","journal-title":"ACM Transactions on Management Information Systems"},{"key":"key2020100708375387600_ref022","unstructured":"Licensing a repository (2020), available at: https:\/\/help.github.com\/articles\/licensing-a-repository\/"},{"issue":"1","key":"key2020100708375387600_ref023","first-page":"229","article-title":"Open source software development experiences on the students\u2019 resumes: do they count?-insights from the employers\u2019 perspectives","volume":"8","year":"2009","journal-title":"Journal of Information Technology Education: Research"},{"issue":"1","key":"key2020100708375387600_ref024","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1145\/174666.174668","article-title":"The interdisciplinary study of coordination","volume":"26","year":"1994","journal-title":"ACM Computing Surveys"},{"key":"key2020100708375387600_ref025","first-page":"364","article-title":"Using software dependencies and churn metrics to predict field failures: an empirical case study","year":"2007"},{"issue":"3","key":"key2020100708375387600_ref026","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1108\/IJWIS-10-2017-0067","article-title":"Identification of substructures in complex networks using formal concept analysis","volume":"14","year":"2018","journal-title":"International Journal of Web Information 
Systems"},{"issue":"1","key":"key2020100708375387600_ref027","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.jsis.2012.07.004","article-title":"The attraction of contributors in free and open source software projects","volume":"22","year":"2013","journal-title":"The Journal of Strategic Information Systems"},{"issue":"6","key":"key2020100708375387600_ref028","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1109\/TSE.2010.81","article-title":"Evaluating complexity, code churn, and developer activity metrics as indicators of software vulnerabilities","volume":"37","year":"2011","journal-title":"IEEE Transactions on Software Engineering"},{"issue":"2","key":"key2020100708375387600_ref029","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1287\/isre.1060.0082","article-title":"Impacts of license choice and organizational sponsorship on user interest and development activity in open source software projects","volume":"17","year":"2006","journal-title":"Information Systems Research"},{"issue":"2","key":"key2020100708375387600_ref030","doi-asserted-by":"crossref","first-page":"291","DOI":"10.2307\/25148732","article-title":"The impact of ideology on effectiveness in open source software development teams","volume":"30","year":"2006","journal-title":"MIS Quarterly"},{"key":"key2020100708375387600_ref031","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1016\/j.dss.2014.05.014","article-title":"Uncovering the relationship between OSS user support networks and OSS popularity","volume":"64","year":"2014","journal-title":"Decision Support Systems"},{"key":"key2020100708375387600_ref032","article-title":"Advances in the sourceforge research data archive","year":"2008"},{"issue":"4","key":"key2020100708375387600_ref033","doi-asserted-by":"crossref","first-page":"1131","DOI":"10.1287\/isre.2013.0479","article-title":"Research note \u2013 the impact of intellectual property rights enforcement on open source software project 
success","volume":"24","year":"2013","journal-title":"Information Systems Research"},{"issue":"1","key":"key2020100708375387600_ref034","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1108\/IJWIS-04-2019-0016","article-title":"A product reputation framework based on social multimedia content","volume":"16","year":"2019","journal-title":"International Journal of Web Information Systems"}],"container-title":["International Journal of Web Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJWIS-03-2020-0016\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/IJWIS-03-2020-0016\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T22:23:50Z","timestamp":1753395830000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/ijwis\/article\/16\/4\/451-473\/165262"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,29]]},"references-count":34,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,6,29]]}},"alternative-id":["10.1108\/IJWIS-03-2020-0016"],"URL":"https:\/\/doi.org\/10.1108\/ijwis-03-2020-0016","relation":{},"ISSN":["1744-0084","1744-0084"],"issn-type":[{"value":"1744-0084","type":"print"},{"value":"1744-0084","type":"print"}],"subject":[],"published":{"date-parts":[[2020,6,29]]}}}