{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,20]],"date-time":"2026-05-20T07:05:36Z","timestamp":1779260736868,"version":"3.51.4"},"reference-count":43,"publisher":"MIT Press","issue":"2","license":[{"start":{"date-parts":[[2021,2,17]],"date-time":"2021-02-17T00:00:00Z","timestamp":1613520000000},"content-version":"vor","delay-in-days":413,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Universities are increasingly evaluated on the basis of their outputs. These are often converted to simple and contested rankings with substantial implications for recruitment, income, and perceived prestige. Such evaluation usually relies on a single data source to define the set of outputs for a university. However, few studies have explored differences across data sources and their implications for metrics and rankings at the institutional scale. We address this gap by performing detailed bibliographic comparisons between Web of Science (WoS), Scopus, and Microsoft Academic (MSA) at the institutional level and supplement this with a manual analysis of 15 universities. We further construct two simple rankings based on citation count and open access status. Our results show that there are significant differences across databases. These differences contribute to drastic changes in rank positions of universities, which are most prevalent for non-English-speaking universities and those outside the top positions in international university rankings. Overall, MSA has greater coverage than Scopus and WoS, but with less complete affiliation metadata. 
We suggest that robust evaluation measures need to consider the effect of choice of data sources and recommend an approach where data from multiple sources is integrated to provide a more robust data set.<\/jats:p>","DOI":"10.1162\/qss_a_00031","type":"journal-article","created":{"date-parts":[[2020,3,25]],"date-time":"2020-03-25T13:06:31Z","timestamp":1585141591000},"page":"445-478","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":61,"title":["Comparison of bibliographic data sources: Implications for the robustness of university rankings"],"prefix":"10.1162","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9656-5932","authenticated-orcid":false,"given":"Chun-Kai (Karl)","family":"Huang","sequence":"first","affiliation":[{"name":"Centre for Culture and Technology, Curtin University, Bentley 6102, Western Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0068-716X","authenticated-orcid":false,"given":"Cameron","family":"Neylon","sequence":"additional","affiliation":[{"name":"Centre for Culture and Technology, Curtin University, Bentley 6102, Western Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5370-391X","authenticated-orcid":false,"given":"Chloe","family":"Brookes-Kenworthy","sequence":"additional","affiliation":[{"name":"Centre for Culture and Technology, Curtin University, Bentley 6102, Western Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8288-5241","authenticated-orcid":false,"given":"Richard","family":"Hosking","sequence":"additional","affiliation":[{"name":"Centre for Culture and Technology, Curtin University, Bentley 6102, Western 
Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6551-8140","authenticated-orcid":false,"given":"Lucy","family":"Montgomery","sequence":"additional","affiliation":[{"name":"Centre for Culture and Technology, Curtin University, Bentley 6102, Western Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8705-1027","authenticated-orcid":false,"given":"Katie","family":"Wilson","sequence":"additional","affiliation":[{"name":"Centre for Culture and Technology, Curtin University, Bentley 6102, Western Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6813-8362","authenticated-orcid":false,"given":"Alkim","family":"Ozaygen","sequence":"additional","affiliation":[{"name":"Centre for Culture and Technology, Curtin University, Bentley 6102, Western Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","published-online":{"date-parts":[[2020,6,1]]},"reference":[{"key":"2025073014025537800_bib1","doi-asserted-by":"crossref","unstructured":"Anderson,  M. S., Ronning,  E. A., De Vries,  R., & Martinson,  B. C. (2007). The perverse effects of competition on scientists\u2019 work and relationships. Science and Engineering Ethics, 13(4), 437\u2013461. https:\/\/doi.org\/10.1007\/s11948-007-9042-5","DOI":"10.1007\/s11948-007-9042-5"},{"key":"2025073014025537800_bib2","doi-asserted-by":"crossref","unstructured":"Archambault,  E., Campbell,  D., Gingras,  Y., & Larivi\u00e8re,  V. (2009). Comparing bibliometric statistics obtained from the Web of Science and Scopus. Journal of the American Society for Information Science and Technology, 60(7), 1320\u20131326. https:\/\/doi.org\/10.1002\/asi.21062","DOI":"10.1002\/asi.21062"},{"key":"2025073014025537800_bib3","doi-asserted-by":"crossref","unstructured":"Bakkalbasi,  N., Bauer,  K., Glover,  J., & Wang,  L. (2006). 
Three options for citation tracking: Google Scholar, Scopus and Web of Science. Biomedical Digital Libraries, 3(7), 1\u20138. https:\/\/doi.org\/10.1186\/1742-5581-3-7","DOI":"10.1186\/1742-5581-3-7"},{"key":"2025073014025537800_bib4","doi-asserted-by":"crossref","unstructured":"Bar-Ilan,  J.\n           (2008). Which h-index? A comparison of WoS, Scopus and Google Scholar. Scientometrics, 74(2), 257\u2013271. https:\/\/doi.org\/10.1007\/s11192-008-0216-y","DOI":"10.1007\/s11192-008-0216-y"},{"key":"2025073014025537800_bib5","unstructured":"Brookes-Kenworthy,  C., Huang,  C.-K., Neylon,  C., Wilson,  K., Ozaygen,  A., Montgomery,  L., & Hosking,  R. (2019). Manual cross-validation data for the article: \u201cComparison of bibliographic data sources: Implications for the robustness of university rankings\u201d [Data set]. Zenodo. http:\/\/doi.org\/10.5281\/zenodo.3379703"},{"key":"2025073014025537800_bib6","doi-asserted-by":"crossref","unstructured":"De Domenico,  M., Omodei,  E., & Arenas,  A. (2016). Quantifying the diaspora of knowledge in the past century. Applied Network Science, 1(15), 1\u201313. https:\/\/doi.org\/10.1007\/s41109-016-0017-9","DOI":"10.1007\/s41109-016-0017-9"},{"key":"2025073014025537800_bib7","doi-asserted-by":"crossref","unstructured":"Donner,  P., Rimmert,  C., & Van Eck,  N. J. (2020). Comparing institutional-level bibliometric research performance indicator values based on different affiliation disambiguation systems. Quantitative Science Studies, 1(1), 150\u2013170. https:\/\/doi.org\/10.1162\/qss_a_00013","DOI":"10.1162\/qss_a_00013"},{"key":"2025073014025537800_bib8","doi-asserted-by":"crossref","unstructured":"Effendy,  S., & Yap,  R. H. C. (2017). Analysing trends in computer science research: A preliminary study using the Microsoft Academic Graph. Proceedings of the 26th International Conference on World Wide Web Companion, 1245\u20131250. 
https:\/\/doi.org\/10.1145\/3041021.3053064","DOI":"10.1145\/3041021.3053064"},{"key":"2025073014025537800_bib9","doi-asserted-by":"crossref","unstructured":"Falagas,  M. E., Pitsouni,  E. I., Malietzis,  G. A., & Pappas,  G. (2008). Comparison of PubMed, Scopus, Web of Science, and Google Scholar: Strengths and weaknesses. FASEB Journal, 22(2), 338\u2013342. https:\/\/doi.org\/10.1096\/fj.07-9492LSF","DOI":"10.1096\/fj.07-9492LSF"},{"key":"2025073014025537800_bib10","doi-asserted-by":"crossref","unstructured":"Fanelli,  D.\n           (2010). Do pressures to publish increase scientists\u2019 bias? An empirical support from US states data. PLOS One, 5(4), e10271. https:\/\/doi.org\/10.1371\/journal.pone.0010271","DOI":"10.1371\/journal.pone.0010271"},{"key":"2025073014025537800_bib43","doi-asserted-by":"crossref","unstructured":"Franceschini,  F., Maisano,  D., & Mastrogiacomo,  L. (2016). Empirical analysis and classification of database errors in Scopus and Web of Science. Journal of Informetrics, 10(4), 933\u2013953. https:\/\/doi.org\/10.1016\/j.joi.2016.07.003","DOI":"10.1016\/j.joi.2016.07.003"},{"key":"2025073014025537800_bib11","doi-asserted-by":"crossref","unstructured":"Giles,  C. L., Bollacker,  K., & Lawrence,  S. (1998). CiteSeer: An automatic citation indexing system. DL\u201998 Digital Libraries, 3rd ACM Conference on Digital Libraries, 89\u201398. https:\/\/doi.org\/10.1145\/276675.276685","DOI":"10.1145\/276675.276685"},{"key":"2025073014025537800_bib12","doi-asserted-by":"crossref","unstructured":"Gorraiz,  J., Melero-Fuentes,  D., Gumpenberger,  C., & Valderrama-Zuri\u00e1n,  J.-C. (2016). Availability of digital object identifiers (DOIs) in Web of Science and Scopus. Journal of Informetrics, 10(1), 98\u2013109. https:\/\/doi.org\/10.1016\/j.joi.2015.11.008","DOI":"10.1016\/j.joi.2015.11.008"},{"key":"2025073014025537800_bib13","doi-asserted-by":"crossref","unstructured":"Gusenbauer,  M.\n           (2019). Google Scholar to overshadow them all? 
Comparing the sizes of 12 academic search engines and bibliographic databases. Scientometrics, 118(1), 177\u2013214. https:\/\/doi.org\/10.1007\/s11192-018-2958-5","DOI":"10.1007\/s11192-018-2958-5"},{"key":"2025073014025537800_bib14","doi-asserted-by":"crossref","unstructured":"Harzing,  A. W.\n           (2016). Microsoft Academic (Search): A Phoenix arisen from the ashes? Scientometrics, 108(3), 1637\u20131647. https:\/\/doi.org\/10.1007\/s11192-016-2026-y","DOI":"10.1007\/s11192-016-2026-y"},{"key":"2025073014025537800_bib15","doi-asserted-by":"crossref","unstructured":"Harzing,  A. W., & Alakangas,  S. (2016). Google Scholar, Scopus and the Web of Science: a longitudinal and cross-disciplinary comparison. Scientometrics, 106(2), 787\u2013804. https:\/\/doi.org\/10.1007\/s11192-015-1798-9","DOI":"10.1007\/s11192-015-1798-9"},{"key":"2025073014025537800_bib16","doi-asserted-by":"crossref","unstructured":"Harzing,  A. W., & Alakangas,  S. (2017a). Microsoft Academic: is the phoenix getting wings? Scientometrics, 110(1), 371\u2013383. https:\/\/doi.org\/10.1007\/s11192-016-2185-x","DOI":"10.1007\/s11192-016-2185-x"},{"key":"2025073014025537800_bib17","doi-asserted-by":"crossref","unstructured":"Harzing,  A. W., & Alakangas,  S. (2017b). Microsoft Academic is one year old: The phoenix is ready to leave the nest. Scientometrics, 112(3), 1887\u20131894. https:\/\/doi.org\/10.1007\/s11192-017-2454-3","DOI":"10.1007\/s11192-017-2454-3"},{"key":"2025073014025537800_bib18","doi-asserted-by":"crossref","unstructured":"Hazelkorn,  E.\n           (2007). The impact of league tables and ranking systems on higher education decision making. Higher Education Management and Policy, 19(2), 1\u201324. https:\/\/doi.org\/10.1787\/hemp-v19-art12-en","DOI":"10.1787\/hemp-v19-art12-en"},{"key":"2025073014025537800_bib19","doi-asserted-by":"crossref","unstructured":"Herrmannova,  D., & Knoth,  P. (2016). An analysis of the Microsoft Academic Graph. D-Lib Magazine, 22(9\/10). 
https:\/\/doi.org\/10.1045\/september2016-herrmannova","DOI":"10.1045\/september2016-herrmannova"},{"key":"2025073014025537800_bib20","doi-asserted-by":"crossref","unstructured":"Huang,  C.-K., Neylon,  C., Brookes-Kenworthy,  C., Hosking,  R., Montgomery,  L., Wilson,  K., & Ozaygen,  A. (2019). Codes and data for the article: Comparison of bibliographic data sources: Implications for the robustness of university rankings. Zenodo. https:\/\/doi.org\/10.5281\/zenodo.3541520","DOI":"10.1101\/750075"},{"key":"2025073014025537800_bib21","doi-asserted-by":"crossref","unstructured":"Hug,  S. E., & Br\u00e4ndle,  M. P. (2017). The coverage of Microsoft Academic: analyzing the publication output of a university. Scientometrics, 113(3), 1551\u20131571. https:\/\/doi.org\/10.1007\/s11192-017-2535-3","DOI":"10.1007\/s11192-017-2535-3"},{"key":"2025073014025537800_bib22","doi-asserted-by":"crossref","unstructured":"Hug,  S. E., Ochsner,  M., & Br\u00e4ndle,  M. P. (2017). Citation analysis with Microsoft Academic. Scientometrics, 111(1), 371\u2013378. https:\/\/doi.org\/10.1007\/s11192-017-2247-8","DOI":"10.1007\/s11192-017-2247-8"},{"key":"2025073014025537800_bib23","unstructured":"Jacs\u00f3,  P.\n           (2005). As we may search \u2013 Comparison of major features of the Web of Science, Scopus, and Google Scholar citation-based and citation-enhanced databases. Current Science, 89(9), 1537\u20131547. https:\/\/www.jstor.org\/stable\/24110924"},{"key":"2025073014025537800_bib24","doi-asserted-by":"crossref","unstructured":"Kulkarni,  A. V., Aziz,  B., Shams,  I., & Busse,  J. W. (2009). Comparisons of citations in Web of Science, Scopus, and Google Scholar for articles published in general medical journals. Journal of American Medical Association, 302(10), 1092\u20131096. 
http:\/\/doi.org\/10.1001\/jama.2009.1307","DOI":"10.1001\/jama.2009.1307"},{"key":"2025073014025537800_bib25","doi-asserted-by":"crossref","unstructured":"Mart\u00edn-Mart\u00edn,  A., Orduna-Malea,  E., Thelwall,  M., & Delgado L\u00f3pez-C\u00f3zar,  E. (2018). Google Scholar, Web of Science, and Scopus: A systematic comparison of citations in 252 subject categories. Journal of Informetrics, 12(4), 1160\u20131177. https:\/\/doi.org\/10.1016\/j.joi.2018.09.002","DOI":"10.1016\/j.joi.2018.09.002"},{"key":"2025073014025537800_bib26","doi-asserted-by":"crossref","unstructured":"Mongeon,  P., & Paul-Hus,  A. (2016). The journal coverage of Web of Science and Scopus: A comparative analysis. Scientometrics, 106(1), 213\u2013228. https:\/\/doi.org\/10.1007\/s11192-015-1765-5","DOI":"10.1007\/s11192-015-1765-5"},{"key":"2025073014025537800_bib27","doi-asserted-by":"crossref","unstructured":"Moore,  S., Neylon,  C., Eve,  M. P., O\u2019Donnell,  D. P., & Pattinson,  D. (2017). \u201cExcellence R Us\u201d: University research and the fetishisation of excellence. Palgrave Communications, 3, 16105. https:\/\/doi.org\/10.1057\/palcomms.2016.105","DOI":"10.1057\/palcomms.2016.105"},{"key":"2025073014025537800_bib28","doi-asserted-by":"crossref","unstructured":"Neylon,  C., & Wu,  S. (2009). Article-level metrics and the evolution of scientific impact. PLOS Biology, 7(11), e1000242. https:\/\/doi.org\/10.1371\/journal.pbio.1000242","DOI":"10.1371\/journal.pbio.1000242"},{"key":"2025073014025537800_bib29","doi-asserted-by":"crossref","unstructured":"Norlander,  B., Li,  P., & West,  J. D. (2018). Estimating article influence scores for open access journals. PeerJ Preprints, 6, e26586v1. https:\/\/doi.org\/10.7287\/peerj.preprints.26586v1","DOI":"10.7287\/peerj.preprints.26586v1"},{"key":"2025073014025537800_bib30","unstructured":"Paszcza,  B.\n           (2016). Comparison of Microsoft Academic (Graph) with Web of Science, Scopus and Google Scholar. Master\u2019s thesis. 
University of Southampton."},{"key":"2025073014025537800_bib31","doi-asserted-by":"crossref","unstructured":"Portenoy,  J., Hullman,  J., & West,  J. D. (2016). Leveraging citation networks to visualize scholarly influence over time. Frontiers in Research Metrics and Analytics, 2, 8. https:\/\/doi.org\/10.3389\/frma.2017.00008","DOI":"10.3389\/frma.2017.00008"},{"key":"2025073014025537800_bib32","doi-asserted-by":"crossref","unstructured":"Portenoy,  J., & West,  J. D. (2017). Visualizing scholarly publications and citations to enhance author profiles. Proceedings of the 26th International Conference on World Wide Web Companion, 1279\u20131282. https:\/\/doi.org\/10.1145\/3041021.3053058","DOI":"10.1145\/3041021.3053058"},{"key":"2025073014025537800_bib33","unstructured":"Ranjbar-Sahraei,  B., van Eck,  N. J., & de Jong,  R. (2018). Accuracy of affiliation information in Microsoft Academic: Implications for institutional level research evaluation. Proceedings of the 23rd International Conference on Science and Technology Indicators, 1065\u20131067. https:\/\/openaccess.leidenuniv.nl\/handle\/1887\/65339"},{"key":"2025073014025537800_bib34","unstructured":"Sandulescu,  V., & Chiru,  M. (2016). Predicting the future relevance of research institutions \u2013 The winning solution of the KDD Cup 2016. arXiv:1609.02728v1. https:\/\/arxiv.org\/abs\/1609.02728v1"},{"key":"2025073014025537800_bib35","doi-asserted-by":"crossref","unstructured":"Shin,  J. C., & Toutkoushian,  R. K. (2011). The past, present, and future of university rankings. In: Shin J., Toutkoushian R., & Teichler U. (Eds.), University Rankings. The Changing Academy \u2013 The Changing Academic Profession in International Comparative Perspective, vol. 3, 1\u201316. Springer, Dordrecht. https:\/\/doi.org\/10.1007\/978-94-007-1116-7_1","DOI":"10.1007\/978-94-007-1116-7_1"},{"key":"2025073014025537800_bib36","doi-asserted-by":"crossref","unstructured":"Stergiou,  K., & Lessenich,  S. (2014). 
On impact factors and university rankings from birth to boycott. Ethics in Science and Environmental Politics, 13(2), 101\u2013111. https:\/\/doi.org\/10.3354\/esep00141","DOI":"10.3354\/esep00141"},{"key":"2025073014025537800_bib37","doi-asserted-by":"crossref","unstructured":"Thelwall,  M.\n           (2018). Microsoft Academic automatic document searches: Accuracy for journal articles and suitability for citation analysis. Journal of Informetrics, 12(1), 1\u20139. https:\/\/doi.org\/10.1016\/j.joi.2017.11.001","DOI":"10.1016\/j.joi.2017.11.001"},{"key":"2025073014025537800_bib38","doi-asserted-by":"crossref","unstructured":"Tsay,  M.-Y., Wu,  T.-L., & Tseng,  L.-L. (2017). Completeness and overlap in open access systems: Search engines, aggregate institutional repositories and physics-related open sources. PLOS One, 12(12), e0189751. https:\/\/doi.org\/10.1371\/journal.pone.0189751","DOI":"10.1371\/journal.pone.0189751"},{"key":"2025073014025537800_bib39","doi-asserted-by":"crossref","unstructured":"Vaccario,  G., Medo,  M., Wider,  N., & Mariani,  M. S. (2017). Quantifying and suppressing ranking bias in a large citation network. Journal of Informetrics, 11(3), 766\u2013782. https:\/\/doi.org\/10.1016\/j.joi.2017.05.014","DOI":"10.1016\/j.joi.2017.05.014"},{"key":"2025073014025537800_bib40","doi-asserted-by":"crossref","unstructured":"van Wessel,  M.\n           (2016). Evaluation by citation: Trends in publication behavior, evaluation criteria, and the strive for high impact publications. Science and Engineering Ethics, 22(1), 199\u2013225. https:\/\/doi.org\/10.1007\/s11948-015-9638-0","DOI":"10.1007\/s11948-015-9638-0"},{"key":"2025073014025537800_bib41","unstructured":"Wesley-Smith,  I., Bergstrom,  C. T., & West,  J. D. (2016). Static ranking of scholarly papers using article-level eigenfactor (ALEF). arXiv:1606.08534v1. 
https:\/\/arxiv.org\/abs\/1606.08534v1"},{"key":"2025073014025537800_bib42","doi-asserted-by":"crossref","unstructured":"Yang,  K., & Meho,  L. I. (2006). Citation Analysis: A Comparison of Google Scholar, Scopus, and Web of Science. Proceedings of the Association for Information Science and Technology, 43(1), 1\u201315. https:\/\/doi.org\/10.1002\/meet.14504301185","DOI":"10.1002\/meet.14504301185"}],"container-title":["Quantitative Science Studies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/qss\/article-pdf\/1\/2\/445\/1885863\/qss_a_00031.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/qss\/article-pdf\/1\/2\/445\/1885863\/qss_a_00031.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T18:03:46Z","timestamp":1753898626000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/qss\/article\/1\/2\/445\/96150\/Comparison-of-bibliographic-data-sources"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020]]},"references-count":43,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,6,1]]}},"URL":"https:\/\/doi.org\/10.1162\/qss_a_00031","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/750075","asserted-by":"object"}]},"ISSN":["2641-3337"],"issn-type":[{"value":"2641-3337","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020]]},"published":{"date-parts":[[2020]]}}}