{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T06:15:07Z","timestamp":1777097707417,"version":"3.51.4"},"reference-count":33,"publisher":"MIT Press","issue":"1","license":[{"start":{"date-parts":[[2024,2,8]],"date-time":"2024-02-08T00:00:00Z","timestamp":1707350400000},"content-version":"vor","delay-in-days":38,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100011033","name":"Agencia Estatal de Investigaci\u00f3n","doi-asserted-by":"publisher","award":["PID2019-106510GB-I00"],"award-info":[{"award-number":["PID2019-106510GB-I00"]}],"id":[{"id":"10.13039\/501100011033","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The main objective of this study is to compare the amount of metadata and the completeness degree of research publications in new academic databases. Using a quantitative approach, we selected a random Crossref sample of more than 115,000 records, which was then searched in seven databases (Dimensions, Google Scholar, Microsoft Academic, OpenAlex, Scilit, Semantic Scholar, and The Lens). Seven characteristics were analyzed (abstract, access, bibliographic info, document type, publication date, language, and identifiers), to observe fields that describe this information, the completeness rate of these fields, and the agreement among databases. The results show that academic search engines (Google Scholar, Microsoft Academic, and Semantic Scholar) gather less information and have a low degree of completeness. Conversely, third-party databases (Dimensions, OpenAlex, Scilit, and The Lens) have more metadata quality and a higher completeness rate. We conclude that academic search engines lack the ability to retrieve reliable descriptive data by crawling the web, and the main problem of third-party databases is the loss of information derived from integrating different sources.<\/jats:p>","DOI":"10.1162\/qss_a_00286","type":"journal-article","created":{"date-parts":[[2024,2,8]],"date-time":"2024-02-08T14:59:01Z","timestamp":1707404341000},"page":"31-49","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":44,"title":["Completeness degree of publication metadata in eight free-access scholarly databases"],"prefix":"10.1162","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8738-7276","authenticated-orcid":true,"given":"Lorena","family":"Delgado-Quir\u00f3s","sequence":"first","affiliation":[{"name":"Institute for Advanced Social Studies (IESA-CSIC), C\u00f3rdoba, Spain"},{"name":"Joint Research Unit Knowledge Transfer and Innovation (UCO-CSIC), C\u00f3rdoba, Spain"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9857-1511","authenticated-orcid":true,"given":"Jos\u00e9 Luis","family":"Ortega","sequence":"additional","affiliation":[{"name":"Institute for Advanced Social Studies (IESA-CSIC), C\u00f3rdoba, Spain"},{"name":"Joint Research Unit Knowledge Transfer and Innovation (UCO-CSIC), C\u00f3rdoba, Spain"}]}],"member":"281","published-online":{"date-parts":[[2024,3,1]]},"reference":[{"issue":"3","key":"2024052115150277100_bib1","doi-asserted-by":"publisher","first-page":"e0265545","DOI":"10.1371\/journal.pone.0265545","article-title":"The effect of data sources on the measurement of open access: A comparison of Dimensions and the Web of Science","volume":"17","author":"Basson","year":"2022","journal-title":"PLOS ONE"},{"key":"2024052115150277100_bib2","article-title":"AI2 joins forces with Microsoft Research to upgrade search tools for scientific studies","author":"Boyle","year":"2018","journal-title":"GeekWire"},{"key":"2024052115150277100_bib3","first-page":"238","article-title":"The continuum of metadata quality: Defining, expressing, exploiting","volume-title":"Metadata in practice","author":"Bruce","year":"2004"},{"issue":"1","key":"2024052115150277100_bib4","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1162\/qss_a_00183","article-title":"The Microsoft Academic Knowledge Graph enhanced: Author name disambiguation, publication classification, and embeddings","volume":"3","author":"F\u00e4rber","year":"2022","journal-title":"Quantitative Science Studies"},{"issue":"4","key":"2024052115150277100_bib5","doi-asserted-by":"publisher","first-page":"933","DOI":"10.1016\/j.joi.2016.07.003","article-title":"Empirical analysis and classification of database errors in Scopus and Web of Science","volume":"10","author":"Franceschini","year":"2016","journal-title":"Journal of Informetrics"},{"key":"2024052115150277100_bib6","volume-title":"Inclusion guidelines for webmasters","author":"Google Scholar","year":"2023"},{"key":"2024052115150277100_bib7","doi-asserted-by":"publisher","first-page":"593494","DOI":"10.3389\/frma.2020.593494","article-title":"Comparative analysis of the bibliographic data sources Dimensions and Scopus: An approach at the country and institutional levels","volume":"5","author":"Guerrero-Bote","year":"2021","journal-title":"Frontiers in Research Metrics and Analytics"},{"issue":"1","key":"2024052115150277100_bib8","doi-asserted-by":"publisher","first-page":"177","DOI":"10.1007\/s11192-018-2958-5","article-title":"Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases","volume":"118","author":"Gusenbauer","year":"2019","journal-title":"Scientometrics"},{"issue":"1","key":"2024052115150277100_bib9","doi-asserted-by":"publisher","first-page":"414","DOI":"10.1162\/qss_a_00022","article-title":"Crossref: The sustainable source of community-owned scholarly metadata","volume":"1","author":"Hendricks","year":"2020","journal-title":"Quantitative Science Studies"},{"issue":"9\/10","key":"2024052115150277100_bib10","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1045\/september2016-herrmannova","article-title":"An analysis of the Microsoft Academic Graph","volume":"22","author":"Herrmannova","year":"2016","journal-title":"D-Lib Magazine"},{"issue":"1","key":"2024052115150277100_bib11","doi-asserted-by":"publisher","first-page":"387","DOI":"10.1162\/qss_a_00020","article-title":"Dimensions: Bringing down barriers between scientometricians and data","volume":"1","author":"Herzog","year":"2020","journal-title":"Quantitative Science Studies"},{"key":"2024052115150277100_bib12","doi-asserted-by":"publisher","first-page":"23","DOI":"10.3389\/frma.2018.00023","article-title":"Dimensions: Building context for search and evaluation","volume":"3","author":"Hook","year":"2018","journal-title":"Frontiers in Research Metrics and Analytics"},{"issue":"3","key":"2024052115150277100_bib13","doi-asserted-by":"publisher","first-page":"1551","DOI":"10.1007\/s11192-017-2535-3","article-title":"The coverage of Microsoft Academic: Analyzing the publication output of a university","volume":"113","author":"Hug","year":"2017","journal-title":"Scientometrics"},{"key":"2024052115150277100_bib14","volume-title":"The Lens MetaRecord and LensID: An open identifier system for aggregated metadata and versioning of knowledge artefacts","author":"Jefferson","year":"2019"},{"issue":"3","key":"2024052115150277100_bib15","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1162\/qss_a_00210","article-title":"The availability and completeness of open funder metadata: Case study for publications funded by the Dutch Research Council","volume":"3","author":"Kramer","year":"2022","journal-title":"Quantitative Science Studies"},{"issue":"1","key":"2024052115150277100_bib16","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1016\/j.giq.2017.11.003","article-title":"Comparison of metadata quality in open data portals using the Analytic Hierarchy Process","volume":"35","author":"Kubler","year":"2018","journal-title":"Government Information Quarterly"},{"issue":"3","key":"2024052115150277100_bib17","doi-asserted-by":"publisher","first-page":"985","DOI":"10.1016\/j.joi.2018.07.008","article-title":"Missing author address information in Web of Science\u2014An explorative study","volume":"12","author":"Liu","year":"2018","journal-title":"Journal of Informetrics"},{"key":"2024052115150277100_bib18","volume-title":"Comparison of metadata quality in CrossRef, Lens, OpenAlex, Scopus, Semantic Scholar, Web of Science Core Collection databases","author":"Lutai","year":"2022"},{"issue":"3","key":"2024052115150277100_bib19","doi-asserted-by":"publisher","first-page":"2175","DOI":"10.1007\/s11192-018-2820-9","article-title":"Coverage of highly-cited documents in Google Scholar, Web of Science, and Scopus: A multidisciplinary comparison","volume":"116","author":"Mart\u00edn-Mart\u00edn","year":"2018","journal-title":"Scientometrics"},{"issue":"1","key":"2024052115150277100_bib20","doi-asserted-by":"publisher","first-page":"871","DOI":"10.1007\/s11192-020-03690-4","article-title":"Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations\u2019 COCI: a multidisciplinary comparison of coverage via citations","volume":"126","author":"Mart\u00edn-Mart\u00edn","year":"2021","journal-title":"Scientometrics"},{"key":"2024052115150277100_bib21","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1007\/s00799-009-0054-4","article-title":"Automatic evaluation of metadata quality in digital repositories","volume":"10","author":"Ochoa","year":"2009","journal-title":"International Journal on Digital Libraries"},{"key":"2024052115150277100_bib22","article-title":"When is a paper published?","author":"Ortega","year":"2022","journal-title":"The Research Whisperer"},{"key":"2024052115150277100_bib23","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2205.01833","article-title":"OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts","author":"Priem","year":"2022","journal-title":"arXiv"},{"issue":"1","key":"2024052115150277100_bib24","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1162\/qss_a_00175","article-title":"The prevalence and impact of university affiliation discrepancies between four bibliographic databases\u2014Scopus, Web of Science, Dimensions, and Microsoft Academic","volume":"3","author":"Purnell","year":"2022","journal-title":"Quantitative Science Studies"},{"key":"2024052115150277100_bib25","first-page":"1065","article-title":"Accuracy of affiliation information in Microsoft Academic: Implications for institutional level research evaluation","volume-title":"STI 2018 Conference Proceedings","author":"Ranjbar-Sahraei","year":"2018"},{"issue":"2","key":"2024052115150277100_bib26","doi-asserted-by":"publisher","DOI":"10.3145\/epi.2023.mar.09","article-title":"Which of the metadata with relevance for bibliometrics are the same and which are different when switching from Microsoft Academic Graph to OpenAlex?","volume":"32","author":"Scheidsteger","year":"2023","journal-title":"Profesional de la informaci\u00f3n"},{"issue":"6","key":"2024052115150277100_bib27","doi-asserted-by":"publisher","first-page":"1194","DOI":"10.1016\/j.ipm.2013.05.003","article-title":"Dealing with metadata quality: The legacy of digital library efforts","volume":"49","author":"Tani","year":"2013","journal-title":"Information Processing & Management"},{"issue":"3","key":"2024052115150277100_bib28","doi-asserted-by":"publisher","first-page":"570","DOI":"10.1016\/j.joi.2015.05.002","article-title":"A systematic analysis of duplicate records in Scopus","volume":"9","author":"Valderrama-Zuri\u00e1n","year":"2015","journal-title":"Journal of Informetrics"},{"key":"2024052115150277100_bib29","article-title":"Crossref as a new source of citation data: A comparison with Web of Science and Scopus","author":"van Eck","year":"2018","journal-title":"CWTS Blog"},{"issue":"1","key":"2024052115150277100_bib30","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1162\/qss_a_00112","article-title":"Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic","volume":"2","author":"Visser","year":"2021","journal-title":"Quantitative Science Studies"},{"key":"2024052115150277100_bib31","doi-asserted-by":"publisher","first-page":"739","DOI":"10.1145\/3487553.3527147","article-title":"The Semantic Scholar Academic Graph (S2AG)","volume-title":"Companion Proceedings of the Web Conference 2022","author":"Wade","year":"2022"},{"key":"2024052115150277100_bib32","article-title":"Open abstracts: Where are we?","author":"Waltman","year":"2020","journal-title":"Crossref Blog"},{"issue":"1","key":"2024052115150277100_bib33","doi-asserted-by":"publisher","first-page":"396","DOI":"10.1162\/qss_a_00021","article-title":"Microsoft Academic Graph: When experts are not enough","volume":"1","author":"Wang","year":"2020","journal-title":"Quantitative Science Studies"}],"container-title":["Quantitative Science Studies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/qss\/article-pdf\/5\/1\/31\/2373940\/qss_a_00286.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/qss\/article-pdf\/5\/1\/31\/2373940\/qss_a_00286.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,21]],"date-time":"2024-05-21T11:17:01Z","timestamp":1716290221000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/qss\/article\/5\/1\/31\/119466\/Completeness-degree-of-publication-metadata-in"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,3,1]]}},"URL":"https:\/\/doi.org\/10.1162\/qss_a_00286","relation":{"has-review":[{"id-type":"doi","id":"10.1162\/QSS_A_00286\/v2\/response1","asserted-by":"object"},{"id-type":"doi","id":"10.1162\/QSS_A_00286\/v1\/review1","asserted-by":"object"},{"id-type":"doi","id":"10.1162\/QSS_A_00286\/v1\/review2","asserted-by":"object"},{"id-type":"doi","id":"10.1162\/QSS_A_00286\/v2\/decision1","asserted-by":"object"},{"id-type":"doi","id":"10.1162\/QSS_A_00286\/v2\/review1","asserted-by":"object"},{"id-type":"doi","id":"10.1162\/QSS_A_00286\/v3\/decision1","asserted-by":"object"},{"id-type":"doi","id":"10.1162\/QSS_A_00286\/v1\/decision1","asserted-by":"object"},{"id-type":"doi","id":"10.1162\/QSS_A_00286\/v3\/response1","asserted-by":"object"}]},"ISSN":["2641-3337"],"issn-type":[{"value":"2641-3337","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024]]},"published":{"date-parts":[[2024]]}}}