{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T00:03:54Z","timestamp":1769731434439,"version":"3.49.0"},"reference-count":18,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2017,10,24]],"date-time":"2017-10-24T00:00:00Z","timestamp":1508803200000},"content-version":"vor","delay-in-days":296,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":["asistdl.onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Proc. Assoc. Info. Sci. Tech."],"published-print":{"date-parts":[[2017,1]]},"abstract":"<jats:title>ABSTRACT<\/jats:title><jats:p>Large cyberinfrastructure\u2010enabled data repositories generate massive amounts of metadata, enabling big data analytics to leverage on the intersection of technological and methodological advances in data science for the quantitative study of science. This paper introduces a definition of big metadata in the context of scientific data repositories and discusses the challenges in big metadata analytics due to the messiness, lack of structures suitable for analytics and heterogeneity in such big metadata. A methodological framework is proposed, which contains conceptual and computational workflows intercepting through collaborative documentation. The workflow\u2010based methodological framework promotes transparency and contributes to research reproducibility. The paper also describes the experience and lessons learned from a four\u2010year big metadata project involving all aspects of the workflow\u2010based methodologies. The methodological framework presented in this paper is a timely contribution to the field of scientometrics and the science of science and policy as the potential value of big metadata is drawing more attention from research and policy maker communities.<\/jats:p>","DOI":"10.1002\/pra2.2017.14505401005","type":"journal-article","created":{"date-parts":[[2017,10,24]],"date-time":"2017-10-24T03:35:36Z","timestamp":1508816136000},"page":"36-45","update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Big data, big metadata and quantitative study of science: A workflow model for big scientometrics"],"prefix":"10.1002","volume":"54","author":[{"given":"Sarah","family":"Bratt","sequence":"first","affiliation":[{"name":"Syracuse University  USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jeff","family":"Hemsley","sequence":"additional","affiliation":[{"name":"Syracuse University  USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jian","family":"Qin","sequence":"additional","affiliation":[{"name":"School of Information Studies Syracuse University  USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mark","family":"Costa","sequence":"additional","affiliation":[{"name":"Newhouse School of Public Communications Syracuse University  USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","published-online":{"date-parts":[[2017,10,24]]},"reference":[{"key":"e_1_2_6_2_1","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkw1070"},{"key":"e_1_2_6_3_1","unstructured":"BPM Center of Excellence.(2009 October 26).Business Process Management Center of Excellence Glossary [Government]. Retrieved fromhttps:\/\/www.ftb.ca.gov\/aboutFTB\/Projects\/ITSP\/BPM_Glossary.pdf"},{"key":"e_1_2_6_4_1","volume-title":"The interdependence of scientists in the era of team science: An exploratory study using temporal network analysis","author":"Costa M. R.","year":"2016"},{"key":"e_1_2_6_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2008.06.012"},{"key":"e_1_2_6_6_1","volume-title":"Big science: The growth of large\u2010scale research","author":"Galison P.","year":"1992"},{"key":"e_1_2_6_7_1","doi-asserted-by":"publisher","DOI":"10.1080\/19386380903405090"},{"key":"e_1_2_6_8_1","first-page":"xvii","volume-title":"The fourth paradigm: Data intensive scientific discovery","author":"Hey tony","year":"2009"},{"key":"e_1_2_6_9_1","unstructured":"Interactive Crowdsourced Literature Key\u2010Value Annotation Library \u2010 iCLiKVAL. (n.d.). Retrieved April 10 2017 fromhttp:\/\/iclikval.riken.jp\/"},{"key":"e_1_2_6_10_1","unstructured":"Lai ronald D'Amour A. Yu A. &Fleming L.(2011).Disambiguation and Co\u2010authorship Networks of the U.S. Patent Inventor Database (1975 \u2010 2010) [Dataset]. Retrieved fromhttps:\/\/dataverse.harvard.edu\/dataset.xhtml?persistentId=hdl:1902.1\/15705"},{"key":"e_1_2_6_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.respol.2014.01.012"},{"key":"e_1_2_6_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11192-014-1238-2"},{"key":"e_1_2_6_13_1","unstructured":"NLM.(2015).Congressional Justification FY 2015 [Document]. Retrieved fromhttps:\/\/www.nlm.nih.gov\/about\/2015CJ.html"},{"key":"e_1_2_6_14_1","doi-asserted-by":"crossref","unstructured":"Sinha A. Shen Z. Song Y. Ma H. Eide D. Hsu B. (Paul) &Wang K.(2015).An overview of Microsoft Academic Service (MAS) and applications. InProceedings of the 24th international conference on world wide web. Florence Italy: ACM.https:\/\/doi.org\/10.1145\/2740908.2742839","DOI":"10.1145\/2740908.2742839"},{"key":"e_1_2_6_15_1","doi-asserted-by":"crossref","unstructured":"Smith K. Seligman L. Rosenthal A. Kurcz C. Greer M. Macheret C. \u2026Eckstein A.(2014).\u201cBig metadata\u201d: The need for principled metadata management in big data ecosystems. InProceedings of workshop on data analytics in the cloud(pp. 13:1\u20134). Snowbird UT USA: ACM.https:\/\/doi.org\/10.1145\/2627770.2627776","DOI":"10.1145\/2627770.2627776"},{"key":"e_1_2_6_16_1","volume-title":"Introduction to Data Science","author":"Stanton J.","year":"2013"},{"key":"e_1_2_6_17_1","volume-title":"Exploratory Data Analysis","author":"Tukey J. W.","year":"1977"},{"key":"e_1_2_6_18_1","volume-title":"Science of Science (Sci2) tool user manual","author":"Weingart S.","year":"2010"},{"key":"e_1_2_6_19_1","doi-asserted-by":"publisher","DOI":"10.1089\/omi.2013.0170"}],"container-title":["Proceedings of the Association for Information Science and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fpra2.2017.14505401005","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/pra2.2017.14505401005","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/pra2.2017.14505401005","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/asistdl.onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/pra2.2017.14505401005","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,21]],"date-time":"2025-10-21T16:25:02Z","timestamp":1761063902000},"score":1,"resource":{"primary":{"URL":"https:\/\/asistdl.onlinelibrary.wiley.com\/doi\/10.1002\/pra2.2017.14505401005"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,1]]},"references-count":18,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2017,1]]}},"alternative-id":["10.1002\/pra2.2017.14505401005"],"URL":"https:\/\/doi.org\/10.1002\/pra2.2017.14505401005","archive":["Portico"],"relation":{},"ISSN":["2373-9231","2373-9231"],"issn-type":[{"value":"2373-9231","type":"print"},{"value":"2373-9231","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,1]]},"assertion":[{"value":"2017-10-24","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}