{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T00:27:35Z","timestamp":1777854455867,"version":"3.51.4"},"reference-count":49,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2011,11,4]],"date-time":"2011-11-04T00:00:00Z","timestamp":1320364800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Information Science"],"published-print":{"date-parts":[[2011,12]]},"abstract":"<jats:p>This paper provides a discussion and analysis of methodological issues encountered during a scholarly impact and bibliometric study within the field of Computer Science (TRECVid Text Retrieval and Evaluation Conference, Video Retrieval Evaluation). The purpose of this paper is to provide a reflection and analysis of the methods used to provide useful information and guidance for those who may wish to undertake similar studies, and is of particular relevance for the academic disciplines which have publication and citation norms that may not perform well using traditional tools. Scopus and Google Scholar are discussed and a detailed comparison of the effects of different search methods and cleaning methods within and between these tools for subject and author analysis is provided. The additional database capabilities and usefulness of \u2018Scopus More\u2019 in addition to \u2018Scopus General\u2019 are discussed and evaluated. Scopus paper coverage is found to favourably compare with Google Scholar but Scholar consistently has superior performance at finding citations to those papers. These additional citations significantly increase the citation totals and also change the relative ranking of papers. Publish or Perish, a software wrapper for Google Scholar, is also examined and its limitations and some possible solutions are described. Data cleaning methods, including duplicate checks, expert domain checking of bibliographic data, and content checking of retrieved papers, are compared and their relative effects on paper and citation count discussed. Google Scholar and Scopus are also compared as tools for collecting bibliographic data for visualizations of developing trends and, owing to the comparative ease of collecting abstracts, Scopus is found far more effective.<\/jats:p>","DOI":"10.1177\/0165551511420032","type":"journal-article","created":{"date-parts":[[2011,11,4]],"date-time":"2011-11-04T21:24:40Z","timestamp":1320441880000},"page":"577-593","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":5,"title":["A bibliometric study of Video Retrieval Evaluation Benchmarking (TRECVid): A methodological analysis"],"prefix":"10.1177","volume":"37","author":[{"given":"Clare V.","family":"Thornley","sequence":"first","affiliation":[{"name":"Department of Information Studies, University College London, UK"}]},{"given":"Shane J.","family":"McLoughlin","sequence":"additional","affiliation":[{"name":"School of Information and Library Studies, University College Dublin, Ireland"}]},{"given":"Andrea C.","family":"Johnson","sequence":"additional","affiliation":[{"name":"School of Information and Library Studies, University College Dublin, Ireland"}]},{"given":"Alan F.","family":"Smeaton","sequence":"additional","affiliation":[{"name":"CLARITY: Centre for Sensor Web Technologies, School of Computing, Dublin City University, Ireland"}]}],"member":"179","published-online":{"date-parts":[[2011,11,4]]},"reference":[{"key":"bibr1-0165551511420032","doi-asserted-by":"crossref","unstructured":"Thornley CV, Johnson AC, Smeaton AF, Lee H. The scholarly impact of TRECVid (2003\u20132009). Journal of the American Society for Information Science and Technology 2011; 62(4): 613\u2013627. Available from http:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/asi.21494","DOI":"10.1002\/asi.21494"},{"key":"bibr2-0165551511420032","doi-asserted-by":"crossref","unstructured":"Harzing AW, van der Wal R. A Google Scholar h-index for journals: an alternative metric to measure journal impact in economics and business [Internet]. Journal of the American Society for Information Science and Technology 2009; 60(1): 41\u201346. Available from: http:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/asi.20953\/full","DOI":"10.1002\/asi.20953"},{"key":"bibr3-0165551511420032","doi-asserted-by":"crossref","unstructured":"Franceschet M. The role of conference publications in CS [Internet]. Communications of the ACM 2010; 53(12): 129. Available from: http:\/\/portal.acm.org\/citation.cfm?doid=1859204.1859234","DOI":"10.1145\/1859204.1859234"},{"key":"bibr4-0165551511420032","unstructured":"Moed HF, Visser MS. Developing bibliometric indicators of research performance in computer science: An exploratory study [Internet]. Leiden: CWTS, 2007. Available from: http:\/\/ict.nwo.nl\/files.nsf\/pages\/NWOA_78NJ63\/$file\/CWTS_Computer_Science_Study.pdf"},{"key":"bibr5-0165551511420032","doi-asserted-by":"crossref","unstructured":"Freyne J, Coyle L, Smyth B, Cunningham P. Relative status of journal and conference publications in computer science [Internet]. Communications of the ACM 2010; 53(11): 124. Available from: http:\/\/doi.acm.org\/10.1145\/1839676.1839701","DOI":"10.1145\/1839676.1839701"},{"key":"bibr6-0165551511420032","first-page":"182","volume":"23","author":"Garfield E","year":"1986","journal-title":"Current Contents"},{"key":"bibr7-0165551511420032","volume-title":"Citation analysis in research evaluation","author":"Moed H.F","year":"2005"},{"key":"bibr8-0165551511420032","volume-title":"Bibliometrics and citation analysis: from the science citation index to cybermetrics","author":"De Bellis N","year":"2009"},{"key":"bibr9-0165551511420032","doi-asserted-by":"crossref","unstructured":"Bar-Ilan J. Web of Science with the Conference Proceedings Citation Indexes: the case of computer science [Internet]. Scientometrics 2010; 83(3): 809\u2013824. Available from: http:\/\/www.springerlink.com\/index\/10.1007\/s11192-009-0145-4","DOI":"10.1007\/s11192-009-0145-4"},{"key":"bibr10-0165551511420032","doi-asserted-by":"crossref","unstructured":"Ding Y, Cronin B. Popular and\/or prestigious? Measures of scholarly esteem [Internet]. Information Processing & Management 2011; 47(1): 80\u201396. Available from: http:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S0306457310000087","DOI":"10.1016\/j.ipm.2010.01.002"},{"key":"bibr11-0165551511420032","doi-asserted-by":"crossref","unstructured":"Franceschet M. The difference between popularity and prestige in the sciences and in the social sciences: a bibliometric analysis [Internet]. Journal of Informetrics 2010; 4(1): 55\u201363. Available from: http:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S1751157709000698","DOI":"10.1016\/j.joi.2009.08.001"},{"key":"bibr12-0165551511420032","unstructured":"Li J, Sanderson M, Willett P, Norris M. Ranking of library and information science researchers: comparison of data sources for correlating citation data, and expert judgments [Internet]. Journal of Informetrics 2010; 4554\u20134563. Available from: http:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S1751157710000593"},{"key":"bibr13-0165551511420032","doi-asserted-by":"crossref","unstructured":"Gargouri Y, Hajjem C, Larivi\u00e8re V, Gingras Y, Carr L, Brody T, Self-selected or mandated, open access increases citation impact for higher quality research [Internet]. PLoS ONE 2010; 5(10): e13636. Available from: http:\/\/dx.plos.org\/10.1371\/journal.pone.0013636","DOI":"10.1371\/journal.pone.0013636"},{"key":"bibr14-0165551511420032","doi-asserted-by":"crossref","unstructured":"Henderson M, Shurville S, Fernstrom K. The quantitative crunch: the impact of bibliometric research quality assessment exercises on academic development at small conferences [Internet]. Campus-Wide Information Systems 2009; 26(3): 149\u2013167. Available from: http:\/\/www.emeraldinsight.com\/10.1108\/10650740910967348","DOI":"10.1108\/10650740910967348"},{"key":"bibr15-0165551511420032","unstructured":"CIBER. Evaluating the usage and impact of e-journals in the UK: Bibliometric indicators for case study institutions. Available from: http:\/\/www.ucl.ac.uk\/infostudies\/research\/ciber\/value\/Bibliometric-indicators.pdf 2008; (November): 1\u201318."},{"key":"bibr16-0165551511420032","doi-asserted-by":"crossref","unstructured":"Raan AFJ. Performance-related differences of bibliometric statistical properties of research groups: cumulative advantages and hierarchically layered networks [Internet]. Journal of the American Society for Information Science and Technology 2006; 57(14): 1919\u20131935. Available from: http:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/asi.20389\/full","DOI":"10.1002\/asi.20389"},{"key":"bibr17-0165551511420032","unstructured":"Harzing A. Citation analysis across disciplines: the impact of different data sources and citation metrics [Internet] 2010. Available from: http:\/\/www.harzing.com\/data_metrics_comparison.htm"},{"key":"bibr18-0165551511420032","unstructured":"Royal Irish Academy (RIA). The appropriateness of key performance indicators to research in arts and humanities disciplines: Ireland\u2019s contribution to the European debate. [Internet]. Dublin: Royal Irish Academy, 2011 Available from: http:\/\/www.ria.ie\/getmedia\/2d1c1172-fc9d-4492-aa3b-97581f10c035\/Key-Performance-Indicators-2011-Full-PDF.pdf.aspx."},{"key":"bibr19-0165551511420032","unstructured":"Harzing A. Publish or perish, 2010: version 3.0, available from www.harzing.com\/pop.htm"},{"key":"bibr20-0165551511420032","doi-asserted-by":"crossref","unstructured":"Meho LI, Rogers Y. Citation counting, citation ranking, and\n                      h\n                      -index of human-computer interaction researchers: a comparison of Scopus and Web of Science [Internet]. Journal of the American Society for\u2013Information Science and Technology 2008; 59(11): 1711\u20131726. Available from: http:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/asi.20874\/full","DOI":"10.1002\/asi.20874"},{"key":"bibr21-0165551511420032","doi-asserted-by":"crossref","unstructured":"Schroeder R. Pointing users toward citation searching: using Google Scholar and Web of Science [Internet]. Libraries and the Academy 2007; 7(2): 243\u2013248. Available from: http:\/\/muse.jhu.edu\/content\/crossref\/journals\/portal_libraries_and_the_academy\/v007\/7.2schroeder.html","DOI":"10.1353\/pla.2007.0022"},{"key":"bibr22-0165551511420032","doi-asserted-by":"crossref","unstructured":"Jacs\u00f3 P. Google Scholar revisited [Internet]. Online Information Review 2008; 32(1): 102\u2013114. Available from: http:\/\/www.emeraldinsight.com\/journals.htm?articleid=1711361&show=abstract","DOI":"10.1108\/14684520810866010"},{"key":"bibr23-0165551511420032","doi-asserted-by":"crossref","unstructured":"Levine-Clark M, Gil E. A comparative analysis of social sciences citation tools [Internet]. Online Information Review 2009; 33(5): 986\u2013996. Available from: http:\/\/www.emeraldinsight.com\/10.1108\/14684520911001954","DOI":"10.1108\/14684520911001954"},{"key":"bibr24-0165551511420032","doi-asserted-by":"crossref","unstructured":"Meho LI, Yang K. Impact of data sources on citation counts and rankings of LIS faculty: Web of Science versus Scopus and Google Scholar [Internet]. Journal of the American Society for Information Science and Technology 2007; 58(13): 2105\u20132125.Available from: http:\/\/doi.wiley.com\/10.1002\/asi.20677","DOI":"10.1002\/asi.20677"},{"key":"bibr25-0165551511420032","unstructured":"Emerald. Academic search engines Part 3 [Internet], 2009. Available from: http:\/\/www.emeraldinsight.com\/librarians\/info\/viewpoints\/search_engines.htm?part=3&"},{"key":"bibr26-0165551511420032","doi-asserted-by":"crossref","unstructured":"Hirsch JE. An index to quantify an individuals scientific research output. [Internet]. Proceedings of the National Academy of Sciences of the United States of America 2005; 102(46): 16569\u201316572.Available from: http:\/\/www.pubmedcentral.nih.gov\/articlerender.fcgi?artid=1283832&tool=pmcentrez&rendertype=abstract","DOI":"10.1073\/pnas.0507655102"},{"key":"bibr27-0165551511420032","doi-asserted-by":"crossref","unstructured":"Brody T, Harnad S. Earlier web usage statistics as predictors of later citation impact [Internet]. Journal of the American Society for Information Science and Technology 2006; 440(8): 1060\u20131072. Available from: http:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/asi.20373\/full","DOI":"10.1002\/asi.20373"},{"key":"bibr28-0165551511420032","unstructured":"Swan A. The open access citation advantage [Internet], 2010. Available from: http:\/\/eprints.ecs.soton.ac.uk\/18516"},{"key":"bibr29-0165551511420032","doi-asserted-by":"crossref","unstructured":"Franceschet M. The skewness of computer science [Internet]. Information Processing & Management 2011; 47(1): 117\u2013124. Available from: http:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S0306457310000233","DOI":"10.1016\/j.ipm.2010.03.003"},{"key":"bibr30-0165551511420032","doi-asserted-by":"crossref","unstructured":"Wainer J, Przibisczki de Oliveira H, Anido R. Patterns of bibliographic references in the ACM published papers [Internet]. Information Processing & Management 2011; 47(1): 135\u2013142. Available from: http:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S0306457310000610","DOI":"10.1016\/j.ipm.2010.07.002"},{"key":"bibr31-0165551511420032","doi-asserted-by":"crossref","unstructured":"Jacso P. Deflated, inflated and phantom citation counts [Internet]. Online Information Review 2006; 30(3): 297\u2013309. Available from: http:\/\/www.emeraldinsight.com\/10.1108\/14684520610675816","DOI":"10.1108\/14684520610675816"},{"key":"bibr32-0165551511420032","doi-asserted-by":"crossref","unstructured":"Bar-Ilan J. Which\n                      h\n                      -index? A comparison of WoS, Scopus and Google Scholar [Internet]. Scientometrics 2007; 74(2): 257\u2013271. Available from: http:\/\/www.springerlink.com\/index\/10.1007\/s11192-008-0216-y","DOI":"10.1007\/s11192-008-0216-y"},{"key":"bibr33-0165551511420032","doi-asserted-by":"crossref","unstructured":"Yuan J, Wang H, Xiao L, Zheng W, Li J, Lin F, A formal study of shot boundary detection [Internet]. IEEE Transactions on Circuits and Systems for Video Technology 2007; 17(2): 168\u2013186. Available from: http:\/\/ieeexplore.ieee.org\/lpdocs\/epic03\/wrapper.htm?arnumber=4079667","DOI":"10.1109\/TCSVT.2006.888023"},{"key":"bibr34-0165551511420032","doi-asserted-by":"crossref","unstructured":"Hui SC, Fong a CM. Document retrieval from a citation database using conceptual clustering and co-word analysis [Internet]. Online Information Review 2004; 28(1): 22\u201332. Available from: http:\/\/www.emeraldinsight.com\/10.1108\/14684520410522420","DOI":"10.1108\/14684520410522420"},{"key":"bibr35-0165551511420032","doi-asserted-by":"crossref","unstructured":"Sugimoto CR, McCain KW. Visualizing changes over time: a history of information retrieval through the lens of descriptor tri-occurrence mapping [Internet]. Journal of Information Science 2010; 36(4): 481\u2013493. Available from: http:\/\/jis.sagepub.com\/cgi\/doi\/10.1177\/0165551510369992","DOI":"10.1177\/0165551510369992"},{"key":"bibr36-0165551511420032","unstructured":"Taporware [hompage on the Internet]. 2011. Available from: http:\/\/taporware.mcmaster.ca\/"},{"key":"bibr37-0165551511420032","unstructured":"The R Project for Statistical Computing [homepage on the Internet]. 2011. Available from: http:\/\/www.r-project.org\/"},{"key":"bibr38-0165551511420032","doi-asserted-by":"crossref","unstructured":"Franceschet M. A comparison of bibliometric indicators for computer science scholars and journals on Web of Science and Google Scholar [Internet]. Scientometrics 2010; 83(1): 243\u2013258. Available from: http:\/\/www.akademiai.com\/index\/t5444739wt4pv550.pdf","DOI":"10.1007\/s11192-009-0021-2"},{"key":"bibr39-0165551511420032","doi-asserted-by":"crossref","unstructured":"Thelwall M. Bibliometrics to webometrics [Internet]. Journal of Information Science 2008; 34(4): 605\u2013621. Available from: http:\/\/jis.sagepub.com\/cgi\/doi\/10.1177\/0165551507087238","DOI":"10.1177\/0165551507087238"},{"key":"bibr40-0165551511420032","doi-asserted-by":"crossref","unstructured":"Bornmann L, Mutz R, Neuhaus C, Daniel H. Citation counts for research evaluation: standards of good practice for analyzing bibliometric data and presenting and interpreting results [Internet]. Ethics in Science and Environmental Politics 2008; 8: 893\u2013102. Available from: http:\/\/www.int-res.com\/abstracts\/esep\/v8\/n1\/p93-102\/","DOI":"10.3354\/esep00084"},{"key":"bibr41-0165551511420032","doi-asserted-by":"crossref","unstructured":"Neuhaus C, Daniel H-D. Data sources for performing citation analysis: an overview [Internet]. Journal of Documentation 2008; 64(2): 193\u2013210. Available from: http:\/\/www.emeraldinsight.com\/10.1108\/00220410810858010","DOI":"10.1108\/00220410810858010"},{"key":"bibr42-0165551511420032","doi-asserted-by":"crossref","unstructured":"Cronin B. Bibliometrics and beyond: some thoughts on web-based citation analysis [Internet]. Journal of Information Science 2001; 27(1): 1\u20137. Available from: http:\/\/jis.sagepub.com\/cgi\/doi\/10.1177\/016555150102700101","DOI":"10.1177\/016555150102700101"},{"key":"bibr43-0165551511420032","unstructured":"Open Researcher and Contributor ID [homepage on the Internet]. 2011. Available from: http:\/\/www.orcid.org\/"},{"key":"bibr44-0165551511420032","unstructured":"\u2018Datacite\u2019 [homepage on the Internet]. 2011. Available from: http:\/\/www.datacite.org\/"},{"key":"bibr45-0165551511420032","doi-asserted-by":"crossref","unstructured":"Kousha K, Thelwall M, Rezaie S. Using the web for research evaluation: The Integrated Online Impact indicator [Internet]. Journal of Informetrics 2010; 4(1): 124\u2013135. Available from: http:\/\/linkinghub.elsevier.com\/retrieve\/pii\/S1751157709000777","DOI":"10.1016\/j.joi.2009.10.003"},{"key":"bibr46-0165551511420032","unstructured":"Google Refine [homepage on the Internet]. 2011. Available from: http:\/\/code.google.com\/p\/google-refine\/"},{"key":"bibr47-0165551511420032","unstructured":"Data Wrangler [homepage on the Internet]. 2011. Available from: http:\/\/vis.stanford.edu\/wrangler\/"},{"key":"bibr48-0165551511420032","doi-asserted-by":"crossref","unstructured":"P\u00f5der E. Let\u2019s correct that small mistake [Internet]. Journal of the American Society for Information Science 2010; 61(12): 2593\u20132594. Available from: http:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/asi.21438\/abstract","DOI":"10.1002\/asi.21438"},{"key":"bibr49-0165551511420032","doi-asserted-by":"crossref","unstructured":"Lane J. Let\u2019s make science metrics more scientific. [Internet]. Nature 2010; 464(7288): 488\u2013489. Available from: http:\/\/www.ncbi.nlm.nih.gov\/pubmed\/20336116","DOI":"10.1038\/464488a"}],"updated-by":[{"DOI":"10.1177\/0165551512443757","type":"correction","label":"Correction","source":"publisher","updated":{"date-parts":[[2012,4,11]],"date-time":"2012-04-11T00:00:00Z","timestamp":1334102400000}}],"container-title":["Journal of Information Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551511420032","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0165551511420032","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:08:10Z","timestamp":1777504090000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0165551511420032"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,11,4]]},"references-count":49,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2011,12]]}},"alternative-id":["10.1177\/0165551511420032"],"URL":"https:\/\/doi.org\/10.1177\/0165551511420032","relation":{},"ISSN":["0165-5515","1741-6485"],"issn-type":[{"value":"0165-5515","type":"print"},{"value":"1741-6485","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,11,4]]}}}