{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T15:08:31Z","timestamp":1770563311368,"version":"3.49.0"},"reference-count":22,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2018,2,22]],"date-time":"2018-02-22T00:00:00Z","timestamp":1519257600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGIR Forum"],"published-print":{"date-parts":[[2018,2,22]]},"abstract":"<jats:p>This paper points out some mistakes that can be frequently found in IR publications: MRR and ERR violate basic requirements for a metric, MAP is based on unrealistic assumptions, the numbers shown overstate the precision of the result, relative improvements of arithmetic means are inappropriate, the simple holdout method yields unreliable results, hypotheses are often formulated after the experiment, significance tests frequently ignore the multiple comparisons problem, effect sizes are ignored, reproducibility of the experiments might be nearly impossible, and sometimes authors claim proof by experimentation.<\/jats:p>","DOI":"10.1145\/3190580.3190586","type":"journal-article","created":{"date-parts":[[2018,2,23]],"date-time":"2018-02-23T16:40:01Z","timestamp":1519404001000},"page":"32-41","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":95,"title":["Some Common Mistakes In IR Evaluation, And How They Can Be Avoided"],"prefix":"10.1145","volume":"51","author":[{"given":"Norbert","family":"Fuhr","sequence":"first","affiliation":[{"name":"University of Duisburg-Essen, Germany"}]}],"member":"320","published-online":{"date-parts":[[2018,2,22]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"303","article-title":"Multiple 
hypothesis testing: A review","volume":"68","author":"Austin Stefanie R","year":"2014","unstructured":"Stefanie R Austin, Isaac Dialsingh, and Naomi Altman. Multiple hypothesis testing: A review. J. Indian Soc. Of Agricultural Stat, 68:303--314, 2014.","journal-title":"J. Indian Soc. Of Agricultural Stat"},{"key":"e_1_2_1_2_1","first-page":"9","volume-title":"Braschler. CLEF 2001 - Overview of Results. In Evaluation of Cross-Language Information Retrieval Systems, Second Workshop of the Cross-Language Evaluation Forum, CLEF 2001","author":"Martin","year":"2001","unstructured":"Martin Braschler. CLEF 2001 - Overview of Results. In Evaluation of Cross-Language Information Retrieval Systems, Second Workshop of the Cross-Language Evaluation Forum, CLEF 2001, Darmstadt, Germany, September 3-4, 2001, Revised Papers. pages 9--26."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766462.2767812"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2094072.2094076"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1645953.1646033"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1037\/0003-066X.49.12.997"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1214\/ss\/1032280214"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2964797.2964808"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1008992.1009079"},{"key":"e_1_2_1_10_1","volume-title":"Understanding pubmed user search behavior through log analysis","author":"Dogan R. 
Islamaj","year":"2009","unstructured":"R. Islamaj Dogan, G. C. Murray, A. Neveol, and Z. Lu. Understanding pubmed user search behavior through log analysis. Database: The Journal of Biological Databases and Curation, 2009."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/582415.582418"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/1394399"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1416950.1416952"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-16354-3_82"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1183614.1183630"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390334.1390453"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-54798-0_6"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1835449.1835542"},{"key":"e_1_2_1_19_1","first-page":"227","volume-title":"Proceedings of The Sixth Text REtrieval Conference, TREC 1997","author":"Singhal Amit","year":"1997","unstructured":"Amit Singhal, John Choi, Donald Hindle, and Fernando C. N. Pereira. At&t at TREC-6: SDR track. In D. Harman and E. M. Voorhees, editors, Proceedings of The Sixth Text REtrieval Conference, TREC 1997, Gaithersburg, Maryland, USA, November 19-21, 1997, pages 227--232, Gaithersburg, Md. 20899, 1997. 
National Institute of Standards and Technology."},{"issue":"2684","key":"e_1_2_1_20_1","first-page":"677","article-title":"On the theory of scales of measurement. Science","volume":"103","author":"Stevens S.S.","year":"1946","unstructured":"S.S. Stevens. On the theory of scales of measurement. Science, New Series 103(2684):677--680, June 1946.","journal-title":"New Series"},{"key":"e_1_2_1_21_1","volume-title":"Foundations of Behavioral Statistics: An Insight-Based Approach","author":"Thompson Bruce","year":"2006","unstructured":"Bruce Thompson. Foundations of Behavioral Statistics: An Insight-Based Approach. The Guilford Press, 2006."},{"key":"e_1_2_1_22_1","volume-title":"Data Mining: Practical Machine Learning Tools and Techniques","author":"Witten Ian","year":"2011","unstructured":"Ian H. Witten, Eibe Frank, and Mark A. Hall. Data Mining: Practical Machine Learning Tools and Techniques. 
Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 3rd edition, 2011.","edition":"3"}],"container-title":["ACM SIGIR Forum"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3190580.3190586","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,18]],"date-time":"2023-04-18T04:18:06Z","timestamp":1681791486000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3190580.3190586"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,2,22]]},"references-count":22,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2018,2,22]]}},"alternative-id":["10.1145\/3190580.3190586"],"URL":"https:\/\/doi.org\/10.1145\/3190580.3190586","relation":{},"ISSN":["0163-5840"],"issn-type":[{"value":"0163-5840","type":"print"}],"subject":[],"published":{"date-parts":[[2018,2,22]]},"assertion":[{"value":"2018-02-22","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}