{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,2]],"date-time":"2026-06-02T04:32:10Z","timestamp":1780374730480,"version":"3.54.1"},"reference-count":66,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2021,4,22]],"date-time":"2021-04-22T00:00:00Z","timestamp":1619049600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,4,22]],"date-time":"2021-04-22T00:00:00Z","timestamp":1619049600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Classif"],"published-print":{"date-parts":[[2021,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Kappa coefficients are commonly used for quantifying reliability on a categorical scale, whereas correlation coefficients are commonly applied to assess reliability on an interval scale. Both types of coefficients can be used to assess the reliability of ordinal rating scales. In this study, we compare seven reliability coefficients for ordinal rating scales: the kappa coefficients included are Cohen\u2019s kappa, linearly weighted kappa, and quadratically weighted kappa; the correlation coefficients included are intraclass correlation ICC(3,1), Pearson\u2019s correlation, Spearman\u2019s rho, and Kendall\u2019s tau-b. The primary goal is to provide a thorough understanding of these coefficients such that the applied researcher can make a sensible choice for ordinal rating scales. A second aim is to find out whether the choice of the coefficient matters. 
We studied to what extent we reach the same conclusions about inter-rater reliability with different coefficients, and to what extent the coefficients measure agreement in a similar way, using analytic methods, and simulated and empirical data. Using analytical methods, it is shown that differences between quadratic kappa and the Pearson and intraclass correlations increase if agreement becomes larger. Differences between the three coefficients are generally small if differences between rater means and variances are small. Furthermore, using simulated and empirical data, it is shown that differences between all reliability coefficients tend to increase if agreement between the raters increases. Moreover, for the data in this study, the same conclusion about inter-rater reliability was reached in virtually all cases with the four correlation coefficients. In addition, using quadratically weighted kappa, we reached a similar conclusion as with any correlation coefficient a great number of times. Hence, for the data in this study, it does not really matter which of these five coefficients is used. 
Moreover, the four correlation coefficients and quadratically weighted kappa tend to measure agreement in a similar way: their values are very highly correlated for the data in this study.<\/jats:p>","DOI":"10.1007\/s00357-021-09386-5","type":"journal-article","created":{"date-parts":[[2021,4,22]],"date-time":"2021-04-22T07:03:09Z","timestamp":1619074989000},"page":"519-543","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":88,"title":["A Comparison of Reliability Coefficients for Ordinal Rating Scales"],"prefix":"10.1007","volume":"38","author":[{"given":"Alexandra","family":"de Raadt","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7302-640X","authenticated-orcid":false,"given":"Matthijs J.","family":"Warrens","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Roel J.","family":"Bosker","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Henk A. L.","family":"Kiers","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2021,4,22]]},"reference":[{"key":"9386_CR1","first-page":"561","volume":"23","author":"V Abraira","year":"1999","unstructured":"Abraira, V., & P\u00e9rez de Vargas, A. (1999). Generalization of the kappa coefficient for ordinal categorical data, multiple observers and incomplete designs. Q\u00fcestii\u00f3, 23, 561\u2013571.","journal-title":"Q\u00fcestii\u00f3"},{"key":"9386_CR2","doi-asserted-by":"publisher","first-page":"3","DOI":"10.2307\/3315487","volume":"27","author":"M Banerjee","year":"1999","unstructured":"Banerjee, M. (1999). Beyond kappa: a review of interrater agreement measures. 
Canadian Journal of Statistics-Revue Canadienne de Statistique, 27, 3\u201323.","journal-title":"Canadian Journal of Statistics-Revue Canadienne de Statistique"},{"key":"9386_CR3","doi-asserted-by":"publisher","first-page":"1144","DOI":"10.3758\/BRM.41.4.1144","volume":"41","author":"KJ Berry","year":"2009","unstructured":"Berry, K.J., Johnston, J.E., Zahran, S., & Mielke, P.W. (2009). Stuart\u2019s tau measure of effect size for ordinal variables: some methodological considerations. Behavior Research Methods, 41, 1144\u20131148.","journal-title":"Behavior Research Methods"},{"key":"9386_CR4","doi-asserted-by":"publisher","first-page":"723","DOI":"10.1002\/(SICI)1097-0258(20000315)19:5<723::AID-SIM379>3.0.CO;2-A","volume":"19","author":"NJM Blackman","year":"2000","unstructured":"Blackman, N.J.M., & Koval, J.J. (2000). Interval estimation for Cohen\u2019s kappa as a measure of agreement. Statistics in Medicine, 19, 723\u2013741.","journal-title":"Statistics in Medicine"},{"key":"9386_CR5","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1097\/00001648-199603000-00016","volume":"7","author":"H Brenner","year":"1996","unstructured":"Brenner, H., & Kliebsch, U. (1996). Dependence of weighted kappa coefficients on the number of categories. Epidemiology, 7, 199\u2013202.","journal-title":"Epidemiology"},{"key":"9386_CR6","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1177\/001316446002000104","volume":"20","author":"J Cohen","year":"1960","unstructured":"Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37\u201346.","journal-title":"Educational and Psychological Measurement"},{"key":"9386_CR7","doi-asserted-by":"publisher","first-page":"213","DOI":"10.1037\/h0026256","volume":"70","author":"J Cohen","year":"1968","unstructured":"Cohen, J. (1968). Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. 
Psychological Bulletin, 70, 213\u2013220.","journal-title":"Psychological Bulletin"},{"key":"9386_CR8","doi-asserted-by":"publisher","first-page":"322","DOI":"10.1037\/0033-2909.88.2.322","volume":"88","author":"AJ Conger","year":"1980","unstructured":"Conger, A.J. (1980). Integration and generalization of kappas for multiple raters. Psychological Bulletin, 88, 322\u2013328.","journal-title":"Psychological Bulletin"},{"key":"9386_CR9","doi-asserted-by":"publisher","first-page":"1391","DOI":"10.2214\/ajr.184.5.01841391","volume":"184","author":"PE Crewson","year":"2005","unstructured":"Crewson, P.E. (2005). Fundamentals of clinical research for radiologists. Reader agreement studies. American Journal of Roentgenology, 184, 1391\u20131397.","journal-title":"American Journal of Roentgenology"},{"key":"9386_CR10","doi-asserted-by":"publisher","first-page":"1047","DOI":"10.2307\/2529886","volume":"38","author":"M Davies","year":"1982","unstructured":"Davies, M., & Fleiss, J.L. (1982). Measuring agreement for multinomial data. Biometrics, 38, 1047\u20131051.","journal-title":"Biometrics"},{"key":"9386_CR11","doi-asserted-by":"publisher","first-page":"558","DOI":"10.1177\/0013164418823249","volume":"79","author":"A De Raadt","year":"2019","unstructured":"De Raadt, A., Warrens, M.J., Bosker, R.J., & Kiers, H.A.L. (2019). Kappa coefficients for missing data. Educational and Psychological Measurement, 79, 558\u2013576.","journal-title":"Educational and Psychological Measurement"},{"key":"9386_CR12","doi-asserted-by":"publisher","first-page":"f2125","DOI":"10.1136\/bmj.f2125","volume":"346","author":"HCW De Vet","year":"2013","unstructured":"De Vet, H.C.W., Mokkink, L.B., Terwee, C.B., Hoekstra, O.S., & Knol, D.L. (2013). Clinicians are right not to like Cohen\u2019s kappa. 
British Medical Journal, 346, f2125.","journal-title":"British Medical Journal"},{"key":"9386_CR13","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1037\/met0000079","volume":"21","author":"JC De Winter","year":"2016","unstructured":"De Winter, J.C., Gosling, S.D., & Potter, J. (2016). Comparing the Pearson and Spearman correlation coefficients across distributions and sample sizes: a tutorial using simulations and empirical data. Psychological Methods, 21, 273\u2013290.","journal-title":"Psychological Methods"},{"key":"9386_CR14","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1007\/BF02294581","volume":"58","author":"RF Fagot","year":"1993","unstructured":"Fagot, R.F. (1993). A generalized family of coefficients of relational agreement for numerical scales. Psychometrika, 58, 357\u2013370.","journal-title":"Psychometrika"},{"key":"9386_CR15","doi-asserted-by":"publisher","first-page":"378","DOI":"10.1037\/h0031619","volume":"76","author":"JL Fleiss","year":"1971","unstructured":"Fleiss, J.L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76, 378\u2013382.","journal-title":"Psychological Bulletin"},{"key":"9386_CR16","doi-asserted-by":"publisher","first-page":"613","DOI":"10.1177\/001316447303300309","volume":"33","author":"JL Fleiss","year":"1973","unstructured":"Fleiss, J.L., & Cohen, J. (1973). The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement, 33, 613\u2013619.","journal-title":"Educational and Psychological Measurement"},{"key":"9386_CR17","doi-asserted-by":"publisher","first-page":"1055","DOI":"10.1016\/0895-4356(93)90173-X","volume":"46","author":"P Graham","year":"1993","unstructured":"Graham, P., & Jackson, R. (1993). The analysis of ordinal agreement data: beyond weighted kappa. 
Journal of Clinical Epidemiology, 46, 1055\u20131062.","journal-title":"Journal of Clinical Epidemiology"},{"key":"9386_CR18","volume-title":"Handbook of inter-rater reliability: the definitive guide to measuring the extent of agreement among multiple raters","author":"KL Gwet","year":"2012","unstructured":"Gwet, K.L. (2012). Handbook of inter-rater reliability: the definitive guide to measuring the extent of agreement among multiple raters, 3rd edn. Gaithersburg: Advanced Analytics.","edition":"3rd edn."},{"key":"9386_CR19","doi-asserted-by":"publisher","first-page":"87","DOI":"10.2478\/v10117-011-0021-1","volume":"30","author":"J Hauke","year":"2011","unstructured":"Hauke, J., & Kossowski, T. (2011). Comparison of values of Pearson\u2019s and Spearman\u2019s correlation coefficient on the same sets of data. Quaestiones Geographicae, 30, 87\u201393.","journal-title":"Quaestiones Geographicae"},{"key":"9386_CR20","first-page":"334","volume":"84","author":"ND Holmquist","year":"1967","unstructured":"Holmquist, N.D., McMahan, C.A., & Williams, O.D. (1967). Variability in classification of carcinoma in situ of the uterine cervix. Archives of Pathology, 84, 334\u2013345.","journal-title":"Archives of Pathology"},{"key":"9386_CR21","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1037\/0033-2909.84.2.289","volume":"84","author":"L Hubert","year":"1977","unstructured":"Hubert, L. (1977). Kappa revisited. Psychological Bulletin, 84, 289\u2013297.","journal-title":"Psychological Bulletin"},{"key":"9386_CR22","volume-title":"Rank correlation methods","author":"MG Kendall","year":"1955","unstructured":"Kendall, M.G. (1955). Rank correlation methods, 2nd edn. New York City: Hafner Publishing Co.","edition":"2nd edn."},{"key":"9386_CR23","volume-title":"Rank correlation methods","author":"MG Kendall","year":"1962","unstructured":"Kendall, M.G. (1962). Rank correlation methods, 3rd edn. 
Liverpool: Charles Birchall & Sons Ltd.","edition":"3rd edn."},{"key":"9386_CR24","first-page":"142","volume":"34","author":"K Krippendorff","year":"1978","unstructured":"Krippendorff, K. (1978). Reliability of binary attribute data. Biometrics, 34, 142\u2013144.","journal-title":"Biometrics"},{"key":"9386_CR25","volume-title":"Content analysis: an introduction to its methodology","author":"K Krippendorff","year":"2013","unstructured":"Krippendorff, K. (2013). Content analysis: an introduction to its methodology, 3rd edn. Thousand Oaks: Sage.","edition":"3rd edn."},{"key":"9386_CR26","doi-asserted-by":"publisher","first-page":"303","DOI":"10.1148\/radiol.2282011860","volume":"228","author":"HL Kundel","year":"2003","unstructured":"Kundel, H.L., & Polansky, M. (2003). Measurement of observer agreement. Radiology, 228, 303\u2013308.","journal-title":"Radiology"},{"key":"9386_CR27","doi-asserted-by":"publisher","first-page":"159","DOI":"10.2307\/2529310","volume":"33","author":"JR Landis","year":"1977","unstructured":"Landis, J.R., & Koch, G.G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159\u2013174.","journal-title":"Biometrics"},{"key":"9386_CR28","doi-asserted-by":"publisher","first-page":"365","DOI":"10.1037\/h0031643","volume":"76","author":"RJ Light","year":"1971","unstructured":"Light, R.J. (1971). Measures of response agreement for qualitative data: some generalizations and alternatives. Psychological Bulletin, 76, 365\u2013377.","journal-title":"Psychological Bulletin"},{"key":"9386_CR29","doi-asserted-by":"publisher","first-page":"161","DOI":"10.1093\/aje\/126.2.161","volume":"126","author":"M Maclure","year":"1987","unstructured":"Maclure, M., & Willett, W.C. (1987). Misinterpretation and misuse of the kappa statistic. 
Journal of Epidemiology, 126, 161\u2013169.","journal-title":"Journal of Epidemiology"},{"key":"9386_CR30","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1037\/1082-989X.1.1.30","volume":"1","author":"KO McGraw","year":"1996","unstructured":"McGraw, K.O., & Wong, S.P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1, 30\u201346.","journal-title":"Psychological Methods"},{"key":"9386_CR31","doi-asserted-by":"publisher","first-page":"276","DOI":"10.11613\/BM.2012.031","volume":"22","author":"ML McHugh","year":"2012","unstructured":"McHugh, M.L. (2012). Interrater reliability: the kappa statistic. Biochemia Medica, 22, 276\u2013282.","journal-title":"Biochemia Medica"},{"key":"9386_CR32","doi-asserted-by":"publisher","first-page":"655","DOI":"10.2466\/pr0.101.2.655-660","volume":"101","author":"PW Mielke","year":"2007","unstructured":"Mielke, P.W., Berry, K.J., & Johnston, J.E. (2007). The exact variance of weighted kappa with multiple raters. Psychological Reports, 101, 655\u2013660.","journal-title":"Psychological Reports"},{"key":"9386_CR33","doi-asserted-by":"publisher","first-page":"606","DOI":"10.2466\/pr0.102.2.606-613","volume":"102","author":"PW Mielke","year":"2008","unstructured":"Mielke, P.W., Berry, K.J., & Johnston, J.E. (2008). Resampling probability values for weighted kappa with multiple raters. Psychological Reports, 102, 606\u2013613.","journal-title":"Psychological Reports"},{"key":"9386_CR34","first-page":"3769","volume":"46","author":"N Moradzadeh","year":"2017","unstructured":"Moradzadeh, N., Ganjali, M., & Baghfalaki, T. (2017). Weighted kappa as a function of unweighted kappas. Communications in Statistics - Simulation and Computation, 46, 3769\u20133780.","journal-title":"Communications in Statistics - Simulation and Computation"},{"key":"9386_CR35","first-page":"69","volume":"24","author":"MM Mukaka","year":"2012","unstructured":"Mukaka, M.M. (2012). 
A guide to appropriate use of correlation coefficient in medical research. Malawi Medical Journal, 24, 69\u201371.","journal-title":"Malawi Medical Journal"},{"key":"9386_CR36","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1080\/02664769723918","volume":"24","author":"SR Mu\u00f1oz","year":"1997","unstructured":"Mu\u00f1oz, S.R., & Bangdiwala, S.I. (1997). Interpretation of kappa and B statistics measures of agreement. Journal of Applied Statistics, 24, 105\u2013111.","journal-title":"Journal of Applied Statistics"},{"key":"9386_CR37","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1016\/j.jsp.2012.12.003","volume":"51","author":"RI Parker","year":"2013","unstructured":"Parker, R.I., Vannest, K.J., & Davis, J.L. (2013). Reliability of multi-category rating scales. Journal of School Psychology, 51, 217\u2013229.","journal-title":"Journal of School Psychology"},{"key":"9386_CR38","unstructured":"R Core Team. (2019). R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria."},{"key":"9386_CR39","doi-asserted-by":"publisher","first-page":"59","DOI":"10.2307\/2685263","volume":"42","author":"JL Rodgers","year":"1988","unstructured":"Rodgers, J.L., & Nicewander, W.A. (1988). Thirteen ways to look at the correlation coefficient. The American Statistician, 42, 59\u201366.","journal-title":"The American Statistician"},{"key":"9386_CR40","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1086\/266577","volume":"19","author":"WA Scott","year":"1955","unstructured":"Scott, W.A. (1955). Reliability of content analysis: the case of nominal scale coding. Public Opinion Quarterly, 19, 321\u2013325.","journal-title":"Public Opinion Quarterly"},{"key":"9386_CR41","doi-asserted-by":"publisher","first-page":"453","DOI":"10.1007\/BF02294066","volume":"51","author":"HJA Schouten","year":"1986","unstructured":"Schouten, H.J.A. (1986). Nominal scale agreement among observers. 
Psychometrika, 51, 453\u2013466.","journal-title":"Psychometrika"},{"key":"9386_CR42","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1177\/0013164403260197","volume":"64","author":"C Schuster","year":"2004","unstructured":"Schuster, C. (2004). A note on the interpretation of weighted kappa and its relations to other rater agreement statistics for metric scales. Educational and Psychological Measurement, 64, 243\u2013253.","journal-title":"Educational and Psychological Measurement"},{"key":"9386_CR43","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1007\/s11336-003-1110-4","volume":"70","author":"C Schuster","year":"2005","unstructured":"Schuster, C., & Smith, D.A. (2005). Dispersion weighted kappa: an integrative framework for metric and nominal scale agreement coefficients. Psychometrika, 70, 135\u2013146.","journal-title":"Psychometrika"},{"key":"9386_CR44","doi-asserted-by":"publisher","first-page":"420","DOI":"10.1037\/0033-2909.86.2.420","volume":"86","author":"PE Shrout","year":"1979","unstructured":"Shrout, P.E., & Fleiss, J.L. (1979). Intraclass correlations: uses in assessing rater reliability. Psychological Bulletin, 86, 420\u2013428.","journal-title":"Psychological Bulletin"},{"key":"9386_CR45","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1016\/j.jamcollsurg.2009.09.031","volume":"1","author":"M Shiloach","year":"2010","unstructured":"Shiloach, M., Frencher, S.K., Steeger, J.E., Rowell, K.S., Bartzokis, K., Tomeh, M.G., & Hall, B.L. (2010). Toward robust information: data quality and inter-rater reliability in American college of surgeons national surgical quality improvement program. Journal of the American College of Surgeons, 1, 6\u201316.","journal-title":"Journal of the American College of Surgeons"},{"key":"9386_CR46","volume-title":"Nonparametric statistics for the behavioral sciences","author":"S Siegel","year":"1988","unstructured":"Siegel, S., & Castellan, N.J. (1988). 
Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill."},{"key":"9386_CR47","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1093\/ptj\/85.3.257","volume":"85","author":"J Sim","year":"2005","unstructured":"Sim, J., & Wright, C.C. (2005). The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Physical Therapy, 85, 257\u2013268.","journal-title":"Physical Therapy"},{"key":"9386_CR48","doi-asserted-by":"publisher","first-page":"733","DOI":"10.1097\/00005650-198608000-00008","volume":"24","author":"KL Soeken","year":"1986","unstructured":"Soeken, K.L., & Prescott, P.A. (1986). Issues in the use of kappa to estimate reliability. Medical Care, 24, 733\u2013741.","journal-title":"Medical Care"},{"key":"9386_CR49","doi-asserted-by":"publisher","first-page":"394","DOI":"10.1016\/j.learninstruc.2007.03.005","volume":"17","author":"J-W Strijbos","year":"2007","unstructured":"Strijbos, J.-W., & Stahl, G. (2007). Methodological issues in developing a multi-dimensional coding procedure for small-group chat communication. Learning and Instruction, 17, 394\u2013404.","journal-title":"Learning and Instruction"},{"key":"9386_CR50","doi-asserted-by":"crossref","unstructured":"Tinsley, H.E.A., & Weiss, D.J. (2000). Interrater reliability and agreement. In Tinsley, H.E.A., & Brown, S.D. (Eds.) Handbook of applied multivariate statistics and mathematical modeling (pp. 94\u2013124). Academic Press: New York.","DOI":"10.1016\/B978-012691360-6\/50005-7"},{"key":"9386_CR51","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1016\/j.stamet.2008.06.001","volume":"6","author":"S Vanbelle","year":"2009","unstructured":"Vanbelle, S., & Albert, A. (2009). A note on the linearly weighted kappa coefficient for ordinal scales. 
Statistical Methodology, 6, 157\u2013163.","journal-title":"Statistical Methodology"},{"key":"9386_CR52","doi-asserted-by":"publisher","first-page":"399","DOI":"10.1007\/s11336-014-9439-4","volume":"81","author":"S Vanbelle","year":"2016","unstructured":"Vanbelle, S. (2016). A new interpretation of the weighted kappa coefficients. Psychometrika, 81, 399\u2013410.","journal-title":"Psychometrika"},{"key":"9386_CR53","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1080\/09243453.2013.794845","volume":"25","author":"W Van de Grift","year":"2007","unstructured":"Van de Grift, W. (2007). Measuring teaching quality in several European countries. School Effectiveness and School Improvement, 25, 295\u2013311.","journal-title":"School Effectiveness and School Improvement"},{"key":"9386_CR54","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1016\/j.tate.2017.02.018","volume":"65","author":"EA Van der Scheer","year":"2017","unstructured":"Van der Scheer, E.A., Glas, C.A.W., & Visscher, A.J. (2017). Changes in teachers\u2019 instructional skills during an intensive data-based decision making intervention. Teaching and Teacher Education, 65, 171\u2013182.","journal-title":"Teaching and Teacher Education"},{"key":"9386_CR55","first-page":"360","volume":"37","author":"AJ Viera","year":"2005","unstructured":"Viera, A.J., & Garrett, J.M. (2005). Understanding interobserver agreement: the kappa statistic. Family Medicine, 37, 360\u2013363.","journal-title":"Family Medicine"},{"key":"9386_CR56","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1007\/s11634-010-0073-4","volume":"4","author":"MJ Warrens","year":"2010","unstructured":"Warrens, M.J. (2010). Inequalities between multi-rater kappas. 
Advances in Data Analysis and Classification, 4, 271\u2013286.","journal-title":"Advances in Data Analysis and Classification"},{"key":"9386_CR57","doi-asserted-by":"publisher","first-page":"268","DOI":"10.1016\/j.stamet.2010.09.004","volume":"8","author":"MJ Warrens","year":"2011","unstructured":"Warrens, M.J. (2011). Weighted kappa is higher than Cohen\u2019s kappa for tridiagonal agreement tables. Statistical Methodology, 8, 268\u2013272.","journal-title":"Statistical Methodology"},{"key":"9386_CR58","doi-asserted-by":"publisher","first-page":"315","DOI":"10.1007\/s11336-012-9258-4","volume":"77","author":"MJ Warrens","year":"2012","unstructured":"Warrens, M.J. (2012a). Some paradoxical results for the quadratically weighted kappa. Psychometrika, 77, 315\u2013323.","journal-title":"Psychometrika"},{"key":"9386_CR59","doi-asserted-by":"publisher","first-page":"330","DOI":"10.1016\/j.stamet.2011.08.008","volume":"9","author":"MJ Warrens","year":"2012","unstructured":"Warrens, M.J. (2012b). A family of multi-rater kappas that can always be increased and decreased by combining categories. Statistical Methodology, 9, 330\u2013340.","journal-title":"Statistical Methodology"},{"key":"9386_CR60","doi-asserted-by":"publisher","first-page":"407","DOI":"10.1016\/j.stamet.2011.11.001","volume":"9","author":"MJ Warrens","year":"2012","unstructured":"Warrens, M.J. (2012c). Equivalences of weighted kappas for multiple raters. Statistical Methodology, 9, 407\u2013422.","journal-title":"Statistical Methodology"},{"key":"9386_CR61","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1016\/j.stamet.2012.05.004","volume":"10","author":"MJ Warrens","year":"2013","unstructured":"Warrens, M.J. (2013). Conditional inequalities between Cohen\u2019s kappa and weighted kappas. 
Statistical Methodology, 10, 14\u201322.","journal-title":"Statistical Methodology"},{"key":"9386_CR62","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1007\/s00357-014-9156-9","volume":"31","author":"MJ Warrens","year":"2014","unstructured":"Warrens, M.J. (2014). Corrected Zegers-ten Berge coefficients are special cases of Cohen\u2019s weighted kappa. Journal of Classification, 31, 179\u2013193.","journal-title":"Journal of Classification"},{"key":"9386_CR63","doi-asserted-by":"publisher","first-page":"197","DOI":"10.4172\/2161-0487.1000197","volume":"5","author":"MJ Warrens","year":"2015","unstructured":"Warrens, M.J. (2015). Five ways to look at Cohen\u2019s kappa. Journal of Psychology & Psychotherapy, 5, 197.","journal-title":"Journal of Psychology & Psychotherapy"},{"key":"9386_CR64","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1016\/j.jclinepi.2017.03.005","volume":"85","author":"MJ Warrens","year":"2017","unstructured":"Warrens, M.J. (2017). Transforming intraclass correlations with the Spearman-Brown formula. Journal of Clinical Epidemiology, 85, 14\u201316.","journal-title":"Journal of Clinical Epidemiology"},{"key":"9386_CR65","doi-asserted-by":"publisher","first-page":"307","DOI":"10.1111\/1469-7610.00023","volume":"43","author":"L Wing","year":"2002","unstructured":"Wing, L., Leekam, S.R., Libby, S.J., Gould, J., & Larcombe, M. (2002). The diagnostic interview for Social and Communication disorders: background, inter-rater reliability and clinical use. Journal of Child Psychology and Psychiatry, 43, 307\u2013325.","journal-title":"Journal of Child Psychology and Psychiatry"},{"key":"9386_CR66","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1016\/j.sigpro.2012.08.005","volume":"93","author":"W Xu","year":"2013","unstructured":"Xu, W., Hou, Y., Hung, Y.S., & Zou, Y. (2013). A comparative analysis of Spearman\u2019s rho and Kendall\u2019s tau in normal and contaminated normal models. 
Signal Processing, 93, 261\u2013276.","journal-title":"Signal Processing"}],"container-title":["Journal of Classification"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00357-021-09386-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00357-021-09386-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00357-021-09386-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,29]],"date-time":"2022-10-29T22:05:00Z","timestamp":1667081100000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00357-021-09386-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,22]]},"references-count":66,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2021,10]]}},"alternative-id":["9386"],"URL":"https:\/\/doi.org\/10.1007\/s00357-021-09386-5","relation":{},"ISSN":["0176-4268","1432-1343"],"issn-type":[{"value":"0176-4268","type":"print"},{"value":"1432-1343","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,4,22]]},"assertion":[{"value":"24 March 2021","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 April 2021","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}