Cohen's kappa (κ) measures inter-observer agreement relative to the agreement expected by chance, as illustrated in the following SAS example (19.3_agreement_Cohen.sas): two radiologists evaluated 85 patients for liver damage, and the ratings were recorded on an ordinal scale.

These two themes, knowledge of the study objectives and consideration of the underlying theory, are the main keys to a successful analysis of agreement data. Below are some more specific questions concerning the choice of appropriate methods for a given study. Quantifying agreement in any way inevitably involves a model of how ratings are produced and of why raters agree or disagree. This model is either explicit, as with latent structure models, or implicit, as with kappa coefficients. In this context, two fundamental points stand out.

First, the statistic compares the observed agreement with the agreement expected if the two raters' assessments were independent. Second, kappa values from different studies cannot be reliably compared, because kappa is sensitive to the prevalence of the categories: if one category is observed more often in one study than in another, kappa may indicate a difference in agreement that is not actually due to the raters.

Note that strong agreement implies strong association, but strong association does not necessarily imply strong agreement. For example, if Siskel classifies most films as "con" while Ebert classifies them as "pro", the association may be strong, yet there is clearly no agreement.
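In symbols, kappa is computed as κ = (P_O - P_E) / (1 - P_E), where P_O is the observed proportion of agreement and P_E is the proportion of agreement expected when the two raters' ratings are independent. For an analysis like the radiologist example above, the usual SAS approach is PROC FREQ with the AGREE option; the sketch below uses an invented data set and invented cell counts (liver, rater1, rater2, count), not the actual contents of 19.3_agreement_Cohen.sas.

   * Hypothetical cell counts for two raters on a three-level ordinal scale. ;
   * These numbers are invented for illustration and are not the study data. ;
   data liver;
      input rater1 $ rater2 $ count @@;
      datalines;
   none none 8    none mild 2    none severe 0
   mild none 3    mild mild 9    mild severe 2
   severe none 0  severe mild 1  severe severe 5
   ;
   run;

   proc freq data=liver order=data;
      weight count / zeros;          /* cell counts; ZEROS keeps empty cells in the table */
      tables rater1*rater2 / agree;  /* AGREE requests simple and weighted kappa          */
      test kappa wtkap;              /* asymptotic tests and confidence limits for kappa  */
   run;

PROC FREQ reports the simple kappa together with its standard error and confidence limits, so the chance-corrected agreement can be judged directly against the observed percent agreement in the printed table.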

In the same spirit, you can also think of the situation in which one rater is harsher than the other and always gives a rating one level below that of the more lenient rater. In this case, too, the association is very strong, yet the agreement may be negligible; a small numeric sketch of this case follows below. The purpose of the data analysis should also be taken into consideration. . . .
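As a concrete sketch of this shifted-rater case, the hypothetical SAS step below (data set shifted and variables raterA, raterB, count are invented for illustration) builds a square table in which the harsher rater is always exactly one grade below the other. The two raters never agree even though one rating is perfectly predictable from the other: the observed agreement P_O is 0, so simple kappa is negative (about -0.33 for these counts) despite the perfect association.

   * Invented counts: raterB is always one grade below raterA.     ;
   * Zero cells are kept so the table stays square for kappa.      ;
   data shifted;
      input raterA raterB count @@;
      datalines;
   1 1 0   1 2 0   1 3 0
   2 1 20  2 2 0   2 3 0
   3 1 0   3 2 20  3 3 0
   ;
   run;

   proc freq data=shifted;
      weight count / zeros;          /* keep zero-count cells in the table       */
      tables raterA*raterB / agree;  /* observed agreement is 0, so kappa is < 0 */
   run;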