Created February 10, 2013 13:13
Simple implementation of the Fleiss' kappa measure in Python
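For reference, the quantities the function computes can be written out as follows. This is the standard textbook definition of Fleiss' kappa, not something stated in the gist itself; it assumes N items, n raters per item, k categories, and n_ij raters assigning item i to category j.

```latex
\begin{align}
p_j &= \frac{1}{N n} \sum_{i=1}^{N} n_{ij}
  && \text{share of all assignments made to category } j \\
P_i &= \frac{1}{n(n-1)} \left( \sum_{j=1}^{k} n_{ij}^{2} - n \right)
  && \text{agreement among the raters on item } i \\
\bar{P} &= \frac{1}{N} \sum_{i=1}^{N} P_i, \qquad
\bar{P}_e = \sum_{j=1}^{k} p_j^{2} \\
\kappa &= \frac{\bar{P} - \bar{P}_e}{1 - \bar{P}_e}
\end{align}
```

The symbols map onto the variable names `p_j`, `P_i`, `P_bar`, `P_e_bar`, and `kappa` in the code.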
def fleiss_kappa(ratings, n, k):
    '''
    Computes the Fleiss' kappa measure for assessing the reliability of
    agreement between a fixed number n of raters when assigning categorical
    ratings to a number of items.

    Args:
        ratings: a list of (item, category)-ratings
        n: number of raters
        k: number of categories (not used in the computation: categories
           are taken from the ratings, and a category nobody chose
           contributes zero to every sum)
    Returns:
        the Fleiss' kappa score

    See also:
        http://en.wikipedia.org/wiki/Fleiss'_kappa
    '''
    items = set()
    categories = set()
    n_ij = {}

    # Count how many raters assigned item i to category c.
    for i, c in ratings:
        items.add(i)
        categories.add(c)
        n_ij[(i,c)] = n_ij.get((i,c), 0) + 1

    N = len(items)

    # Proportion of all assignments that went to each category.
    p_j = {}
    for c in categories:
        p_j[c] = sum(n_ij.get((i,c), 0) for i in items) / (1.0*n*N)

    # Extent to which the raters agree on each item.
    P_i = {}
    for i in items:
        P_i[i] = (sum(n_ij.get((i,c), 0)**2 for c in categories)-n) / (n*(n-1.0))

    P_bar = sum(P_i.values()) / (1.0*N)
    P_e_bar = sum(p_j[c]**2 for c in categories)

    kappa = (P_bar - P_e_bar) / (1 - P_e_bar)

    return kappa

example = ( [( 1,5)] * 14 +
            [( 2,2)] * 2 + [( 2,3)] * 6 + [( 2,4)] * 4 + [( 2,5)] * 2 +
            [( 3,3)] * 3 + [( 3,4)] * 5 + [( 3,5)] * 6 +
            [( 4,2)] * 3 + [( 4,3)] * 9 + [( 4,4)] * 2 +
            [( 5,1)] * 2 + [( 5,2)] * 2 + [( 5,3)] * 8 + [( 5,4)] * 1 + [( 5,5)] * 1 +
            [( 6,1)] * 7 + [( 6,2)] * 7 +
            [( 7,1)] * 3 + [( 7,2)] * 2 + [( 7,3)] * 6 + [( 7,4)] * 3 +
            [( 8,1)] * 2 + [( 8,2)] * 5 + [( 8,3)] * 3 + [( 8,4)] * 2 + [( 8,5)] * 2 +
            [( 9,1)] * 6 + [( 9,2)] * 5 + [( 9,3)] * 2 + [( 9,4)] * 1 +
            [(10,2)] * 2 + [(10,3)] * 2 + [(10,4)] * 3 + [(10,5)] * 7 )

print('%.03f' % fleiss_kappa(example, 14, 5)) # 0.210
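As a cross-check, the same score can be computed from a table of counts, which is how the data is often laid out in practice. The sketch below is not part of the gist: `fleiss_kappa_from_counts` and `table` are hypothetical names, and it assumes every row sums to the same number of raters.

```python
def fleiss_kappa_from_counts(table):
    """Fleiss' kappa from an N x k table of counts (hypothetical helper).

    table[i][j] is the number of raters who put item i in category j;
    every row is assumed to sum to the same number of raters n.
    """
    N = len(table)
    n = sum(table[0])
    k = len(table[0])
    if any(len(row) != k or sum(row) != n for row in table):
        raise ValueError("every row needs k counts that sum to n")

    # Share of all assignments that went to each category.
    p_j = [sum(row[j] for row in table) / float(N * n) for j in range(k)]
    # Per-item agreement among the n raters.
    P_i = [(sum(c * c for c in row) - n) / float(n * (n - 1)) for row in table]

    P_bar = sum(P_i) / N
    P_e_bar = sum(p * p for p in p_j)
    return (P_bar - P_e_bar) / (1 - P_e_bar)


# The same ten items as in the example data, one row per item,
# one column per category 1..5.
table = [
    [0, 0, 0, 0, 14],
    [0, 2, 6, 4, 2],
    [0, 0, 3, 5, 6],
    [0, 3, 9, 2, 0],
    [2, 2, 8, 1, 1],
    [7, 7, 0, 0, 0],
    [3, 2, 6, 3, 0],
    [2, 5, 3, 2, 2],
    [6, 5, 2, 1, 0],
    [0, 2, 2, 3, 7],
]

print('%.03f' % fleiss_kappa_from_counts(table))  # 0.210
```

The table form makes the row-sum assumption easy to check up front, at the cost of requiring the categories to be fixed and ordered in advance.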
A trivial +1 to the 2nd "trivial comment".
Another trivial comment: I would suggest using "k" as a parameter and letting the user decide how many categories there are. Just because nobody voted for a category doesn't mean it wasn't available.
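One way to put a user-supplied `k` to work is as a sanity check on the input, alongside `n`. The following is only a sketch under that assumption; `check_ratings` is a hypothetical helper that does not appear in the gist.

```python
from collections import Counter


def check_ratings(ratings, n, k):
    """Hypothetical helper: verify that ratings are consistent with n and k.

    ratings is a list of (item, category) pairs, as passed to fleiss_kappa.
    Returns (number of items, number of categories actually used).
    """
    per_item = Counter(item for item, _ in ratings)
    used = set(category for _, category in ratings)

    # Fleiss' kappa assumes every item was rated exactly n times.
    bad_items = [item for item, count in per_item.items() if count != n]
    if bad_items:
        raise ValueError("items without exactly %d ratings: %r" % (n, bad_items))

    # Fewer used categories than k is fine; more than k is an input error.
    if len(used) > k:
        raise ValueError("found %d categories, expected at most %d" % (len(used), k))

    return len(per_item), len(used)
```

Calling `check_ratings(ratings, n, k)` before `fleiss_kappa(ratings, n, k)` would catch miscounted raters or unexpected categories. An available category that nobody chose has a count of zero for every item, so it adds nothing to any of the sums and the kappa score itself does not depend on `k`.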