Created March 3, 2016 22:25
Confusion Metrics written in tensorflow format
# from https://cloud.google.com/solutions/machine-learning-with-financial-time-series-data
# Written against the TensorFlow 1.x graph-mode API (session.run with a feed_dict).
import tensorflow as tf


def tf_confusion_metrics(model, actual_classes, session, feed_dict):
    # Binary, one-hot setup: class index 1 is "positive", class index 0 is "negative".
    predictions = tf.argmax(model, 1)
    actuals = tf.argmax(actual_classes, 1)

    ones_like_actuals = tf.ones_like(actuals)
    zeros_like_actuals = tf.zeros_like(actuals)
    ones_like_predictions = tf.ones_like(predictions)
    zeros_like_predictions = tf.zeros_like(predictions)

    # True positives: actual == 1 and predicted == 1.
    tp_op = tf.reduce_sum(
        tf.cast(
            tf.logical_and(
                tf.equal(actuals, ones_like_actuals),
                tf.equal(predictions, ones_like_predictions)
            ),
            "float"
        )
    )

    # True negatives: actual == 0 and predicted == 0.
    tn_op = tf.reduce_sum(
        tf.cast(
            tf.logical_and(
                tf.equal(actuals, zeros_like_actuals),
                tf.equal(predictions, zeros_like_predictions)
            ),
            "float"
        )
    )

    # False positives: actual == 0 but predicted == 1.
    fp_op = tf.reduce_sum(
        tf.cast(
            tf.logical_and(
                tf.equal(actuals, zeros_like_actuals),
                tf.equal(predictions, ones_like_predictions)
            ),
            "float"
        )
    )

    # False negatives: actual == 1 but predicted == 0.
    fn_op = tf.reduce_sum(
        tf.cast(
            tf.logical_and(
                tf.equal(actuals, ones_like_actuals),
                tf.equal(predictions, zeros_like_predictions)
            ),
            "float"
        )
    )

    # Evaluate all four counts in a single session run.
    tp, tn, fp, fn = session.run(
        [tp_op, tn_op, fp_op, fn_op],
        feed_dict
    )

    tpr = float(tp)/(float(tp) + float(fn))
    # False positive rate is fp / (fp + tn); the original divided by (tp + fn).
    fpr = float(fp)/(float(fp) + float(tn))

    accuracy = (float(tp) + float(tn))/(float(tp) + float(fp) + float(fn) + float(tn))

    recall = tpr
    precision = float(tp)/(float(tp) + float(fp))

    f1_score = (2 * (precision * recall)) / (precision + recall)

    print('Precision = ', precision)
    print('Recall = ', recall)
    print('F1 Score = ', f1_score)
    print('Accuracy = ', accuracy)
I think for multi-label problems there is no real true negative, so in the reply from @carlthome the calculation of tn is unnecessary and actually wrong.
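To illustrate the point about true negatives: precision, recall and F1 are defined from tp, fp and fn alone, so a multi-label evaluation can skip tn entirely. Below is a minimal plain-Python sketch, assuming each sample's labels are given as a set. The function name `multilabel_prf` and the choice of micro-averaging are assumptions made for this example, not something taken from the gist or from the reply mentioned above.

```python
def multilabel_prf(actual_sets, predicted_sets):
    """Micro-averaged precision, recall and F1 over multi-label samples.

    Hypothetical helper for illustration; true negatives are never counted.
    """
    tp = fp = fn = 0
    for actual, predicted in zip(actual_sets, predicted_sets):
        tp += len(actual & predicted)   # labels correctly predicted
        fp += len(predicted - actual)   # labels predicted but not present
        fn += len(actual - predicted)   # labels present but missed
    # Guard against empty denominators instead of raising ZeroDivisionError.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up labels, for illustration only.
actual = [{"a", "b"}, {"c"}, {"a"}]
predicted = [{"a"}, {"c", "d"}, {"b"}]
print(multilabel_prf(actual, predicted))
```

With these made-up labels the totals are tp = 2, fp = 2 and fn = 2, so all three measures come out at 0.5.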
You're not supposed to use the sigmoid for a one-hot encoding because you get numerical instability. You need to use softmax EVERY TIME for one-hot, or else you will get erroneous results from floating-point rounding errors.
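As a rough illustration of the difference this comment is pointing at: softmax normalises the logits jointly, so the outputs form a distribution over mutually exclusive classes, while independent sigmoids need not sum to one. The sketch below is plain Python rather than TensorFlow, and the helper names and logit values are made up for the example. Subtracting the maximum logit before exponentiating is a common way to avoid overflow.

```python
import math


def softmax(logits):
    """Softmax with the maximum logit subtracted first to avoid overflow."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Element-wise logistic function; each output is independent of the others."""
    return 1.0 / (1.0 + math.exp(-x))


# Made-up logits, for illustration only.
logits = [2.0, 1.0, 0.1]
soft = softmax(logits)
sig = [sigmoid(x) for x in logits]

print(sum(soft))  # sums to 1 up to rounding
print(sum(sig))   # generally not 1
```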
The `fpr` is wrong. It should be `fpr = float(fp)/(float(fp) + float(tn))`.
https://en.wikipedia.org/wiki/False_positive_rate
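A quick numeric check of the two denominators, using made-up counts purely for illustration (the function name is an assumption for this sketch):

```python
def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN): the share of actual negatives flagged as positive."""
    return fp / (fp + tn)


# Made-up confusion counts, for illustration only.
tp, tn, fp, fn = 30, 50, 10, 10

fpr = false_positive_rate(fp, tn)   # 10 / (10 + 50)
mistaken = fp / (tp + fn)           # 10 / (30 + 10), dividing by actual positives instead
print(fpr, mistaken)
```

The two values differ whenever the number of actual negatives (`fp + tn`) differs from the number of actual positives (`tp + fn`).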
Correct me if I'm wrong, but building on the reply by @carlthome above, it looks like you could further simplify some of your performance measures with things like:
The `accuracy` operation above gives you the number of cases where `actual == predicted` (equivalent to `tp + tn`), divided by the total number of samples (equivalent to `tp + fp + fn + tn`). The error measure gives you the complement of this, so `(fp + fn) / total_samples`.

I am not sure which method ends up being faster, but if all you need is `accuracy` or `error` (as defined here), this saves you having to find all of `tp`, `tn`, `fp`, and `fn`.
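The accuracy and error measures described in this comment can be sketched without any of the four counts. This is a plain-Python sketch over lists of class indices, not the TensorFlow snippet the comment refers to; the function names and sample data are assumptions made for the example.

```python
def accuracy(actuals, predictions):
    """Fraction of samples where the predicted class equals the actual class."""
    matches = sum(1 for a, p in zip(actuals, predictions) if a == p)
    return matches / len(actuals)


def error(actuals, predictions):
    """Complement of accuracy, i.e. (fp + fn) / total_samples in the binary case."""
    return 1.0 - accuracy(actuals, predictions)


# Made-up class indices, for illustration only.
actuals = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]

print(accuracy(actuals, predictions))
print(error(actuals, predictions))
```

For this data tp = 3, tn = 3, fp = 1 and fn = 1, so `(tp + tn) / total` gives the same result as comparing the two lists directly.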