Use Youden index to determine cut-off for classification
import numpy as np
from sklearn.metrics import roc_curve


def sensivity_specifity_cutoff(y_true, y_score):
    '''Find a data-driven cut-off for classification.

    The cut-off is determined using Youden's index, defined as
    sensitivity + specificity - 1.

    Parameters
    ----------
    y_true : array, shape = [n_samples]
        True binary labels.
    y_score : array, shape = [n_samples]
        Target scores, can either be probability estimates of the positive class,
        confidence values, or non-thresholded measure of decisions (as returned by
        "decision_function" on some classifiers).

    References
    ----------
    Ewald, B. (2006). Post hoc choice of cut points introduced bias to diagnostic research.
    Journal of clinical epidemiology, 59(8), 798-801.

    Steyerberg, E.W., Van Calster, B., & Pencina, M.J. (2011). Performance measures for
    prediction models and markers: evaluation of predictions and classifications.
    Revista Espanola de Cardiologia (English Edition), 64(9), 788-794.

    Jiménez-Valverde, A., & Lobo, J.M. (2007). Threshold criteria for conversion of probability
    of species presence to either–or presence–absence. Acta oecologica, 31(3), 361-369.
    '''
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    # Youden's index: sensitivity + specificity - 1 = tpr - fpr.
    idx = np.argmax(tpr - fpr)
    return thresholds[idx]
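
To make the selection rule concrete, here is a minimal dependency-free sketch of the same idea. The helper name `youden_cutoff` and the toy data are assumptions for illustration and are not part of the gist. It assumes a sample is predicted positive when its score is greater than or equal to the threshold, and it tries each distinct score as a candidate threshold rather than calling `roc_curve`.

```python
def youden_cutoff(y_true, y_score):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    Hypothetical helper for illustration only. A sample counts as predicted
    positive when its score is >= the candidate threshold.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    best_threshold, best_j = None, float("-inf")
    # Try every distinct score as a threshold, from highest to lowest.
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
        # sensitivity = tp / n_pos, specificity = 1 - fp / n_neg
        j = tp / n_pos - fp / n_neg
        if j > best_j:  # on ties, keep the first (highest) threshold
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy data (made up): four negatives followed by four positives.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.6, 0.4, 0.5, 0.7, 0.9]
threshold, j = youden_cutoff(y_true, y_score)
print(threshold, j)
```

On this toy data the cut-off at 0.4 captures every positive while admitting one of the four negatives, which gives the largest index among the candidates. As the Ewald (2006) reference in the docstring cautions, a cut-off chosen post hoc on the same data used for evaluation can bias the reported performance.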