@audhiaprilliant
Created December 24, 2020 02:58
How to choose the optimal threshold for imbalanced classification
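# Assumption: fpr, tpr, and thresholds were computed before this snippet,
# e.g. with sklearn.metrics.roc_curve(y_true, y_score); that step is not shown here.
import numpy as np

# Calculate the G-mean (geometric mean of sensitivity and specificity) for every threshold
gmean = np.sqrt(tpr * (1 - fpr))

# Find the threshold that maximizes the G-mean
index = np.argmax(gmean)
thresholdOpt = round(thresholds[index], ndigits = 4)
gmeanOpt = round(gmean[index], ndigits = 4)
fprOpt = round(fpr[index], ndigits = 4)
tprOpt = round(tpr[index], ndigits = 4)
print('Best Threshold: {} with G-Mean: {}'.format(thresholdOpt, gmeanOpt))
print('FPR: {}, TPR: {}'.format(fprOpt, tprOpt))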
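
# Create the ROC plot and highlight the optimal threshold
# Assumption: df_fpr_tpr is a pandas DataFrame with 'FPR' and 'TPR' columns built before this snippet.
import plotnine
from plotnine import (aes, geom_line, geom_point, geom_text, ggplot, labs,
                      theme_minimal, xlab, ylab)

plotnine.options.figure_size = (8, 4.8)
(
    ggplot(data = df_fpr_tpr)+
    # ROC points, one per candidate threshold
    geom_point(aes(x = 'FPR',
                   y = 'TPR'),
               size = 0.4)+
    # Best threshold
    geom_point(aes(x = fprOpt,
                   y = tprOpt),
               color = '#981220',
               size = 4)+
    # ROC curve
    geom_line(aes(x = 'FPR',
                  y = 'TPR'))+
    # Label for the best threshold, nudged away from the marker
    geom_text(aes(x = fprOpt,
                  y = tprOpt),
              label = 'Optimal threshold \n for class: {}'.format(thresholdOpt),
              nudge_x = 0.14,
              nudge_y = -0.10,
              size = 10,
              fontstyle = 'italic')+
    labs(title = 'ROC Curve')+
    xlab('False Positive Rate (FPR)')+
    ylab('True Positive Rate (TPR)')+
    theme_minimal()
)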
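
The snippet above relies on `fpr`, `tpr`, and `thresholds` that are defined elsewhere, presumably by `sklearn.metrics.roc_curve`. As a minimal sketch of the selection step on its own, the example below uses small hand-made arrays. The helper name `gmean_optimal_threshold`, the ROC points, and the scores in `y_prob` are hypothetical and are not part of the original gist. It is one possible way to package the same G-mean rule.

```python
import numpy as np


def gmean_optimal_threshold(fpr, tpr, thresholds):
    """Return (threshold, gmean) for the ROC point maximizing sqrt(TPR * (1 - FPR)).

    Hypothetical helper for illustration; the inputs are parallel arrays
    such as those returned by a ROC computation.
    """
    fpr = np.asarray(fpr, dtype=float)
    tpr = np.asarray(tpr, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)

    # G-mean is the geometric mean of sensitivity (TPR) and specificity (1 - FPR)
    gmean = np.sqrt(tpr * (1.0 - fpr))

    # np.argmax returns the first index of the maximum if several points tie
    index = int(np.argmax(gmean))
    return float(thresholds[index]), float(gmean[index])


# Hypothetical ROC points, ordered from the strictest to the loosest threshold
fpr = np.array([0.0, 0.1, 0.2, 0.5, 1.0])
tpr = np.array([0.0, 0.6, 0.8, 0.9, 1.0])
thresholds = np.array([1.0, 0.8, 0.5, 0.3, 0.0])

best_threshold, best_gmean = gmean_optimal_threshold(fpr, tpr, thresholds)
print('Best Threshold: {} with G-Mean: {}'.format(best_threshold, round(best_gmean, 4)))

# Apply the chosen threshold to hypothetical predicted probabilities:
# scores at or above the threshold are labeled as the positive class
y_prob = np.array([0.9, 0.55, 0.4, 0.2])
y_pred = (y_prob >= best_threshold).astype(int)
print(y_pred)
```

Taking the first index on ties matches the behavior of `np.argmax` in the original snippet. Whether that tie-breaking rule suits a given problem depends on the relative cost of false positives and false negatives.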