@audhiaprilliant
Created December 24, 2020 03:03
How to choose the optimal threshold for imbalanced classification
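# Assumption: precision, recall, and thresholds are NumPy arrays defined earlier
# (for example by sklearn.metrics.precision_recall_curve), and
# df_recall_precision is a DataFrame with 'Recall' and 'Precision' columns.
import numpy as np
import plotnine
from plotnine import (aes, ggplot, geom_line, geom_point, geom_text, labs,
                      theme_minimal, xlab, ylab)

# Calculate the F-score; where precision + recall == 0 it is set to 0
# rather than NaN, because np.argmax would otherwise select the NaN entry
denominator = precision + recall
fscore = np.divide(2 * precision * recall, denominator,
                   out = np.zeros_like(denominator, dtype = float),
                   where = denominator > 0)
# Find the optimal threshold
# precision and recall may hold one more element than thresholds, so the
# search is limited to indices that have a matching threshold
index = np.argmax(fscore[:len(thresholds)])
thresholdOpt = round(thresholds[index], ndigits = 4)
fscoreOpt = round(fscore[index], ndigits = 4)
recallOpt = round(recall[index], ndigits = 4)
precisionOpt = round(precision[index], ndigits = 4)
print('Best Threshold: {} with F-Score: {}'.format(thresholdOpt, fscoreOpt))
print('Recall: {}, Precision: {}'.format(recallOpt, precisionOpt))

# Create a data viz
plotnine.options.figure_size = (8, 4.8)
(
    ggplot(data = df_recall_precision)+
    # Precision-recall points for every threshold
    geom_point(aes(x = 'Recall',
                   y = 'Precision'),
               size = 0.4)+
    # Best threshold
    geom_point(aes(x = recallOpt,
                   y = precisionOpt),
               color = '#981220',
               size = 4)+
    geom_line(aes(x = 'Recall',
                  y = 'Precision'))+
    # Annotate the text
    geom_text(aes(x = recallOpt,
                  y = precisionOpt),
              label = 'Optimal threshold \n for class: {}'.format(thresholdOpt),
              nudge_x = 0.18,
              nudge_y = 0,
              size = 10,
              fontstyle = 'italic')+
    labs(title = 'Recall Precision Curve')+
    xlab('Recall')+
    ylab('Precision')+
    theme_minimal()
)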
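
The snippet above relies on precision, recall, and thresholds arrays computed elsewhere. As a rough, dependency-free sketch of the same idea, the following computes precision, recall, and the F-score for each candidate threshold and keeps the best one. The function name best_f1_threshold and the toy labels and scores are hypothetical and not part of the gist. It assumes a sample is predicted positive when its score is greater than or equal to the threshold.

```python
def best_f1_threshold(y_true, y_score):
    """Return (threshold, f1, precision, recall) for the highest F1.

    Hypothetical helper: every distinct score is tried as a threshold,
    and a sample counts as positive when score >= threshold.
    """
    best = (None, -1.0, 0.0, 0.0)
    for t in sorted(set(y_score)):
        pairs = list(zip(y_true, y_score))
        tp = sum(1 for y, s in pairs if s >= t and y == 1)
        fp = sum(1 for y, s in pairs if s >= t and y == 0)
        fn = sum(1 for y, s in pairs if s < t and y == 1)
        # Guard each ratio against an empty denominator
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Strict comparison keeps the lowest threshold on ties
        if f1 > best[1]:
            best = (t, f1, precision, recall)
    return best


# Toy imbalanced data (made up for illustration): 3 positives out of 10
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
y_score = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.55, 0.70, 0.90]

threshold, f1, precision, recall = best_f1_threshold(y_true, y_score)
print('Best Threshold: {} with F-Score: {}'.format(threshold, round(f1, 4)))
print('Recall: {}, Precision: {}'.format(round(recall, 4), round(precision, 4)))
```

On this toy data the selected threshold sits well below 0.5, which is the usual motivation for tuning the threshold on imbalanced data instead of relying on the default cut-off.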