Created February 12, 2019 17:44
Plot of entropy and the Gini index against the proportion of the event for a binary outcome, showing that the two metrics follow the same pattern.
library(tidyverse)

# Purity metrics over a grid of event proportions.
# The endpoints 0 and 1 are excluded because log(0) is undefined.
df_metrics <- tibble(
  prob = seq.int(0.001, 0.999, length.out = 999),
  # Binary entropy (natural log), multiplied by 2
  entropy = -2 * (prob * log(prob) + (1 - prob) * log(1 - prob)),
  # Binary Gini index 2 * p * (1 - p), multiplied by 2
  gini_index = 4 * prob * (1 - prob)
) %>%
  # Reshape to long format: one row per (prob, purity_metric) pair
  gather(entropy, gini_index, key = "purity_metric", value = "value")

# One panel per metric, each with its own y-axis scale
ggplot(df_metrics, aes(x = prob, y = value, colour = purity_metric)) +
  geom_line() +
  facet_wrap(~purity_metric, scales = "free_y", ncol = 1) +
  labs(title = "Plot of entropy and gini vs proportion of binary outcome after one split",
       subtitle = "number of regions = 2 ; number of outcomes = 2",
       x = "proportion K")
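
The claim that Gini and entropy follow the same pattern can also be checked numerically. The sketch below is not part of the original gist: it recomputes both metrics in base R on the same grid, finds where each one peaks, and measures how far apart the curves are once entropy is rescaled to a maximum of 1. The names `peak_entropy`, `peak_gini`, `entropy_scaled`, and `max_gap` are introduced here for illustration only.

```r
# Illustrative check (an assumption-laden sketch, not the author's code):
# same grid and formulas as the plot, using base R only.
prob <- seq(0.001, 0.999, length.out = 999)
entropy <- -2 * (prob * log(prob) + (1 - prob) * log(1 - prob))
gini_index <- 4 * prob * (1 - prob)

# Proportion at which each metric reaches its maximum
peak_entropy <- prob[which.max(entropy)]
peak_gini <- prob[which.max(gini_index)]

# Rescale entropy so both curves top out at 1, then take the
# largest vertical distance between them across the grid
entropy_scaled <- entropy / max(entropy)
max_gap <- max(abs(entropy_scaled - gini_index))

print(c(peak_entropy = peak_entropy, peak_gini = peak_gini, max_gap = max_gap))
```

Both metrics peak at a proportion of 0.5 and are symmetric around it, which is the shared shape the two panels show. `max_gap` gives a rough sense of how much the rescaled curves still differ away from the peak.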