@audhiaprilliant
Last active April 23, 2022 12:33
How to Automatically Build Stopwords
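# Data viz: observed word frequencies (bars) against the Zipf estimate (line)
# Assumes `df` is a pandas DataFrame sorted by descending frequency with
# columns 'word', 'actual_freq' and 'zipf_freq' (built elsewhere).
import plotnine
from plotnine import (
    ggplot, aes, geom_bar, geom_line, scale_x_discrete,
    labs, xlab, ylab, theme_minimal, theme, element_text
)

plotnine.options.figure_size = (10, 4.8)

# Only the 20 most frequent words are plotted
top_words = df[:20]

(
    ggplot(data = top_words)
    # Bars: actual frequency of each word
    + geom_bar(
        aes(x = 'word', y = 'actual_freq'),
        stat = 'identity',
        width = 0.5,
        fill = '#981220'
    )
    # Line: frequency expected under Zipf's law; group = 1 joins all points
    + geom_line(
        aes(x = 'word', y = 'zipf_freq', group = 1),
        size = 1
    )
    # Keep the words in rank order instead of alphabetical order
    + scale_x_discrete(limits = top_words['word'].tolist())
    + labs(title = 'Most Common Words in English Literature')
    + xlab('Word')
    + ylab('Frequency')
    + theme_minimal()
    + theme(
        axis_text_x = element_text(rotation = 90, hjust = 0.25)
    )
)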
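
The plot above expects `df` to already hold the `word`, `actual_freq` and `zipf_freq` columns, but this snippet does not show how they are computed. The following is a minimal sketch of one way to produce them, not the author's own preprocessing. It assumes the Zipf estimate is the top word's frequency divided by each word's rank (exponent 1). The helper name `zipf_table` and the sample sentence are hypothetical.

```python
from collections import Counter


def zipf_table(tokens):
    """Return one row per word, most frequent first (hypothetical helper)."""
    # most_common() sorts by descending count; ties keep first-seen order
    ranked = Counter(tokens).most_common()
    if not ranked:
        return []
    top_freq = ranked[0][1]
    return [
        {
            'word': word,
            'rank': rank,
            'actual_freq': freq,
            # Assumed Zipf estimate: top frequency divided by rank
            'zipf_freq': top_freq / rank,
        }
        for rank, (word, freq) in enumerate(ranked, start=1)
    ]


# Toy input purely for illustration
rows = zipf_table('the cat and the dog and the bird'.split())
```

Passing `rows` to `pandas.DataFrame(rows)` would give a frame with the column names the plotting code uses.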