@thoughtfulbloke
Created July 14, 2021 03:42
Applying topic analysis to #nzpol tweets and relating the topics to user accounts
library(rtweet)
library(dplyr)
library(lubridate)
library(ggplot2)
library(tidytext)
library(topicmodels)
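# Fetch recent tweets that mention the hashtag
search_term <- "#nzpol"
corpus <- search_tweets(search_term, n = 18000)

# Build a document-term matrix with one "document" per account.
# token = "tweets" keeps hashtags and @mentions intact; it needs a tidytext
# version that provides this tokenizer.
nzpol_utm <- corpus %>%
  select(screen_name, text) %>%
  unnest_tokens(word, text, token = "tweets") %>%
  anti_join(stop_words, by = "word") %>%
  count(screen_name, word) %>%
  cast_dtm(screen_name, word, n)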
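# Fit a 12-topic LDA model; the fixed seed makes the fit reproducible
nzpol_lda <- LDA(nzpol_utm, k = 12, control = list(seed = 1234))

# beta is the per-topic word probability; keep the 20 strongest terms per topic
user_topics <- tidy(nzpol_lda, matrix = "beta")
top_terms <- user_topics %>%
  group_by(topic) %>%
  slice_max(beta, n = 20) %>%
  ungroup() %>%
  arrange(topic, -beta)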
# I then just looked through top_terms to form an opinion of what each
# topic was about
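# One way to make that review easier is to collapse each topic's terms into a
# single line. This is a minimal sketch, not part of the original analysis:
# summarise_topic_terms() is a hypothetical helper, and it assumes a data
# frame with `topic`, `term`, and `beta` columns, as tidy(..., matrix = "beta")
# returns.
library(dplyr)  # already loaded above; repeated so the sketch runs on its own
summarise_topic_terms <- function(terms) {
  terms %>%
    arrange(topic, desc(beta)) %>%   # strongest terms first within each topic
    group_by(topic) %>%
    summarise(terms = paste(term, collapse = ", "), .groups = "drop")
}
# Possible usage: print(summarise_topic_terms(top_terms), n = Inf)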
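# gamma is the per-account topic proportion. For each account, keep the topic
# that sits furthest above that account's mean gamma.
users_gamma <- tidy(nzpol_lda, matrix = "gamma")
user_focus <- users_gamma %>%
  group_by(document) %>%
  mutate(focus = gamma - mean(gamma)) %>%
  arrange(desc(focus)) %>%
  slice(1) %>%
  ungroup() %>%
  rename(screen_name = document)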
# Compare the newest 5% of accounts with the rest, restricted to the 5% of
# accounts with the strongest focus on a single topic
nzpol_focus <- corpus %>%
  group_by(screen_name, account_created_at) %>%
  summarise(account_age = as.numeric(difftime(Sys.time(), account_created_at[1], units = "mins"))) %>%
  ungroup() %>%
  inner_join(user_focus, by = "screen_name") %>%
  mutate(age_ntile = ntile(account_age, 20),
         age_group = ifelse(age_ntile == 1, "newest_0.05", "older accounts")) %>%
  filter(ntile(focus, 20) == 20) %>%
  count(topic, age_group) %>%
  group_by(age_group) %>%
  mutate(percentage = n / sum(n)) %>%
  ungroup()
# Share of each age group's highly focused accounts, by topic
ggplot(nzpol_focus, aes(x = factor(topic), y = percentage, fill = age_group)) +
  geom_col(position = position_dodge()) +
  theme_minimal()