import glob
import json
from pymongo import MongoClient
# fill in hostname and port
HOST = "hostname"
PORT = 27017
client = MongoClient(HOST, PORT)
# fill in dbname and colname
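The preview stops before the database and collection are used. Given the `glob` and `json` imports, one plausible continuation is to parse a set of JSON files and insert them into the chosen collection. The sketch below is an assumption, not the author's code: the function names `read_json_files` and `insert_documents`, the file pattern, and the one-object-or-list-per-file layout are all hypothetical.

```python
import glob
import json


def read_json_files(pattern):
    """Parse every file matching `pattern` (assumed layout: one JSON object or a list per file)."""
    docs = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
        # A top-level list contributes each element; anything else counts as one document.
        if isinstance(data, list):
            docs.extend(data)
        else:
            docs.append(data)
    return docs


def insert_documents(docs, host, port, dbname, colname):
    """Insert parsed documents into a MongoDB collection (needs pymongo and a reachable server)."""
    from pymongo import MongoClient  # imported lazily so the parsing step runs without pymongo

    client = MongoClient(host, port)
    collection = client[dbname][colname]
    if docs:  # insert_many rejects an empty list
        collection.insert_many(docs)
    return len(docs)
```

A call such as `insert_documents(read_json_files("data/*.json"), HOST, PORT, "mydb", "mycol")` would tie the two steps together; the path and the names `mydb` and `mycol` are placeholders.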
# install tidyverse if you don't have it
# install.packages("tidyverse")
library(tidyverse)
## Read the csv from a URL
url <- "http://assets.datacamp.com/course/compfin/sbuxPrices.csv"
df <- read_csv(url)
## lubridate package to format the date
# if you get an error below, are you sure you have lubridate?
library(tidyverse); library(lubridate)
url <- "http://nodeassets.nbcnews.com/russian-twitter-trolls/tweets.csv"
tweets <- read_csv(url)
user.url <- "http://nodeassets.nbcnews.com/russian-twitter-trolls/users.csv"
users <- read_csv(user.url)
tweets %>%
  count(Date = as.Date(created_str)) %>%
---
title: "Analyzing Russian Trolls: Tidyverse & Text"
author: "Ryan Wesslen"
date: "2/21/2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, warning = FALSE)
```
labels <- sageLabels(ctmFit, n = 5)
topicsNames <- sapply(1:30, function(x) paste0(labels$marginal$frex[x,], collapse = " + "))
## issues:
##  - only gets replies within last ~7 days to the post due to public REST API limits
##  - counts don't necessarily align with total replies via browser, perhaps due to private accounts (?)
get_replies <- function(tweetid){
  # get status information for given tweet
  t <- rtweet::lookup_statuses(statuses = tweetid, token = ryan_rtweets)
  # use search API to find all tweets directed to the poster
  # and keep only replies to that status
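The filtering step described in the comments above (search for tweets directed at the poster, then keep only those replying to the given status) can be sketched independently of any Twitter client. This is a minimal illustration, not the author's implementation: the function name `keep_replies` and the sample records are hypothetical, and it assumes each tweet is a dictionary carrying the `in_reply_to_status_id_str` field from Twitter's v1.1 JSON.

```python
def keep_replies(candidates, status_id):
    """Keep only the tweets whose in_reply_to_status_id_str equals status_id."""
    target = str(status_id)  # compare as strings to avoid integer precision problems
    return [
        tweet for tweet in candidates
        if tweet.get("in_reply_to_status_id_str") == target
    ]


# Hypothetical search results: tweets that mention the poster.
candidates = [
    {"id_str": "11", "in_reply_to_status_id_str": "10", "text": "@poster agreed"},
    {"id_str": "12", "in_reply_to_status_id_str": None, "text": "@poster unrelated mention"},
    {"id_str": "13", "in_reply_to_status_id_str": "99", "text": "@poster reply to another post"},
]

replies = keep_replies(candidates, 10)
print([tweet["id_str"] for tweet in replies])  # prints ['11']
```

Because the candidates come from the search endpoint, the ~7 day limit noted in the issues list still applies: replies older than the search window never reach the filter.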
library(quanteda); library(ggrepel); library(tidyverse)
ggplotWordcloud <- function(df, maxWords = 50){
  corpus(df$text) %>%
    dfm(remove_punct = TRUE, remove = stopwords("english")) %>%
    topfeatures(n = maxWords) %>%
    # enframe() turns the named vector into word/value columns;
    # as.tibble() is deprecated and current tibble versions do not keep the names as row names
    enframe(name = "word", value = "value") %>%
    slice(1:maxWords) %>%
# Project 1: Charlotte Protest
library(tidyverse); library(lubridate); library(xts); library(dygraphs); library(quanteda)
# 10% sample
protestData <- readRDS("../Protest.RData") %>%
  mutate(time = paste0(substr(postedTime, 1, 13), "00:00 EDT")) %>% # convert to hourly
  mutate(time = ymd_hms(time)) %>%
  select(time, verb, postedTime, body)
# get daily counts
library(rtweet)
library(dplyr)
files <- list.files()
# fixed = TRUE matches a literal ".json"; as a regex, "." would match any character
files <- files[grep(".json", files, fixed = TRUE)]
getPoints <- function(file){
  parse_stream(file) %>%
    lat_lng("bbox_coords") %>% # keep bounding box coords
    filter(is_retweet == FALSE & !is.na(lat)) %>% # keep posts and point lat/longs
library(tidyverse); library(rtweet)
get_timeline(c("senthomtillis","thomtillis"), n = 3200) %>%
  mutate(keyword = grepl("bipartisan", text, ignore.case = TRUE)) %>%
  filter(keyword) %>%
  group_by(screen_name) %>%
  ts_plot("3 months", trim = 1L) +
  labs(x = "Date", y = NULL, title = "Thom Tillis' Tweets mentioning `bipartisan*` by Twitter account") +
  theme(text = element_text(size = 12),
        legend.position = c(0.3, 0.5))