@IanHopkinson
Last active December 21, 2015 09:09
R used to generate views of ScraperWiki's Twitter Search tool
#!/usr/bin/Rscript
# Script to create r-view 2013-08-14
# Ian Hopkinson
source('scraperwiki_utils.R')
NumberOfTweets <- function() {
    # Count every row in the tweets table
    query = 'select count(*) from tweets'
    number = ScraperWikiSQL(query)
    return(number)
}
TweetsHistogram <- function() {
    library("ggplot2")
    library("scales")
    #threshold = 20
    bin = 60 # Size of the time bins in seconds
    # Only the earliest 40000 tweets are fetched, oldest first
    query = 'select created_at from tweets order by created_at limit 40000'
    dates_raw = ScraperWikiSQL(query)
    # Timestamps end in a literal +00:00 offset, so parse them as UTC
    # rather than in the machine's local time zone
    posix = strptime(dates_raw$created_at, "%Y-%m-%d %H:%M:%S+00:00", tz = "UTC")
    num = as.POSIXct(posix)
    Dates = data.frame(num)
    # Histogram of tweet times; equivalent to qplot(num, data = Dates, binwidth = bin)
    p = ggplot(Dates, aes(x = num)) + geom_histogram(binwidth = bin)
    # This gets us out the histogram count values (not used further below)
    built = ggplot_build(p)$data[[1]]
    counts = built$count
    timeticks = built$x
    # Calculate limits, method 1 - simple min and max of range
    start = min(num)
    finish = max(num)
    minor = waiver() # Default breaks
    major = waiver()
    p = p + scale_x_datetime(limits = c(start, finish),
                             breaks = major, minor_breaks = minor)
    # Rotate the x-axis labels so that long date strings do not overlap
    p = p + theme_bw() + theme(axis.text.x = element_text(angle = 45,
                                                          hjust = 1,
                                                          vjust = 1))
    p = p + xlab('Date') + ylab('Tweets per minute') +
        ggtitle('Tweets per minute (Limited to 40000 tweets in total)')
    return(p)
}
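
# Illustrative sketch, not part of the original script: a base-R way to
# check the per-minute counts that TweetsHistogram() plots, without ggplot2.
# ExampleMinuteCounts is a hypothetical helper name. It assumes created_at
# is a character vector in the same "%Y-%m-%d %H:%M:%S+00:00" format used
# above. Counts here are grouped by calendar minute, so they may differ
# slightly from the histogram, whose 60-second bins need not start on a
# minute boundary.
ExampleMinuteCounts <- function(created_at) {
    posix = strptime(created_at, "%Y-%m-%d %H:%M:%S+00:00", tz = "UTC")
    # Drop the seconds so that tweets in the same minute share one label
    minute = format(posix, "%Y-%m-%d %H:%M")
    as.data.frame(table(minute = minute), responseName = "tweets")
}
# Example with made-up timestamps (two tweets at 10:00, one at 10:01):
example_times = c("2013-08-14 10:00:05+00:00",
                  "2013-08-14 10:00:40+00:00",
                  "2013-08-14 10:01:10+00:00")
example_counts = ExampleMinuteCounts(example_times)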