@Protonk
Created April 12, 2011 17:53
just the facts ma'am
library(XML)
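# There are probably smarter ways to do this, but this works well enough.
url.prefix <- "http://www.priceofweed.com/prices/United%20States/"
url.suffix <- ".html"
# state.name is R's built-in vector of the 50 state names. Spaces are encoded
# as %20 to match the prefix (e.g. "New York" becomes "New%20York").
state.uri <- paste(url.prefix, gsub(" ", "%20", state.name, fixed = TRUE), url.suffix, sep = "")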
# This function is quick and dirty (emphasis on dirty). It works as well
# as it does because priceofweed.com has very structured web pages: the first
# table on each page is the summary (trimmed mean) of reported prices.
# No consistency checks are made on the scraped values, so the function relies
# on that layout staying fixed; no problems came up when this was written.
# `sleep` is the pause, in seconds, between successive page requests.
weed.grab <- function(sleep = 1) {
  weed.name <- c("HQ.price", "MQ.price", "LQ.price", "HQ.n", "MQ.n", "LQ.n")
  weed.mat <- matrix(0, 50, 6, dimnames = list(state.name, weed.name))
  for (i in 1:50) {
    # Flatten the first table on the state's page into a character vector
    state.int <- unlist(readHTMLTable(doc = state.uri[i], as.data.frame = FALSE)[[1]])
    # Elements 6:8 hold the prices (drop the leading "$"); 10:12 fill the .n columns
    weed.mat[i, ] <- c(as.numeric(substring(state.int[6:8], 2)), as.numeric(state.int[10:12]))
    Sys.sleep(sleep)
  }
  weed.prices.df <- as.data.frame(weed.mat)
  names(weed.prices.df) <- c("High Quality", "Med. Quality", "Low Quality", "HQ.n", "MQ.n", "LQ.n")
  return(weed.prices.df)
}
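
# A minimal offline sketch of the parsing step used inside weed.grab, so the
# string handling can be checked without hitting the site. The names
# sample.cells and parse.cells are hypothetical, and the values below are
# made up for illustration; they are not real prices or counts.
sample.cells <- c("$300.50", "$200.25", "$100.00", "120", "80", "15")
parse.cells <- function(cells) {
  # substring(x, 2) drops the leading "$" before the numeric conversion
  c(as.numeric(substring(cells[1:3], 2)), as.numeric(cells[4:6]))
}
parsed <- parse.cells(sample.cells)
# `parsed` is a numeric vector of length 6, in the same order as one row of weed.mat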