Cheng-Jun Wang chengjun

@chengjun
chengjun / iworkfortheinternet.R
Created December 15, 2011 04:03 — forked from sckott/iworkfortheinternet.R
Code for searching Twitter using the twitteR #rstats package.
require(plyr); require(stringr); require(ggplot2); require(lubridate); require(twitteR)
datout_1 <- searchTwitter("I work for the internet", n = 1500, since='2011-11-11', until='2011-12-12')
datout_2 <- searchTwitter("I work for the internet", n = 1500, since='2011-11-13', until='2011-12-14')
datoutdf <- ldply(c(datout_1, datout_2), function(x) x$toDataFrame(), .progress="text")
actual <- grep("I work for the internet", datoutdf[,1], ignore.case=TRUE)
datoutdf2 <- datoutdf[actual,]
datoutdf2$newtime <- round_date(datoutdf2[,4], "hour")
chengjun / download.py
Created January 6, 2012 02:35 — forked from thomasjensen/download.py
Download blog posts from R-bloggers
from BeautifulSoup import BeautifulSoup
import mechanize
import time
url = "http://www.r-bloggers.com/"
br = mechanize.Browser()
page = br.open(url)
chengjun / yen and us10y.r
Created January 15, 2012 01:47 — forked from timelyportfolio/yen and us10y.r
yen and us10y
require(quantmod)
#get Japanese Yen daily from Fred http://research.stlouisfed.org/fred2
getSymbols("DEXJPUS",src="FRED")
#get US 10y Yield from Fred
getSymbols("DGS10", src="FRED")
Yen10y <- na.omit(merge(DEXJPUS,DGS10))
#define colors
chengjun / archiveTwitter.py
Created January 30, 2012 03:47 — forked from mjbommar/archiveTwitter.py
Archive tweets from a search term going backwards through search.
'''
@author Michael J Bommarito II
@date Feb 26, 2011
@license Simplified BSD, (C) 2011.
This script demonstrates how to use Python to archive historical tweets.
'''
import codecs
import csv
chengjun / archiveTwitter.py
Created January 30, 2012 06:19 — forked from mjbommar/archiveTwitter.py
Archive tweets from a search term going backwards through search.
'''
@author Michael J Bommarito II
@date Feb 26, 2011
@license Simplified BSD, (C) 2011.
This script demonstrates how to use Python to archive historical tweets.
'''
import codecs
import csv
chengjun / archiveHashtag.r
Created January 30, 2012 06:21 — forked from mjbommar/archiveHashtag.r
Archive a twitter hashtag.
#@author Michael J Bommarito
#@contact [email protected]
#@date Feb 20, 2011
#@ip Simplified BSD, (C) 2011.
# This is a simple example of an R script that will retrieve
# public tweets from a given hashtag.
library(RJSONIO)
# This function loads stored tag data to determine the current max_id.
chengjun / Thread Network Analysis
Created January 30, 2012 06:26
Thread Network Analysis
CC = Import["D://Mathematica//bbs.xls"]
DD = CC[[1]]
Tally[Transpose[DD][[4]]] // MatrixForm
ListLogLogPlot[Sort[Transpose[Tally[Transpose[DD][[4]]]][[2]], Greater]]
tt = Union[Transpose[DD][[1]]];
First[CC] // TableForm;
thread[n_] := Select[DD, #[[1]] == n &]
k = thread[2]
First[k][[3]]
chengjun / japan_trade_yen.r
Created March 8, 2012 23:18 — forked from timelyportfolio/japan_trade_yen.r
japan trade and yen
require(quantmod)
#get data from Japan Ministry of Finance website in csv form
url = "http://www.customs.go.jp/toukei/suii/html/data/d41ma.csv"
japantrade <- read.csv(url,skip=2,stringsAsFactors=FALSE)
#start cleaning data and getting in xts form
japantrade.xts <- japantrade[2:NROW(japantrade),]
#remove trailing 0 for future data
japantrade.xts <- japantrade.xts[which(japantrade.xts[,2]!=0),]
chengjun / mixing_matrix.R
Created April 18, 2012 01:43 — forked from gweissman/mixing_matrix.R
Calculate mixing matrix in igraph by vertex characteristic
# calculate the mixing matrix of an igraph graph object 'mygraph', by some vertex attribute 'attrib'
# set use.density=FALSE (the default is TRUE) to return a matrix with the raw number of edges rather than density
mixmat <- function(mygraph, attrib, use.density=TRUE) {
require(igraph)
# get unique list of characteristics of the attribute
attlist <- sort(unique(get.vertex.attribute(mygraph,attrib)))
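The preview above stops before the matrix is filled in. As a rough illustration of the idea only (not the gist's R code), here is a minimal Python sketch of a mixing matrix by vertex attribute. The edge-list and dict inputs, the function name `mixing_matrix`, and the reading of "density" as edge count divided by the total number of edges are all assumptions made for this example.

```python
from collections import Counter


def mixing_matrix(edges, attrib, use_density=True):
    """Tabulate undirected edges by the attribute values of their endpoints.

    edges  -- iterable of (u, v) vertex pairs (hypothetical input format)
    attrib -- dict mapping each vertex to its attribute value
    Returns (levels, matrix), where matrix[i][j] refers to edges joining
    a vertex with value levels[i] to a vertex with value levels[j].
    """
    # Sorted unique attribute values, mirroring sort(unique(...)) in the R preview.
    levels = sorted(set(attrib.values()))
    counts = Counter()
    total = 0
    for u, v in edges:
        a, b = attrib[u], attrib[v]
        counts[(a, b)] += 1
        if a != b:
            # Undirected edge: record it in both off-diagonal cells.
            counts[(b, a)] += 1
        total += 1
    if use_density and total:
        # Assumption: density means count / total number of edges.
        return levels, [[counts[(a, b)] / total for b in levels] for a in levels]
    return levels, [[counts[(a, b)] for b in levels] for a in levels]


if __name__ == "__main__":
    toy_edges = [(1, 2), (2, 3), (3, 4)]
    toy_attrib = {1: "a", 2: "a", 3: "b", 4: "b"}
    print(mixing_matrix(toy_edges, toy_attrib, use_density=False))
```

Passing `use_density=False` returns raw edge counts, matching the option described in the comments above; the original function may normalize differently.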
chengjun / caschools-analysis.rmd
Created May 20, 2012 01:15 — forked from jeromyanglim/caschools-analysis.rmd
California schools analysis demonstrating use of R Markdown
`r opts_chunk$set(cache=TRUE)`
This is a quick set of analyses of the California Test Score dataset. The post was produced using R Markdown in RStudio 0.96. The main purpose of this post is to provide a case study of using R Markdown to prepare a quick reproducible report. It provides examples of using plots, output, in-line R code, and markdown. The post is designed to be read alongside the R Markdown source code, which is available as a gist on GitHub.
<!-- more -->
### Preliminaries
* This post builds on my earlier post, which provided a guide for [Getting Started with R Markdown, knitr, and RStudio 0.96](http://jeromyanglim.blogspot.com/2012/05/getting-started-with-r-markdown-knitr.html)
* The dataset analysed comes from the `AER` package which is an accompaniment to the book [Applied Econometrics with R](http://www.amazon.com/Applied-Econometrics-R-Use/dp/0387773169) written by [Christian Kleiber](http://wwz.unibas.ch/personen/profil/person/kleiber/) and [Achim Zeileis](http://eeecon.uibk.ac.at/~zeileis/