Cheng-Jun Wang chengjun

Associate Professor at Nanjing University

chengjun / iworkfortheinternet.R

Created December 15, 2011 04:03 — forked from sckott/iworkfortheinternet.R

Code for searching Twitter using the twitteR #rstats package.

	require(plyr); require(stringr); require(ggplot2); require(lubridate); require(twitteR)

	datout_1 <- searchTwitter("I work for the internet", n = 1500, since='2011-11-11', until='2011-12-12')
	datout_2 <- searchTwitter("I work for the internet", n = 1500, since='2011-11-13', until='2011-12-14')
	datoutdf <- ldply(c(datout_1, datout_2), function(x) x$toDataFrame(), .progress="text")

	actual <- grep("I work for the internet", datoutdf[,1], ignore.case=TRUE)
	datoutdf2 <- datoutdf[actual,]

	datoutdf2$newtime <- round_date(datoutdf2[,4], "hour")
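The filter-then-round step in the R preview above can be sketched in Python with only the standard library. The tweet records and the helper names `matches_phrase` and `round_to_hour` are hypothetical and not part of the original gist; rounding to the nearest hour with the half hour rounding up is an assumption about how `round_date(x, "hour")` behaves.

```python
from datetime import datetime, timedelta

PHRASE = "I work for the internet"

def matches_phrase(text, phrase=PHRASE):
    """Case-insensitive substring match, similar in spirit to grep(ignore.case=TRUE)."""
    return phrase.lower() in text.lower()

def round_to_hour(ts):
    """Round a datetime to the nearest hour (assumption: half hours round up)."""
    shifted = ts + timedelta(minutes=30)
    return shifted.replace(minute=0, second=0, microsecond=0)

# Hypothetical records standing in for the rows of datoutdf: (text, created).
tweets = [
    ("i work for the internet, apparently", datetime(2011, 12, 1, 9, 12)),
    ("unrelated tweet", datetime(2011, 12, 1, 9, 40)),
    ("I Work For The Internet!", datetime(2011, 12, 1, 9, 47)),
]

# Keep only tweets that contain the phrase, then attach the rounded timestamp.
kept = [(text, round_to_hour(ts)) for text, ts in tweets if matches_phrase(text)]
```

Rounding timestamps this way makes it straightforward to count tweets per hour afterwards.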

chengjun / download.py

Created January 6, 2012 02:35 — forked from thomasjensen/download.py

Download blog posts from R-bloggers

	from BeautifulSoup import BeautifulSoup
	import mechanize
	import time

	url = "http://www.r-bloggers.com/"

	br = mechanize.Browser()

	page = br.open(url)

chengjun / yen and us10y.r

Created January 15, 2012 01:47 — forked from timelyportfolio/yen and us10y.r

yen and us10y

	require(quantmod)

	#get Japanese Yen daily from Fred http://research.stlouisfed.org/fred2
	getSymbols("DEXJPUS",src="FRED")
	#get US 10y Yield from Fred
	getSymbols("DGS10", src="FRED")

	Yen10y <- na.omit(merge(DEXJPUS,DGS10))

	#define colors
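The `na.omit(merge(...))` step above aligns two daily series by date and keeps only dates where both have a value. A minimal Python sketch of that idea follows, assuming each series is a plain dict keyed by date string; the function name `merge_drop_missing` and the sample numbers are placeholders for illustration, not real FRED data.

```python
def merge_drop_missing(a, b):
    """Outer-join two {date: value} series, then drop rows with a missing value."""
    dates = sorted(set(a) | set(b))
    merged = [(d, a.get(d), b.get(d)) for d in dates]
    return [row for row in merged if row[1] is not None and row[2] is not None]

# Placeholder values only; None marks a missing observation.
yen = {"2012-01-03": 76.67, "2012-01-04": 76.72, "2012-01-05": None}
us10y = {"2012-01-03": 1.97, "2012-01-05": 2.02, "2012-01-06": 1.98}

yen10y = merge_drop_missing(yen, us10y)
```

Only dates present in both series with non-missing values survive, which is what makes the two columns directly comparable.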

chengjun / archiveTwitter.py

Created January 30, 2012 03:47 — forked from mjbommar/archiveTwitter.py

Archive tweets from a search term going backwards through search.

	'''
	@author Michael J Bommarito II
	@date Feb 26, 2011
	@license Simplified BSD, (C) 2011.

	This script demonstrates how to use Python to archive historical tweets.
	'''

	import codecs
	import csv

chengjun / archiveTwitter.py

Created January 30, 2012 06:19 — forked from mjbommar/archiveTwitter.py

Archive tweets from a search term going backwards through search.

	'''
	@author Michael J Bommarito II
	@date Feb 26, 2011
	@license Simplified BSD, (C) 2011.

	This script demonstrates how to use Python to archive historical tweets.
	'''

	import codecs
	import csv

chengjun / archiveHashtag.r

Created January 30, 2012 06:21 — forked from mjbommar/archiveHashtag.r

Archive a twitter hashtag.

	#@author Michael J Bommarito
	#@contact [email protected]
	#@date Feb 20, 2011
	#@ip Simplified BSD, (C) 2011.
	# This is a simple example of an R script that will retrieve
	# public tweets from a given hashtag.

	library(RJSONIO)

	# This function loads stored tag data to determine the current max_id.

chengjun / Thread Network Analysis

Created January 30, 2012 06:26

Thread Network Analysis

	CC = Import["D://Mathematica//bbs.xls"]
	DD = CC[[1]]
	Tally[Transpose[DD][[4]]] // MatrixForm
	ListLogLogPlot[Sort[Transpose[Tally[Transpose[DD][[4]]]][[2]], Greater]]
	tt = Union[Transpose[DD][[1]]];
	First[CC] // TableForm;
	thread[n_] := Select[DD, #[[1]] == n &]

	k = thread[2]
	First[k][[3]]
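The Mathematica preview above tallies the fourth column of the imported table and selects all rows that belong to one thread. A rough Python equivalent is sketched below; the sample rows and the meaning of each column are assumptions, since the spreadsheet itself is not shown.

```python
from collections import Counter

# Hypothetical rows: (thread_id, post_no, author, replied_to).
rows = [
    (1, 1, "alice", "alice"),
    (1, 2, "bob", "alice"),
    (2, 1, "carol", "carol"),
    (2, 2, "alice", "carol"),
    (2, 3, "bob", "carol"),
]

# Like Tally[Transpose[DD][[4]]]: count occurrences of each value in column 4.
tally = Counter(row[3] for row in rows)

# Counts sorted in descending order, the input to the log-log plot.
sorted_counts = sorted(tally.values(), reverse=True)

def thread(n):
    """Like thread[n_]: select the rows whose first column equals n."""
    return [row for row in rows if row[0] == n]

k = thread(2)
first_author = k[0][2]  # like First[k][[3]]
```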

chengjun / japan_trade_yen.r

Created March 8, 2012 23:18 — forked from timelyportfolio/japan_trade_yen.r

japan trade and yen

	require(quantmod)

	#get data from Japan Ministry of Finance website in csv form
	url = "http://www.customs.go.jp/toukei/suii/html/data/d41ma.csv"
	japantrade <- read.csv(url,skip=2,stringsAsFactors=FALSE)

	#start cleaning data and getting in xts form
	japantrade.xts <- japantrade[2:NROW(japantrade),]
	#remove trailing 0 for future data
	japantrade.xts <- japantrade.xts[which(japantrade.xts[,2]!=0),]

chengjun / mixing_matrix.R

Created April 18, 2012 01:43 — forked from gweissman/mixing_matrix.R

Calculate mixing matrix in igraph by vertex characteristic

	# calculate the mixing matrix of an igraph graph object 'mygraph', by some vertex attribute 'attrib'
	# set use.density=FALSE (the default is TRUE) to return a matrix with the raw number of edges rather than density

	mixmat <- function(mygraph, attrib, use.density=TRUE) {

	require(igraph)

	# get unique list of characteristics of the attribute
	attlist <- sort(unique(get.vertex.attribute(mygraph,attrib)))
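Since the preview stops before the matrix is built, here is a minimal Python sketch of one way to compute a mixing matrix. It is not the gist's implementation: the function name `mixing_matrix`, the edge-list input, and the choice to define density as the cell count divided by the total number of edges are all assumptions made for illustration.

```python
from collections import Counter

def mixing_matrix(edges, attrs, use_density=True):
    """Count edges between attribute levels.

    edges: list of (source, target) vertex pairs.
    attrs: dict mapping each vertex to its attribute value.
    Returns (levels, matrix), where matrix[i][j] counts edges from a vertex
    with levels[i] to a vertex with levels[j]. With use_density=True each
    count is divided by the number of edges (an assumed definition).
    """
    levels = sorted(set(attrs.values()))
    counts = Counter((attrs[u], attrs[v]) for u, v in edges)
    matrix = [[counts[(a, b)] for b in levels] for a in levels]
    if use_density and edges:
        total = len(edges)
        matrix = [[cell / total for cell in row] for row in matrix]
    return levels, matrix

# Hypothetical three-vertex graph with a two-level attribute.
edges = [("a", "b"), ("b", "c"), ("c", "a")]
attrs = {"a": "x", "b": "x", "c": "y"}

levels, raw = mixing_matrix(edges, attrs, use_density=False)
_, density = mixing_matrix(edges, attrs)
```

This version treats edges as directed; for an undirected graph one would typically symmetrize the counts before normalizing.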

chengjun / caschools-analysis.rmd

Created May 20, 2012 01:15 — forked from jeromyanglim/caschools-analysis.rmd

California schools analysis demonstrating use of R Markdown

	`r opts_chunk$set(cache=TRUE)`

	This is a quick set of analyses of the California Test Score dataset. The post was produced using R Markdown in RStudio 0.96. The main purpose of this post is to provide a case study of using R Markdown to prepare a quick reproducible report. It provides examples of using plots, output, in-line R code, and markdown. The post is designed to be read alongside the R Markdown source code, which is available as a gist on GitHub.

	<!-- more -->

	### Preliminaries
	* This post builds on my earlier post which provided a guide for [Getting Started with R Markdown, knitr, and RStudio 0.96](http://jeromyanglim.blogspot.com/2012/05/getting-started-with-r-markdown-knitr.html)
	* The dataset analysed comes from the `AER` package which is an accompaniment to the book [Applied Econometrics with R](http://www.amazon.com/Applied-Econometrics-R-Use/dp/0387773169) written by [Christian Kleiber](http://wwz.unibas.ch/personen/profil/person/kleiber/) and [Achim Zeileis](http://eeecon.uibk.ac.at/~zeileis/