benmarwick’s gists

benmarwick / simple-table-markdown.r

Created December 4, 2013 23:19

How to make a simple table using R markdown that includes a caption.

	```{r table-simple, echo=FALSE, message=FALSE, warning=FALSE, results='asis'}
	require(pander)
	panderOptions('table.split.table', Inf)
	set.caption("My great data")
	my.data <- "
	Tables | Are | Cool
	col 3 is | right-aligned | $1600
	col 2 is | centered | $12
	zebra stripe | are neat | $1"
	df <- read.delim(textConnection(my.data), header=FALSE, sep="|", strip.white=TRUE, stringsAsFactors=FALSE)

benmarwick / uw_archy_phd_theses.Rmd

Last active January 26, 2017 02:02

Basic analysis of data on UW Archaeology PhD theses of the last ten years

	# Basic analysis of UW Archaeology PhD theses

	There are no clear guidelines about the length or structure of a PhD thesis in
	archaeology at UW. To get a sense of what is typical, we decided to make a quick
	study of the norms evident in PhD theses produced in the last ten years.

	## Methods

	We counted the total number of pages for each thesis and the number of pages per chapter. We also noted the year each thesis was passed so that we could examine trends over time. We entered the data into a Google Sheet.

benmarwick / oxcal_formatter.r

Created December 11, 2013 19:22

Uses R to format radiocarbon dates for calibration by OxCal

	# read data in, three columns Name = lab code, Date = radiocarbon age, Uncertainty = error
	dates <- read.csv('F:/My Documents/My UW/Research/1308 Sulawesi/Dates/TalimbueDatesForOxcal-2.csv', stringsAsFactors = FALSE)

	# construct OxCal format
	oxcal_format <- paste0('R_Date(\"', gsub("^\\s+|\\s+$", "", dates$Name), '\",', dates$Date, ',', dates$Uncertainty, ');')
	# inspect
	cat(oxcal_format)

	# write formatted dates to text file
	write.table(oxcal_format, file = 'oxcal_format.txt', row.names = FALSE, col.names = FALSE, quote = FALSE)
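Because the CSV path in this gist is specific to the author's machine, here is a minimal sketch of the same formatting step run on a small made-up data frame. The lab codes, ages, and uncertainties are hypothetical values chosen only to show the shape of the output; the column names `Name`, `Date`, and `Uncertainty` follow the comment in the gist.

```r
# Hypothetical dates standing in for the CSV file (values are invented)
dates <- data.frame(
  Name = c(" Lab-0001 ", "Lab-0002"),  # first code has stray whitespace on purpose
  Date = c(3500, 12040),
  Uncertainty = c(30, 45),
  stringsAsFactors = FALSE
)

# Trim leading/trailing whitespace from the lab code, then build one
# R_Date("name",age,error); statement per row
oxcal_format <- paste0(
  'R_Date("', gsub("^\\s+|\\s+$", "", dates$Name), '",',
  dates$Date, ',', dates$Uncertainty, ');'
)

# inspect, one statement per line
cat(oxcal_format, sep = "\n")
```

Each element of `oxcal_format` is a single `R_Date(...)` statement, so the vector can be written out line by line with `write.table()` as in the gist.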

benmarwick / aj-temper-levels.Rmd

Last active December 31, 2015 02:28

	```{r}
	# bring data into R
	require(gdata) # must have Perl installed first: http://strawberryperl.com/
	data <- read.xls("F:/My Documents/My UW/Teaching/Graduate Students/Amy Jordan/shell vs grit charts.xls", sheet = 'main', stringsAsFactors = FALSE)
	```


	```{r}
	# check variables
	unique(data$Level)

benmarwick / correlation-plots.R

Created January 4, 2014 05:51

	#Title: An example of the correlation of x and y for various distributions of (x,y) pairs
	#Tags: Mathematics; Statistics; Correlation
	#Author: Denis Boigelot
	#Packages needed: mvtnorm (rmvnorm), #RSVGTipsDevice (devSVGTips)
	#How to use: output()
	#
	#This is a translation into R of Mathematica 6 code by Imagecreator.
	# from http://en.wikipedia.org/wiki/File:Correlation_examples2.svg

	library(mvtnorm)

benmarwick / shakespeare_plays_genres.Rmd

Last active October 30, 2024 22:14

Quick and basic cluster analysis of Shakespeare's plays using R and full text from http://shakespeare.mit.edu/

	Quick and dirty look at Shakespeare's plays
	====

	Introduction
	----
	I was recently inspired by the posts of Andrew Collier ([1](http://www.exegetic.biz/blog/2013/09/text-mining-the-complete-works-of-william-shakespeare/) and [2](http://www.exegetic.biz/blog/2013/09/clustering-the-words-of-william-shakespeare/)) and an earlier post by [Matt Jockers](http://www.matthewjockers.net/2009/02/13/machine-classifying-novels-and-plays-by-genre/) to take a recreational look at the plays of Shakespeare.

	Motivated by Jockers, the specific topic I was interested in is the genres of the plays. For example, are the genres discrete or is there lots of overlap? Are the genres equal in variation or is one genre very focused and another very diverse? What are the key attributes that define the genres? And can I reproduce Jockers' use of high frequency words to identify genres? Related to Jockers' work on high frequency words is an earlier study by [Brainerd (1979)](http://www.jstor.org/stable/30207229) who used pronouns

benmarwick / docs-per-topic.rmd

Last active August 29, 2015 13:56

How to find the topic with the highest proportion in a set of documents (after a topic model has been generated with the R package mallet)

	Which documents belong to each topic?

	Documents don't belong to a single topic; there is a distribution of topics
	over each document.

	But we can find the topic with the highest proportion for each document.
	That top-ranking topic might be called the 'topic' for the document, but note
	that all docs have all topics in varying proportions.

	Assume that we start with `topic_docs` from the output of the mallet package
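The idea in this gist can be sketched on toy data as follows. This is only an illustration, not the author's code: it assumes `topic_docs` is a numeric matrix with one row per topic and one column per document, which may not match the orientation of the real object produced from mallet, and the proportions are invented.

```r
# Hypothetical stand-in for `topic_docs`: rows are topics, columns are
# documents, and each column sums to 1 (matrix() fills column by column)
topic_docs <- matrix(
  c(0.7, 0.2, 0.1,   # doc_1
    0.1, 0.3, 0.6,   # doc_2
    0.2, 0.5, 0.3),  # doc_3
  nrow = 3,
  dimnames = list(paste0("topic_", 1:3), paste0("doc_", 1:3))
)

# For each document (column), the row index of the largest proportion
top_index <- apply(topic_docs, 2, which.max)

# Collect the top-ranking topic and its proportion for each document
result <- data.frame(
  document   = colnames(topic_docs),
  top_topic  = rownames(topic_docs)[top_index],
  proportion = unname(apply(topic_docs, 2, max)),
  stringsAsFactors = FALSE
)

print(result)
```

Keeping the `proportion` column alongside the label makes it easy to see when a document's top topic is only marginally ahead of the others.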

benmarwick / gist:9204077

Last active August 29, 2015 13:56

RCloud - https://github.com/att/rcloud - setup on ubuntu

	## Shell:

	git clone --recursive https://github.com/cscheid/rcloud.git

	sudo apt-get install libxt-dev libcurl4-openssl-dev libcairo2-dev libreadline-dev git

	Create github app according to instructions here: https://github.com/att/rcloud

	Edit conf/rcloud.conf according to instructions here: https://github.com/att/rcloud

benmarwick / test.R

Last active March 23, 2022 02:29

Convert a folder of text files into a single CSV file with one column for the file names and one column of the text of the file. A function in R.

	# test it by creating some small text files to run the function on

	txt <- c("here is", "some text", "to test", "this function with", "'including a leading quote", '"and another leading quote')
	# make text files
	dir.create("testdir")
	for (i in seq_along(txt)) {
	  writeLines(txt[i], paste0("testdir/outfile-", i, ".txt"))
	}

	# run the function and then look in the CSV file that is produced.
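The function itself is not shown in this preview. As a rough sketch of what such a conversion could look like, the hypothetical helper `txts_to_csv()` below (a name invented here, not the author's function) reads every `.txt` file in a folder and writes a two-column CSV of file names and texts. It works in a temporary directory so that it does not touch the working directory.

```r
# Hypothetical helper: read all .txt files in `dir` and write them to a
# CSV with one column of file names and one column of text
txts_to_csv <- function(dir, out_file) {
  files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
  # collapse multi-line files into a single string per file
  texts <- vapply(
    files,
    function(f) paste(readLines(f, warn = FALSE), collapse = "\n"),
    character(1)
  )
  out <- data.frame(
    file = basename(files),
    text = unname(texts),
    stringsAsFactors = FALSE
  )
  write.csv(out, out_file, row.names = FALSE)
  invisible(out)
}

# Small test files, including one with a leading quote
txt <- c("here is", "some text", "'including a leading quote")
test_dir <- tempfile("testdir-")
dir.create(test_dir)
for (i in seq_along(txt)) {
  writeLines(txt[i], file.path(test_dir, paste0("outfile-", i, ".txt")))
}

# Run the helper, then read the CSV back to check the round trip
csv_file <- file.path(tempdir(), "texts.csv")
txts_to_csv(test_dir, csv_file)
back <- read.csv(csv_file, stringsAsFactors = FALSE)
print(back)
```

Reading the CSV back with `read.csv()` is a quick way to confirm that texts containing quote characters survive the round trip.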

benmarwick / csv2txts.R

Last active February 9, 2021 15:06

Convert a single CSV file (one text per row) into separate text files. A function in R.

	#' Making several text files from a single CSV file
	#'
	#' Convert a single CSV file (one text per row) into
	#' separate text files. A function in R.
	#'
	#' To use this function for the first time run:
	#' install.packages("devtools")
	#' then thereafter you just need to load the function
	#' from GitHub like so:
	#' library(devtools) # windows users need Rtools installed, mac users need XCode installed

Ben Marwick benmarwick