@agoldst
agoldst / woolf-bennett.R
Created February 8, 2013 17:25
quickie plots for the Woolf-Bennett ULTIMATE SHOWDOWN
# the metadata.R script (for read.citations()) is part of
# this git repository:
# http://github.com/agoldst/dfr-analysis
# So change this path as needed
source("~/Developer/dfr-analysis/metadata.R")
bennett.df <- read.citations("bennett.csv")
woolf.df <- read.citations("woolf.csv")
# Now bind the two together, using columns to flag AB and VW hits
@agoldst
agoldst / threepercent-post.Rmd
Last active December 15, 2015 01:59
calculations going into a blogpost about the Three Percent translation counts
As long promised, here are some links to the data on U.S. literary translation that I showed in a table during our discussion of Casanova.
By kind permission of Chad Post, I can make available an aggregate data file of all the literature translations catalogued by Three Percent. I've decided to put the data file, together with some scripts and information about the munging, in a [github repository](http://github.com/agoldst/threepercent). The data consists of a single CSV file with one line for each title: [all_titles.csv](https://github.com/agoldst/threepercent/blob/master/all_titles.csv) ([Wikipedia on CSV format](http://en.wikipedia.org/wiki/Comma-separated_values)).
I have produced this by exporting the first "sheet" of each of the five yearly spreadsheets available at [the Three Percent Translation Database](http://www.rochester.edu/College/translation/threepercent/index.php?s=database) and then combining the files. According to Chad Post, updated data will be available soon, at which point I can reprodu
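
The combining step described above might look roughly like the following. This is a minimal sketch, not the munging script from the repository: the file names (`2008.csv`, `2009.csv`) and the columns (`title`, `language`, `year`) are invented for illustration, and the toy data stands in for the real yearly exports.

```r
# Sketch only: file names, column names, and data are hypothetical
# stand-ins, not the actual Three Percent exports.
tmp <- tempfile("threepercent")
dir.create(tmp)

# Toy versions of the yearly sheets, each exported to its own CSV
write.csv(data.frame(title=c("A", "B"), language=c("French", "Spanish")),
          file.path(tmp, "2008.csv"), row.names=FALSE)
write.csv(data.frame(title="C", language="German"),
          file.path(tmp, "2009.csv"), row.names=FALSE)

# Read each yearly file, tagging every row with the year from the file name
files <- list.files(tmp, pattern="\\.csv$", full.names=TRUE)
yearly <- lapply(files, function (f) {
    frm <- read.csv(f, stringsAsFactors=FALSE)
    frm$year <- as.integer(sub("\\.csv$", "", basename(f)))
    frm
})

# Stack the per-year frames into a single table, one row per title
all_titles <- do.call(rbind, yearly)
write.csv(all_titles, file.path(tmp, "all_titles.csv"), row.names=FALSE)
```

Keeping the year as a column, rather than leaving it implicit in the file name, is what lets the combined file be tabulated by year afterwards.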
@agoldst
agoldst / html_clean.hs
Created April 21, 2013 16:12
a clean-up operation on tex4ht/biblatex output
import Text.Pandoc
{-
This script uses the Pandoc library to do two transformations
needed on the way from my mixed markdown/LaTeX syllabus sources to a
single HTML file:
1. Transform the slightly garbled html produced by tex4ht from LaTeX
source containing a biblatex bibliography by getting rid of definition
# for this file, clone http://github.com/agoldst/dfr-analysis
source("~/Developer/dfr-analysis/metadata.R")
library(plyr)
library(stringr)
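# Read a two-column CSV of word,weight (header row skipped) and return
# the weights as an integer vector named by word
wordcounts_v <- function (f) {
    frm <- scan(f, what=list(word=character(), weight=integer()),
                sep=",", skip=1, quiet=TRUE)
    result <- frm$weight
    names(result) <- frm$word
    result
}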
@agoldst
agoldst / emp2.md
Last active October 26, 2024 07:37
Markdown etc. demo: source for a blogpost on…markdown.

Empowerment Part II

The actual "empowerment" (modest but real) comes in getting a more detailed understanding of the way the systems we already use handle text, and in learning more ways to manipulate that text, beyond the confines of any single program. The business of plain-text-slinging, a minor craft on its own, nonetheless forms a natural starting point for thinking more deeply about analyzing digitized texts, expressing yourself in "code" of various kinds, and composing in the digital medium.

Downloads

In order to do the workshop on your own, first install Pandoc and LaTeX (links above). Komodo Edit is optional; any text editor will do, though I'll occasionally refer to details in Komodo (menu items, etc.) that may be slightly different in other editors. See below for text editor suggestions.

The handout from the workshop (PDF)

@agoldst
agoldst / emp2-handout.md
Created November 24, 2013 00:20
Markdown source for a handout on markdown, HTML, and LaTeX, typesettable with pandoc + xelatex.

% DH@RU Workshop: Empowerment Part II
% Andrew Goldstone ([email protected])
% November 20, 2013

Markdown

Text conventions

*emphasis* or _emphasis_; **strong emphasis**
@agoldst
agoldst / laureates.R
Created May 19, 2014 16:02
Query the Nobelprize.org API to create a CSV table of all the literature laureates.
library("httr")
r_lits <- GET("http://api.nobelprize.org/v1/prize.json",query=list(category="literature"))
laureates <- content(r_lits,"parsed")$prizes # JSON
ids <- sapply(laureates,function (psn) {
psn$laureates[[1]]$id
})
@agoldst
agoldst / jekyll-pandoc.rb
Last active August 29, 2015 14:03
Jekyll 2 plugin for pandoc-ruby
require 'jekyll'
require 'pandoc-ruby' # add pandoc-ruby to your Gemfile
# Plugin for using pandoc as Jekyll markdown processor
# http://jekyllrb.com/docs/extras/ q.v.
# install in jekyll _plugins/ folder
# or Octopress plugins/
# In _config.yml, specify
# markdown: Pandoc # capital P
@agoldst
agoldst / dh2014-slides.R
Created July 11, 2014 12:24
R code used to produce the slides for this DH 2014 presentation: http://andrewgoldstone.com/blog/2014/07/02/dh2014/. Generated by knitr::purl()
opts_chunk$set(echo=F,warning=F,prompt=F,comment="",
autodep=T,cache=T,dev="tikz",
fig.width=4.5,fig.height=3,size ='footnotesize',
dev.args=list(pointsize=12))
options(width=70)
options(tikzDefaultEngine="xetex")
options(tikzXelatexPackages=c(
"\\usepackage{tikz}\n",
@agoldst
agoldst / mallet-inference.R
Last active November 12, 2023 07:23
Functions for using MALLET's topic-inference capability from R: given an existing topic model, estimate topic proportions for new documents
# mallet-inference.R
#
# functions for using MALLET's topic-inference functionality: given an
# existing topic model, estimate topic proportions for new documents
#
# source() this file
#
# Workflow
# --------
#