Andy Teucher ateucher

PostgreSQL & PostGIS Cheatsheet

This is a collection of the PostgreSQL and PostGIS information I use most often.

There are packages for this now!

2017-08-03: Since I wrote this in 2014, the universe, specifically Kirill Müller (https://github.com/krlmlr), has provided better solutions to this problem. I now recommend that you use one of these two packages:

rprojroot: This is the main package with functions to help you express paths in a way that will "just work" when developing interactively in an RStudio Project and when you render your file.
here: A lightweight wrapper around rprojroot that anticipates the most likely scenario: you want to write paths relative to the top-level directory, defined as an RStudio project or Git repo. TRY THIS FIRST.

I love these packages so much I wrote an ode to here.

I use these packages now instead of what I describe below. I'll leave this gist up for historical interest. 😆

Getting Yosemite and R to play nice

Here are some tips on getting R development working with Yosemite. Contribute what you know below and I'll add it in.

`homebrew`

I went ahead and re-installed all of my homebrew packages. You can find out what you have installed with

I'm going to start off by describing a pretty common data analysis scenario, and then talk about how using R can help:

You have a lot of individual spreadsheet files containing your data, and you need it all together, so you copy and paste each one into a master file.
Next you do a bunch of data cleaning in the master spreadsheet - fixing date formats, unit conversions, transformations, etc.
You then import the data into your favourite statistics program, run your analysis, and copy the outputs back into a spreadsheet or other graphing program to plot your results.
You give the results to a colleague to review and she comes back with some concerns that something doesn't look quite right with the results. She also suggests that a different modelling technique would be more appropriate.
You comb through the original data and realize that in some of the files one column was misaligned, and so in copying and pasting these into the master dataset this error was compounded over many rows.
In ad

Finessing Excel's stupid line endings

I am sheepish to admit a certain type of routine Microsoft Excel use.

Current example: I am marking for STAT 545. I use R to create a comma delimited marking sheet, by joining the official class list and peer reviews. The sheet contains variables, initially set to NA, where the TAs and I enter official marks and optional comments.

This is where Excel comes in. I like its visual organization of this comma delimited file much more than, say, using a plain text editor. I use the ability to hide columns, resize columns, wrap text, and (gasp!) even fill rows with grey to indicate I am done.

I keep saving the file as comma delimited and I put up with Excel's incessant freak out about "losing features". This is not a one time thing. I need to save and commit this file many times before it is considered done.

	`derivSimulCI` <- function(mod, n = 200, eps = 1e-7, newdata, term,
	                           samples = 10000) {
	    stopifnot(require("MASS"))
	    if(inherits(mod, "gamm"))
	        mod <- mod$gam
	    m.terms <- attr(terms(mod), "term.labels")
	    if(missing(newdata)) {
	        newD <- sapply(model.frame(mod)[, m.terms, drop = FALSE],
	                       function(x) seq(min(x), max(x) - (2*eps), length = n))
	        names(newD) <- m.terms

	# Linux command basics
	#http://software-carpentry.org/v5/novice/shell/index.html
	# practise, practise, practise, google, google, google and you will get it :)

	pwd # print working directory
	cd # change directory
	sudo # super user privilege
	chmod 775 # change the privileges http://en.wikipedia.org/wiki/Chmod
	git clone # version control! get to know git and github! http://git-scm.com/
	sudo bash # bad habit
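
A few of the commands listed above can be tried together safely in a throwaway directory. This is a minimal sketch; the file name `hello.sh` and the variable `workdir` are made up for illustration.

```shell
# Work in a temporary directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir"
pwd   # print the directory we just moved into

# Create a small hypothetical script, then set its mode to 775:
# read/write/execute for owner and group, read/execute for everyone else.
printf '#!/bin/sh\necho hello\n' > hello.sh
chmod 775 hello.sh
ls -l hello.sh   # the permissions column should read -rwxrwxr-x
```

None of this needs `sudo`, because the directory belongs to the current user.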

	enumerate <- function(X, FUN, ...) {
	    result <- vector("list", length(X))
	    for (i in seq_along(result)) {
	        # call FUN with both the element and its index
	        tmp <- FUN(X[[i]], i, ...)
	        if (is.null(tmp))
	            result[i] <- list(NULL)  # keep the slot; [[<- with NULL would drop it
	        else
	            result[[i]] <- tmp
	    }
	    result
	}

	#!/bin/bash
	# rename TMS tiles to the XYZ schema
	# no quoting, since all files have simple numeric names
	# do not run this anywhere else than INSIDE your tiles directory

	# run it like this: find . -name "*.png" -exec ./tms2xyz.sh {} \;

	filename=$1

	tmp=${filename#*/} # remove to first /
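
The TMS and XYZ tile schemes differ only in the direction of the y axis: TMS counts rows from the bottom and XYZ from the top, so at zoom level z a row is flipped as y_xyz = 2^z - 1 - y_tms. The sketch below shows just that arithmetic and is not the remainder of the script above; the function name `tms_to_xyz_y` is hypothetical.

```shell
# Hypothetical helper: flip a TMS row number to its XYZ equivalent.
# Usage: tms_to_xyz_y ZOOM TMS_Y
tms_to_xyz_y() {
  z=$1
  y=$2
  # There are 2^z rows at zoom z; the first row in one scheme is the last in the other.
  echo $(( (1 << z) - 1 - y ))
}

tms_to_xyz_y 3 2   # prints 5
```

The flip is its own inverse, so the same function converts XYZ rows back to TMS.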

	# width = 2 zero-pads each channel, so e.g. 0 becomes "00" rather than "0"
	rgb2hex <- function(r,g,b) sprintf('#%s',paste(format(as.hexmode(c(r,g,b)), width = 2),collapse = ''))

	rgb2hex(255,0,0)
	# returns '#ff0000'