Chris Harland cwharland

Correlation is not causation (???)
No causation without manipulation. (Holland)
All models are wrong, some are useful. (Box)
Statistics is the science of uncertainty. (arguably Tukey)
Statistics is the science of learning from experience, especially experience that arrives a little bit at a time. (Efron)

Why py.test?

py.test Assertions

IMO, py.test tests read better, because of the assert magic. When comparing two Python objects, py.test performs introspection on them for the comparison. As the end user, you don't really need to care about that; you just need to care that your test suite is much more readable. Compare the following:

def test_my_thing():
    # Assume we make some things we want to compare
    assert expected_list == result_list
    assert expected_set == result_set
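
As a rough, self-contained sketch of the test above: the helper build_results and the sample data are made up for illustration and are not part of py.test. The test body uses only plain assert statements, which is the whole point; when run under py.test, a failing comparison is reported with a breakdown of how the two objects differ.

```python
def build_results():
    """Hypothetical stand-in for the code under test."""
    result_list = sorted([3, 1, 2])
    result_set = {"a", "b"} | {"c"}
    return result_list, result_set


def test_my_thing():
    # Expected values are invented sample data for this sketch
    expected_list = [1, 2, 3]
    expected_set = {"a", "b", "c"}
    result_list, result_set = build_results()
    # Plain asserts: no assertEqual-style helper methods are needed
    assert expected_list == result_list
    assert expected_set == result_set
```

Saved in a file whose name starts with test_, this should be collected and run by py.test without any extra boilerplate.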

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

An introduction to DynamoDB

DynamoDB is a powerful, fully managed, low-latency NoSQL database service provided by Amazon. DynamoDB allows you to pay for dedicated throughput, with predictable performance for "any level of request traffic". Scalability is handled for you, and data is replicated across multiple availability zones automatically. Amazon handles all of the pain points associated with managing a distributed datastore for you, including replication, load balancing, provisioning, and backups. All that is left is for you to take your data and its access patterns and make them work in the denormalized world of NoSQL.

Modeling your data

The single most important part of using DynamoDB begins before you ever put data into it: designing the table(s) and keys. A key (Amazon calls it a primary key) can consist of a single attribute, called a hash key, or of two attributes, a compound key called the hash and range key. The key uniquely identifies an item in a table. The choice of the primary key is particularl
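
To illustrate how a hash and range key identifies items, here is a toy in-memory sketch. It is not the DynamoDB API: the ToyTable class, its method names, and the sample keys are all hypothetical, and it only mimics the idea that the (hash, range) pair is unique per item while items sharing a hash key can be read back ordered by range key.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical key type: (hash key, range key)
Key = Tuple[str, int]


class ToyTable:
    """In-memory stand-in for a table with a hash-and-range primary key."""

    def __init__(self) -> None:
        self._items: Dict[Key, Dict[str, Any]] = {}

    def put_item(self, hash_key: str, range_key: int, attrs: Dict[str, Any]) -> None:
        # Writing to an existing (hash, range) pair replaces the item
        self._items[(hash_key, range_key)] = dict(attrs)

    def get_item(self, hash_key: str, range_key: int) -> Optional[Dict[str, Any]]:
        # The full key pair identifies at most one item
        return self._items.get((hash_key, range_key))

    def query(self, hash_key: str) -> List[Dict[str, Any]]:
        # All items sharing a hash key, ordered by range key
        keys = sorted(k for k in self._items if k[0] == hash_key)
        return [self._items[k] for k in keys]


table = ToyTable()
table.put_item("user#1", 20120801, {"event": "signup"})
table.put_item("user#1", 20120802, {"event": "login"})
table.put_item("user#2", 20120801, {"event": "signup"})
table.put_item("user#1", 20120802, {"event": "logout"})  # same key: replaces "login"
```

In this sketch the hash key groups related items and the range key orders them within the group, which is why the access patterns need to be known before the keys are chosen.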

Moved to tdhopper.com.

	; Configuration for Airflow webserver and scheduler in Supervisor

	[program:airflow]
	command=/bin/airflow webserver
	stopsignal=QUIT
	stopasgroup=true
	user=airflow
	stdout_logfile=/var/log/airflow/airflow-stdout.log
	stderr_logfile=/var/log/airflow/airflow-stderr.log
	environment=HOME="/home/airflow",AIRFLOW_HOME="/etc/airflow",TMPDIR="/storage/airflow_tmp"

	# Example makefile with some dummy rules

	.PHONY: all
	## Make ALL the things; this includes: building the target, testing it, and
	## deploying to server.
	all: test deploy

	.PHONY: build
	# No documentation; target will be omitted from help display
	build:

	MAP_SLACK_ATTACHMENTS = [
	{
	"fallback": "{{params.map}} {{ task_instance.xcom_pull(task_ids=params.map, key='slack_status') }}",
	"pretext": "{{params.map}} update {{ task_instance.xcom_pull(task_ids=params.map, key='slack_status') }}",
	"fields": [
	{
	"title": "Copied",
	"value": "{{ task_instance.xcom_pull(task_ids=params.map, key='copied') }}",
	"short": True
	}

	# Note – this is not a bash script (some of the steps require reboot)
	# I named it .sh just so Github does correct syntax highlighting.
	#
	# This is also available as an AMI in us-east-1 (virginia): ami-cf5028a5
	#
	# The CUDA part is mostly based on this excellent blog post:
	# http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/

	# Install various packages
	sudo apt-get update

	---------- Forwarded message ----------
	From: chris wiggins <chris.wiggins@[YYY].edu>
	Date: Wed, Aug 1, 2012 at 7:26 PM
	Subject: stats history
	To: hadley@[XXX].edu
	Cc: chris wiggins <chris.wiggins@[YYY].edu>


	Dear Hadley: