@criccomini
criccomini / airflow-supervisord.conf
Created June 22, 2016 14:54
airflow-supervisord.conf
; Configuration for Airflow webserver and scheduler in Supervisor
[program:airflow]
command=/bin/airflow webserver
stopsignal=QUIT
stopasgroup=true
user=airflow
stdout_logfile=/var/log/airflow/airflow-stdout.log
stderr_logfile=/var/log/airflow/airflow-stderr.log
environment=HOME="/home/airflow",AIRFLOW_HOME="/etc/airflow",TMPDIR="/storage/airflow_tmp"
@klmr
klmr / Makefile
Last active April 7, 2026 12:48
Self-documenting makefiles
# Example makefile with some dummy rules
.PHONY: all
## Make ALL the things; this includes: building the target, testing it, and
## deploying to server.
all: test deploy
.PHONY: build
# No documentation; target will be omitted from help display
build:
@abridgett
abridgett / airflow_eg.py
Last active April 29, 2021 22:03
airflow XCOM notification example
MAP_SLACK_ATTACHMENTS = [
{
"fallback": "{{params.map}} {{ task_instance.xcom_pull(task_ids=params.map, key='slack_status') }}",
"pretext": "{{params.map}} update {{ task_instance.xcom_pull(task_ids=params.map, key='slack_status') }}",
"fields": [
{
"title": "Copied",
"value": "{{ task_instance.xcom_pull(task_ids=params.map, key='copied') }}",
"short": True
}
@johnmyleswhite
johnmyleswhite / statistical_maxims.md
Created December 1, 2015 15:25
Statistical Maxims
  • Correlation is not causation (???)
  • No causation without manipulation. (Holland)
  • All models are wrong, some are useful. (Box)
  • Statistics is the science of uncertainty. (arguably Tukey)
  • Statistics is the science of learning from experience, especially experience that arrives a little bit at a time. (Efron)
@erikbern
erikbern / install-tensorflow.sh
Last active April 14, 2026 02:32
Installing TensorFlow on EC2
# Note – this is not a bash script (some of the steps require reboot)
# I named it .sh just so Github does correct syntax highlighting.
#
# This is also available as an AMI in us-east-1 (virginia): ami-cf5028a5
#
# The CUDA part is mostly based on this excellent blog post:
# http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/
# Install various packages
sudo apt-get update
@wrobstory
wrobstory / pytest.md
Last active October 20, 2015 22:21
PyTest4Tim

Why py.test?

py.test Assertions

IMO, py.test tests read better, because of the assert magic. When comparing two Python objects, py.test performs introspection on them for the comparison. As the end user, you don't really need to care about that; you just need to care that your test suite is much more readable. Compare the following:

def test_my_thing():
    # Assume we make some things we want to compare
    assert expected_list == result_list
    assert expected_set == result_set
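
As a minimal, self-contained sketch of the plain-assert style described above: the helper `build_results` and its data are hypothetical, added only so the test can run on its own. Under py.test, a failing comparison of this kind is typically reported with the differing elements, though the exact output depends on the py.test version.

```python
# Hypothetical data for illustration; not part of the original gist.
def build_results():
    """Return expected/result pairs for a list and a set comparison."""
    expected_list = [1, 2, 3]
    result_list = sorted([3, 1, 2])
    expected_set = {"a", "b"}
    result_set = set("ab")
    return expected_list, result_list, expected_set, result_set


def test_my_thing():
    expected_list, result_list, expected_set, result_set = build_results()
    # Plain assert statements: no assertEqual-style helper methods needed.
    assert expected_list == result_list
    assert expected_set == result_set
```

Saved in a file whose name starts with `test_`, this would be collected and run by invoking `py.test` in the same directory.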
@PurpleBooth
PurpleBooth / README-Template.md
Last active May 11, 2026 16:34
A template to make good README.md

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

---------- Forwarded message ----------
From: chris wiggins <chris.wiggins@[YYY].edu>
Date: Wed, Aug 1, 2012 at 7:26 PM
Subject: stats history
To: hadley@[XXX].edu
Cc: chris wiggins <chris.wiggins@[YYY].edu>
Dear Hadley:
@jlafon
jlafon / dynamodb.md
Created December 3, 2014 05:03
An Introduction to Amazon's DynamoDB

An introduction to DynamoDB

DynamoDB is a powerful, fully managed, low-latency NoSQL database service provided by Amazon. DynamoDB lets you pay for dedicated throughput, with predictable performance for "any level of request traffic". Scalability is handled for you, and data is replicated across multiple availability zones automatically. Amazon handles the pain points associated with managing a distributed datastore, including replication, load balancing, provisioning, and backups. All that is left is for you to take your data and its access patterns and make them work in the denormalized world of NoSQL.

Modeling your data

The single most important part of using DynamoDB begins before you ever put data into it: designing the table(s) and keys. Keys (Amazon calls them primary keys) can consist of a single attribute, called a hash key, or of two attributes, called a hash and range key. The key is used to uniquely identify an item in a table. The choice of the primary key is particularl