- Correlation is not causation (???)
- No causation without manipulation. (Holland)
- All models are wrong, some are useful. (Box)
- Statistics is the science of uncertainty. (arguably Tukey)
- Statistics is the science of learning from experience, especially experience that arrives a little bit at a time. (Efron)
; Configuration for Airflow webserver and scheduler in Supervisor
[program:airflow]
command=/bin/airflow webserver
stopsignal=QUIT
stopasgroup=true
user=airflow
stdout_logfile=/var/log/airflow/airflow-stdout.log
stderr_logfile=/var/log/airflow/airflow-stderr.log
environment=HOME="/home/airflow",AIRFLOW_HOME="/etc/airflow",TMPDIR="/storage/airflow_tmp"
# Example makefile with some dummy rules
.PHONY: all
## Make ALL the things; this includes: building the target, testing it, and
## deploying to server.
all: test deploy
.PHONY: build
# No documentation; target will be omitted from help display
build:
MAP_SLACK_ATTACHMENTS = [
    {
        "fallback": "{{params.map}} {{ task_instance.xcom_pull(task_ids=params.map, key='slack_status') }}",
        "pretext": "{{params.map}} update {{ task_instance.xcom_pull(task_ids=params.map, key='slack_status') }}",
        "fields": [
            {
                "title": "Copied",
                "value": "{{ task_instance.xcom_pull(task_ids=params.map, key='copied') }}",
                "short": True
            }
# Note – this is not a bash script (some of the steps require reboot)
# I named it .sh just so Github does correct syntax highlighting.
#
# This is also available as an AMI in us-east-1 (virginia): ami-cf5028a5
#
# The CUDA part is mostly based on this excellent blog post:
# http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/
# Install various packages
sudo apt-get update
py.test Assertions
IMO, py.test tests read better, because of the assert magic. When comparing two Python objects, py.test performs introspection on them for the comparison. As the end user, you don't really need to care about that; you just need to care that your test suite is much more readable. Compare the following:
def test_my_thing():
    # Assume we make some things we want to compare
    assert expected_list == result_list
    assert expected_set == result_set
---------- Forwarded message ----------
From: chris wiggins <chris.wiggins@[YYY].edu>
Date: Wed, Aug 1, 2012 at 7:26 PM
Subject: stats history
To: hadley@[XXX].edu
Cc: chris wiggins <chris.wiggins@[YYY].edu>
Dear Hadley:
DynamoDB is a powerful, fully managed, low-latency NoSQL database service provided by Amazon. DynamoDB allows you to pay for dedicated throughput, with predictable performance for "any level of request traffic". Scalability is handled for you, and data is replicated across multiple availability zones automatically. Amazon handles the pain points of managing a distributed datastore, including replication, load balancing, provisioning, and backups. All that is left is for you to take your data and its access patterns and make them work in the denormalized world of NoSQL.
The single most important part of using DynamoDB begins before you ever put data into it: designing the table(s) and keys. A key (Amazon calls it a primary key) is either a single attribute, called a hash key, or a compound of two attributes, called a hash and range key. The key uniquely identifies an item in a table. The choice of the primary key is particularl
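As a rough illustration of the two key types described above, here is a minimal sketch in plain Python. The key schemas use the `AttributeName`/`KeyType` shape that the DynamoDB CreateTable API expects, but everything else is an assumption made for illustration: the attribute names (`user_id`, `created_at`) are hypothetical, and `InMemoryTable` is a toy stand-in that only mimics how a primary key identifies an item. It is not how DynamoDB itself stores data.

```python
# Key schemas in the shape used by the DynamoDB CreateTable API.
# Attribute names are hypothetical, chosen only for this example.
HASH_ONLY_SCHEMA = [
    {"AttributeName": "user_id", "KeyType": "HASH"},
]
HASH_AND_RANGE_SCHEMA = [
    {"AttributeName": "user_id", "KeyType": "HASH"},
    {"AttributeName": "created_at", "KeyType": "RANGE"},
]


class InMemoryTable:
    """Toy stand-in for a table: items are stored under their primary key."""

    def __init__(self, key_schema):
        self.key_names = [entry["AttributeName"] for entry in key_schema]
        self.items = {}

    def primary_key(self, item):
        # Build the key tuple from the key attributes, in schema order.
        return tuple(item[name] for name in self.key_names)

    def put_item(self, item):
        # An item with the same primary key replaces the existing one.
        self.items[self.primary_key(item)] = dict(item)

    def get_item(self, **key):
        return self.items.get(tuple(key[name] for name in self.key_names))


# Hash key only: a second put with the same user_id overwrites the first.
users = InMemoryTable(HASH_ONLY_SCHEMA)
users.put_item({"user_id": "u1", "name": "Ada"})
users.put_item({"user_id": "u1", "name": "Grace"})

# Hash and range key: same user_id, different created_at, so two items.
events = InMemoryTable(HASH_AND_RANGE_SCHEMA)
events.put_item({"user_id": "u1", "created_at": 1, "action": "login"})
events.put_item({"user_id": "u1", "created_at": 2, "action": "logout"})
```

With a hash key alone, one attribute value maps to at most one item; adding a range key lets many items share a hash key as long as their range values differ.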
Moved to tdhopper.com.