Selena Deckelmann selenamarie

@selenamarie
selenamarie / gist:4157570
Created November 27, 2012 22:24
SQLAlchemy tidbits from reflecting an existing schema
Goal: Transform a raw SQL file into SQLAlchemy models and functions
Caveats: I started from zero; I had never created models from scratch with SQLAlchemy before.
= Initial reflection
http://code.google.com/p/sqlautocode/
tl;dr: Could use some work, but saved me a bit of time.
Did a good job of:
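As a rough sketch of what the reflection step automates, here is SQLAlchemy's own MetaData.reflect against an in-memory SQLite database (the table and columns are hypothetical stand-ins; sqlautocode additionally renders the reflected tables out as model source code):

```python
# Sketch: reflecting an existing schema into SQLAlchemy Table objects.
# The "crashes" table here is hypothetical; any existing database works.
from sqlalchemy import create_engine, MetaData, text

engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE crashes (id INTEGER PRIMARY KEY, signature TEXT)"
    ))

# Reflect every table in the database without writing any model code.
metadata = MetaData()
metadata.reflect(bind=engine)

crashes = metadata.tables["crashes"]
print(sorted(c.name for c in crashes.columns))  # prints ['id', 'signature']
```

From here, a tool like sqlautocode turns each reflected Table into declarative model source.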
shell -${SHELL}
caption always "%n(%t) : %C"
defscrollback 1024
startup_message off
hardstatus on
hardstatus alwayslastline
screen -t collector 1 ~/bin/startup-socorro 1
screen -t processor 2 ~/bin/startup-socorro 2
screen -t monitor 3 ~/bin/startup-socorro 3
@selenamarie
selenamarie / a_better_opml.py
Last active December 11, 2015 09:59
This grabs URLs associated with the people you follow on Twitter and then tries to scrape RSS feeds from the URLs. You have to get OAuth creds from dev.twitter.com/apps to do this, and then register the script so that you get the access_token + access_token_secret. Rev 2 actually produces valid OPML. ;)
#!/usr/bin/env python
import tweepy
from BeautifulSoup import BeautifulSoup as parser
import urllib
import sys
import argparse
import ConfigParser
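As a hedged sketch of the OPML-writing step the script ends with (the feed list and outline attributes below are hypothetical; the original collects feeds via tweepy and BeautifulSoup first):

```python
# Sketch: emitting minimal, valid OPML from (title, feed URL) pairs.
# The feed data here is hypothetical; the gist scrapes it from Twitter bios.
import xml.etree.ElementTree as ET

feeds = [
    ("Example Blog", "http://example.com/feed.xml"),
]

opml = ET.Element("opml", version="2.0")
head = ET.SubElement(opml, "head")
ET.SubElement(head, "title").text = "Feeds from people I follow"
body = ET.SubElement(opml, "body")
for title, url in feeds:
    # One <outline> element per feed, as OPML feed readers expect.
    ET.SubElement(body, "outline", type="rss", text=title, xmlUrl=url)

opml_xml = ET.tostring(opml, encoding="unicode")
print(opml_xml)
```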
@selenamarie
selenamarie / gist:5155906
Last active October 15, 2017 05:46 — forked from jallspaw/gist:5154620
Links to presentations about sharding
== PostgreSQL
Braintree
* https://www.braintreepayments.com/braintrust/scaling-postgresql-at-braintree-four-years-of-evolution
Instagram
* http://www.databasesoup.com/2012/04/sharding-postgres-with-instagram.html
* http://media.postgresql.org/sfpug/instagram_sfpug.pdf
* http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram#_=_
* http://instagram-engineering.tumblr.com/post/40781627982/handling-growth-with-postgres-5-tips-from-instagram
PYTHONPATH=. alembic revision --autogenerate -m "Adding raw_crashes and raw_adu.received_at"
[ correct migration generated! ]
* modify to import JSON type from socorro's model
from socorro.external.postgresql.models import JSON
sa.Column(u'raw_crash', JSON(), nullable=False),
(i.e. replace sa.json with JSON())
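A hedged sketch of what the hand-edited migration might look like after swapping in the custom type (the module path and table name come from the snippet above; the surrounding Alembic boilerplate and extra columns are assumed):

```python
# Hypothetical excerpt of the autogenerated Alembic migration,
# hand-edited to import socorro's JSON type for the raw_crash column.
from alembic import op
import sqlalchemy as sa
from socorro.external.postgresql.models import JSON  # custom type, per the note above

def upgrade():
    op.create_table(
        'raw_crashes',
        sa.Column(u'id', sa.Integer(), primary_key=True),  # assumed column
        sa.Column(u'raw_crash', JSON(), nullable=False),
    )

def downgrade():
    op.drop_table('raw_crashes')
```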
@selenamarie
selenamarie / plv8x
Last active December 16, 2015 11:38
Installing plv8
# http://nonantolando.blogspot.tw/2012/10/using-json-on-ubuntu-1204-server-with.html
sudo apt-get install libv8-3.7.12.22 libv8-dev
# Install d8
# http://askubuntu.com/questions/141252/is-there-a-package-for-an-executable-version-of-v8
svn checkout http://v8.googlecode.com/svn/trunk/ ./v8
cd v8
make dependencies
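Once the build and install complete, the extension still has to be enabled per-database. A minimal sanity check (assuming a superuser connection) looks like:

```sql
-- Enable plv8 in the current database.
CREATE EXTENSION plv8;

-- Quick smoke test: run a trivial JavaScript block.
DO $$ plv8.elog(NOTICE, 'plv8 works'); $$ LANGUAGE plv8;
```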
@selenamarie
selenamarie / liberate.sql
Last active December 17, 2015 17:29
I wrote this as prep for a talk about JSON datatype and PLV8: https://speakerdeck.com/selenamarie/schema-liberation-with-json-and-plv8-and-postgres
CREATE SCHEMA liberated;
CREATE OR REPLACE FUNCTION public.liberate()
RETURNS boolean
LANGUAGE plv8
AS $function$
var tables = plv8.execute(
"select relname FROM pg_catalog.pg_class"
+ " c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace"
@selenamarie
selenamarie / json_example.sql
Created May 24, 2013 23:17
Playing around with JSON and json_enhancements
You are now connected to database "postgres" as user "postgres".
postgres:5432# create table birds (bird_info json);
CREATE TABLE
Time: 162.227 ms
postgres:5432# insert into birds VALUES( '{"name": "scrubjay", "where": "my backyard", "when": "2013-05-24"}');
INSERT 0 1
Time: 0.616 ms
postgres:5432# select * from birds;
bird_info
--------------------------------------------------------------------

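Continuing the session above, fields can be pulled back out of the stored JSON with the extraction operators (from json_enhancements, later built into PostgreSQL 9.3; the row is the one inserted above):

```sql
-- Extract a field as text with the ->> operator.
select bird_info->>'name' AS name from birds;
--   name
-- ----------
--  scrubjay
```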
Title: Schema liberation with JSON and PLV8

JSON is the lingua franca of the web. Postgres supports JSON natively, and the PLV8 JavaScript engine is available as an extension. Full JavaScript is supported inside the database, including direct access to tables and important database features.

And, developer-friendly features make it incredibly easy to transform your existing, normalized schemas into liberated JSON ones!

A tour of what putting a document store inside of Postgres looks like, including a production use case from Mozilla: crash-stats.mozilla.com is backed by a 2 TB Postgres database cluster that's adding about 5 GB of JSON per week.

with exploitability_count AS (
select count(*) from exploitability_reports
WHERE
report_date BETWEEN %(start_date)s AND %(end_date)s
)
SELECT
    exploitability_count.count,
signature,