Justin Palmer caged

Examples of ordered set aggregates in Postgres.

SELECT round(avg(pie)::numeric, 2),
       percentile_cont(array[0.25, 0.5, 0.75, 0.95]) WITHIN GROUP (ORDER BY pie) AS percentiles
FROM player_stats_advanced
WHERE permode = 'pergame';

round | percentiles
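
As a rough illustration of what `percentile_cont` computes, here is a minimal Python sketch of the linear interpolation it performs over the sorted values. The function name mirrors the Postgres aggregate only for readability, and the `sample` values are hypothetical, not rows from `player_stats_advanced`. Unlike Postgres, this sketch does not skip NULL (`None`) inputs.

```python
import math


def percentile_cont(values, fraction):
    """Interpolated percentile over sorted values (sketch of the Postgres behavior)."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    if not ordered:
        return None
    # Zero-based fractional position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    # Interpolate linearly between the two neighboring values.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Hypothetical sample data, standing in for the "pie" column.
sample = [10, 20, 30, 40, 50]
percentiles = [percentile_cont(sample, f) for f in (0.25, 0.5, 0.75, 0.95)]
```

Passing several fractions in a loop corresponds to the array form used in the query, which returns one interpolated value per requested fraction.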

	location
	Luverne, Alabama
	Madison, Alabama
	Theodore, Alabama
	Oneonta, Alabama
	Odenville, Alabama
	Heflin, Alabama
	Jasper, Alabama
	Midfield, Alabama
	Greenville, Alabama

Datasets first, APIs second - Doing any kind of aggregate analysis usually requires working with complete datasets. REST APIs aren't ideal for this use case. APIs are not data; they are a means of exposing it.
Machine-friendly retrieval of raw datasets - Avoid the assumption that there's a human, using a web browser, manually clicking a link. For example, scripts that fetch new daily crime data via curl would be a likely scenario. Make it easy for machines by removing authentication, unnecessary redirects, JavaScript-based retrieval or POST-style retrieval.
Document long column names - Shapefile attributes are limited to 10 characters. This makes many attributes difficult to decipher without associated metadata. For example, here are a few attributes from a Garbage Collection dataset. Include a file with the long column name mappings and include both the long and short name in the metadata.
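
One way a consumer might use such a mapping file is sketched below in Python. The mapping format (a CSV with `short_name` and `long_name` columns) and the attribute names are assumptions made up for illustration; they are not taken from the Garbage Collection dataset.

```python
import csv
import io

# Hypothetical contents of a mapping file shipped alongside the shapefile.
MAPPING_CSV = """short_name,long_name
PICKUP_DAY,pickup_day_of_week
RECYC_WK,recycling_week_schedule
"""


def load_column_map(text):
    """Parse the mapping CSV into a {short_name: long_name} dictionary."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["short_name"]: row["long_name"] for row in reader}


def rename_columns(record, column_map):
    """Return a copy of the record keyed by long names where a mapping exists."""
    # Attributes without a mapping keep their original short name.
    return {column_map.get(key, key): value for key, value in record.items()}


column_map = load_column_map(MAPPING_CSV)
record = {"PICKUP_DAY": "Tuesday", "RECYC_WK": "A", "ID": 7}
renamed = rename_columns(record, column_map)
```

Keeping the mapping in a plain CSV means the same file can be read by a script and by a person opening it in a spreadsheet.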

	<!DOCTYPE html>
	<meta charset="utf-8">
	<style>

	.axis path,
	.axis line {
	fill: none;
	stroke: #000;
	shape-rendering: crispEdges;
	}

	drop table if exists inspection_point_buffers;

	-- Group identical overlapping points and count how many occupy
	-- the space.
	create temporary table inspection_overlappoing_points as
	select a.geom as geom,
	count(*)
	from latest_inspections a,
	latest_inspections b
	where st_equals(a.geom, b.geom)

	<!doctype html>
	<head>
	<meta charset="utf-8">
	<style>
	body {
	font-family: OpenSans, Helvetica;
	}

	.title {
	margin: 0;

	column_name
	--------------------------------------------------------
	crash_id
	record_type
	vehicle_id
	participant_id
	participant_display_seq
	vehicle_coded_seq
	participant_vehicle_seq
	serial_

	drop table if exists combined_geometries;

	with boston_area_geometries as
	( select name,
	msa_code,
	geom
	from divisions
	inner join
	( select distinct on (msa_code) msa_code
	from area_definitions ) ad on ad.msa_code = nctadvfp ),

	select array_to_string(array_agg(i), '') from
	(select (regexp_matches('Letter-1-2', '[A-Z0-9]', 'g'))[1] i) t;
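
For comparison, the same idea can be sketched in Python: collect every match of the pattern and join the matches into one string. The helper name `keep_matching_chars` is hypothetical and only serves this example.

```python
import re


def keep_matching_chars(text, pattern=r"[A-Z0-9]"):
    """Join every match of the pattern, in order, into a single string."""
    # re.findall plays the role of regexp_matches with the 'g' flag;
    # "".join stands in for array_agg followed by array_to_string.
    return "".join(re.findall(pattern, text))


result = keep_matching_chars("Letter-1-2")
```

Here `result` holds only the uppercase letters and digits from the input, with the lowercase letters and hyphens dropped.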

	#= require d3

	# Draw timeseries graphs to the screen. Each element can contain a set of
	# data-* attributes used to configure the graph. The graph should always include
	# a data-url attribute pointing to an endpoint for time series JSON data.
	#
	# Any graph that includes a data-realtime attribute will update automatically.
	#
	# Examples:
	# <div class="js-graph" data-url="/graphite?target=github.unicorn.{browser,api}.cpu_time.mean&from=-1hour" data-realtime></div>