Created
March 13, 2010 08:41
-
-
Save freshtonic/331205 to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- A Postgres 8.4 query against a table called 'trade_summaries' | |
-- This query makes use of WINDOW FUNCTIONS to provide access to the previous trading | |
-- day on the row for the current trading day. | |
-- The end result of the query is a set of market trading days with the interesting column being | |
-- the number of standard deviations that the difference in total trading value for the current | |
-- day versus the previous day is from the average difference between trading days. | |
-- What makes this query more verbose than it has to be is eacerbated by three limitations of Postgres: | |
-- (NOTE: there may be a good technical or theoretical 'correctness' reason for these limitations, | |
-- but I am obviously not smart enough to figure it out!) | |
-- 1. You can't refer to calculated columns within the same select statement. | |
-- To get around this you are forced to expand the expression rather than use its alias. | |
-- 2. When forced to expand an expression that is a window function, you'll find that you can't | |
-- embed it in an aggregate function (Postgres doesn't allow it). | |
-- 3. You can't nest window function expressions. | |
-- | |
-- So to get around the above limitations, I have a quadruple nested query, which is more verbose than | |
-- I'd like. | |
-- | |
-- Any Postgres experts like to suggest some improvements? | |
SELECT | |
standard_deviations.day_id AS day_id, | |
standard_deviations.cal_date AS cal_date, | |
standard_deviations.current_day_metric AS current_day_metric, | |
standard_deviations.previous_day_metric AS previous_day_metric, | |
standard_deviations.standard_deviation AS standard_deviation, | |
standard_deviations.average_difference AS average_difference, | |
standard_deviations.difference AS difference, | |
(abs(standard_deviations.difference - standard_deviations.average_difference) / standard_deviations.standard_deviation ) AS count_standard_devations | |
FROM | |
( | |
SELECT | |
differences.day_id AS day_id, | |
differences.cal_date AS cal_date, | |
differences.current_day_metric AS current_day_metric, | |
differences.previous_day_metric AS previous_day_metric, | |
stddev(differences.difference) OVER year_window AS standard_deviation, | |
avg(differences.difference) OVER year_window AS average_difference, | |
differences.difference AS difference | |
FROM | |
( | |
SELECT | |
daily_summaries.day_id as day_id, | |
daily_summaries.cal_date as cal_date, | |
daily_summaries.metric as current_day_metric, | |
lag(daily_summaries.metric,1) OVER year_window AS previous_day_metric, | |
abs(daily_summaries.metric - lag(daily_summaries.metric,1) OVER year_window) AS difference | |
FROM | |
( | |
SELECT | |
min(days.id) AS day_id, | |
min(days.cal_date) AS cal_date, | |
sum(value_open + value_between + value_close) AS metric | |
FROM | |
trade_summaries AS ts | |
INNER JOIN | |
days ON (ts.day_id = days.id) | |
WHERE | |
days.is_trading_day = true | |
AND days.year = 2007 | |
GROUP BY | |
days.cal_date | |
ORDER BY days.cal_date ASC | |
) AS daily_summaries | |
WINDOW | |
year_window as (PARTITION BY extract(year FROM cal_date) ORDER BY extract(year FROM cal_date) ASC) | |
) AS differences | |
WINDOW | |
year_window as (PARTITION BY extract(year FROM cal_date) ORDER BY extract(year FROM cal_date) ASC) | |
) AS standard_deviations | |
WINDOW | |
year_window as (PARTITION BY extract(year FROM cal_date) ORDER BY extract(year FROM cal_date) ASC) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment