Skip to content

Instantly share code, notes, and snippets.

@FanchenBao
Created February 21, 2022 20:17
Show Gist options
  • Save FanchenBao/88ce9a912c876e79006f269b8cd968c5 to your computer and use it in GitHub Desktop.
Save FanchenBao/88ce9a912c876e79006f269b8cd968c5 to your computer and use it in GitHub Desktop.
PostgreSQL Weighted Average And Weighted Standard Deviation

Regarding Math Notation

You can copy and paste the LaTeX notation in a LaTeX editor to view the math notation. A good online LaTeX editor tool is HostMath.

Definition

Let $w_i$ be the weight of ith random variable $x_i$, then weighted average is expressed as

$$ \mu_x = \frac{\sum_{i=1}^n{w_ix_i}}{\sum_{i=1}^nw_i} $$

Weighted standard deviation is expressed as

$$ \sigma_x = \sqrt{\frac{\sum_{i=1}^n{w_i(x_i - \mu_x)^2}}{\frac{n - 1}{n} \sum_{i=1}^n{w_i}}} $$

PostgreSQL

Set up

CREATE TABLE data (
    w real,   /* Weight */
    x real);  /* Random variable */

Handling weighted average

SELECT sum(w * x) / sum(w) AS weighted_avg FROM data;

Handling weighted standard deviation

Weighted standard deviation is not easy to do in Postgres following the definition above, because we have to store $\mu_x$. Fortunately, the equation for weighted standard deviation can be rearranged into this

$$ \sigma_x = \sqrt{\frac{\sum_{i=1}^n{w_ix_i^2} - (\sum_{i=1}^n{w_ix_i})^2 / \sum_{i=1}^n{w_i}}{\frac{n - 1}{n} \sum_{i=1}^n{w_i}}} $$

This equation does not contain $\mu_x$, and thus can be computed easily in Postgres.

SELECT
    sqrt((sum(w * x^2) - (sum(w * x))^2 / sum(w)) / ((count(*) - 1) * sum(w) / count(*))) AS std
FROM data;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment