Skip to content

Instantly share code, notes, and snippets.

@pierdom
Last active September 6, 2017 08:07
Show Gist options
  • Select an option

  • Save pierdom/f65e9e8c50956896b5116a23f7ec2557 to your computer and use it in GitHub Desktop.

Select an option

Save pierdom/f65e9e8c50956896b5116a23f7ec2557 to your computer and use it in GitHub Desktop.
[From rows to arrays in Postgres and Hive] grouping by a key #datascience #bigdata #postgresql #hive #sql

Original table:

 k  | v
----------
 A  | 1
 A  | 2
 A  | 3
 B  | 6
 B  | 7

I want arrays of values of v grouping by the key k like this:

 k | my_array
--------------
 A | {1,2,3}
 B | {6,7}

In Hive I would do like this:

select k, collect_list(v) as my_array
from my_table
group by k

In Postgres we need to define an aggregate function:

CREATE AGGREGATE array_accum (anyelement) (
  sfunc = array_append,
  stype = anyarray,
  initcond = '{}'

And then call it:

select k, array_accum(v) as my_array
from my_table
group by k
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment