Skip to content

Instantly share code, notes, and snippets.

@kmatt
Created November 11, 2024 15:21
Show Gist options
  • Save kmatt/b9152dd76ead686906df97802cb05f86 to your computer and use it in GitHub Desktop.
Save kmatt/b9152dd76ead686906df97802cb05f86 to your computer and use it in GitHub Desktop.
DuckDB rowhash
SELECT tbl::TEXT, HASH(tbl::TEXT), MD5(tbl::TEXT) FROM tbl;
D create table tbl as (select 1 as a, 2 as b, 3 as c);
D select tbl::text, hash(tbl::text), md5(tbl::text) from tbl;
┌──────────────────────────┬────────────────────────────┬──────────────────────────────────┐
│ CAST(tbl AS VARCHAR) │ hash(CAST(tbl AS VARCHAR)) │ md5(CAST(tbl AS VARCHAR)) │
│ varchar │ uint64 │ varchar │
├──────────────────────────┼────────────────────────────┼──────────────────────────────────┤
│ {'a': 1, 'b': 2, 'c': 3} │ 6764392534128998287 │ e31681d6e7ab078c9679fcd4f50136eb │
└──────────────────────────┴────────────────────────────┴──────────────────────────────────┘
The text of rows can be aggregated and hashed to a single value, unknown if there are limits to how many rows (string size) this can support:
SELECT md5(string_agg(tbl::text, '')) FROM tbl;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment