Skip to content

Instantly share code, notes, and snippets.

@Nipsuli
Created August 21, 2024 07:32
Show Gist options
  • Save Nipsuli/d3bdc5535c4dc1b797d149ebd5d80191 to your computer and use it in GitHub Desktop.
Save Nipsuli/d3bdc5535c4dc1b797d149ebd5d80191 to your computer and use it in GitHub Desktop.
data load with meta columns example
SELECT
/* Meta columns to have same pattern across sources
* ___hash for data deduplication and creating unique id that's stable
* ___source_pk to be able to PARTITION BY to get the correct data per object
* ___as_of telling the business validity. In case something is a range already in the source system, then split into two rows
* ___loaded_at to indicate when this particular table was loaded
* ___original_loaded_at (optional) in case we have some indicator when the data originally has arrived into our system
*/
TO_HEX(SHA256(TO_JSON_STRING(my_table))) AS ___hash,
my_table.id as ___source_pk,
TIMESTAMP(my_table.order_date) AS ___as_of,
CURRENT_TIMESTAMP() AS ___loaded_at,
_etl_loaded_at AS ___original_loaded_at,
/* Full rest of the table
* In case of some column gets dropped, just nullify it from now on
* In case new columns appear, just add them
*/
*
FROM `jaffle.orders` my_table
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment