razhangwei/hive.md

Last active February 5, 2021 21:19

Hive / Spark SQL

Useful UDFs:

Use Daiquery 'interactive spark' to debug the query first.
HiveInsertOperatorWithSchema does not support CTEs in the select query; put them in the preselect instead.
empty map with types: FB_CAST(NULL, 'MAP<INT, ARRAY<DOUBLE>>')
array: FB_ARRAY_APPLY, FB_ARRAY_AGGREGATE, FB_ARRAY_GET, FB_ARRAY_SORT, FB_PREV
aggregate: FB_COLLECT
sample:

DISTRIBUTE BY ASC

LAMBDA(x TYPE) SOME_EXPR(x)

different names for primitive types: FLOAT, STRING
type composition: e.g., MAP<INT, FLOAT>, ARRAY<FLOAT>
type conversion: FB_CAST(a, 'MAP<INT, ARRAY<BIGINT>>')
dynamic partition inserts:

INSERT OVERWRITE TABLE <OUTPUT_TBL>
    PARTITION (ds = '<DATEID>', pipeline = '<PIPELINE>', version, type)
...
SELECT
    ...,
    version,
    type
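
A filled-in version of the template above might look like the sketch below. The table names (`events_out`, `events_in`), the data columns (`user_id`, `score`), and the literal partition values are hypothetical and only illustrate the shape. It relies on standard Hive behaviour: static partition keys (`ds`, `pipeline`) come before dynamic ones in the PARTITION clause, and the dynamic keys (`version`, `type`) are filled by position from the last columns of the SELECT, in the same order.

```sql
-- Sketch only: events_out, events_in, user_id and score are hypothetical names.
-- Dynamic partitioning must be enabled; many environments already set this.
-- Strict mode is satisfied here because at least one partition key is static.
SET hive.exec.dynamic.partition = true;

INSERT OVERWRITE TABLE events_out
    PARTITION (ds = '2021-02-05', pipeline = 'daily', version, type)
SELECT
    user_id,
    score,
    version,  -- first dynamic partition key, matched by position
    type      -- second dynamic partition key, matched by position
FROM events_in
WHERE ds = '2021-02-05';
```

Because the dynamic keys are matched by position rather than by name, swapping `version` and `type` in the SELECT would silently write rows into the wrong partitions.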

Reference:

Language manual: https://fburl.com/wiki/vr669g3h
