- For basics on Querying I recommend Become a SELECT star! zine
- Databricks Functions
Description | ORACLE SQL | Databricks |
---|---|---|
Date | date '2020-01-01' |
'2020-01-01' |
Timestamp | timestamp '2020-01-01 00:00:00' |
'2020-01-01 00:00:00' |
To Day | trunc(<column>) |
to_date(<column>) |
To Hour | trunc(<column>, 'HH24') |
date_trunc(<column>, 'HOUR') |
To Minute | trunc(<column>, 'MI') |
date_trunc(<column>, 'MINUTE') |
To Second | trunc(<column>, 'SS') |
date_trunc(<column>, 'SECOND') |
Extract Hour | extract(hour from <column>) |
extract(hour from <column>) |
- Parallel
/*+ parallel(8) */
- Use limiting
WHERE
clauses>= time
- Use
WITH
instead ofJOIN
s/Sub Queries - Use
EXISTS
instead ofIN
- Use
>=
and<=
instead ofBETWEEN
- Use
Explain
to get cost of query
- Dates are not equal
- The timestamp part might not be equal
- NULLs
- Use
COALESCE
for default - Not included in
NOT IN
- Make sure to use
IS
and not==
- Use
- Mixing
AND
s andOR
s- Use parenthesis to clarify order
- Read and write to same row
- Creates deadlock
- Don't subquery the same table being updated
- Refer to
UPDATE
'sFROM
- Parsing JSON
- Use
JSON_VALUE(<column>, '$.<key>')
- Databricks
get_json_object(<column>, '$.<key>')
- Use
- Parsing XML
- Use
EXTRACTVALUE(<column>,'/<entity>/@<attribute>')
- Use
Oracle Alternative To Seconds to_char(<column>, 'YYYY-MM-DD HH24:MI:SS')
WITH tempname (arg) as
(sub table)
query using sub table
UPDATE
/*+ parallel(8) */
table t
SET
t.column = 'val'
WHERE
EXISTS (
SELECT
1
FROM
table2 t2
WHERE
t1.id = t2.id
AND t2.status IN ('S', 'D')
);