@randyzwitch
Last active August 29, 2015 14:02
Hive Full Table Scan vs. Predicate Pushdown on Outer Join
--#### Assume sales Hive table partitioned by day_id ####--
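--Full table scan: the day_id filter sits in WHERE, so it is evaluated after the outer join
--(it also drops employees with no sales in the date range)
select
employees.id,
sales.sales
from employees
left join sales on (employees.id = sales.employee_id)
where sales.day_id between '2014-03-01' and '2014-05-31';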
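--Partition-based query: the day_id filter is part of the join condition, so only matching partitions are read
--(employees with no sales in the date range are kept, with NULL sales)
select
employees.id,
sales.sales
from employees
left join sales on (employees.id = sales.employee_id and sales.day_id between '2014-03-01' and '2014-05-31');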
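--A possible way to check which partitions a query actually reads: Hive's EXPLAIN DEPENDENCY
--lists the input tables and input partitions as JSON. This is a sketch that assumes the same
--employees/sales tables as above and a Hive version that supports EXPLAIN DEPENDENCY.
--If the filter is pushed down, only day_id partitions inside the date range should be listed.
explain dependency
select
employees.id,
sales.sales
from employees
left join sales on (employees.id = sales.employee_id and sales.day_id between '2014-03-01' and '2014-05-31');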
@gauravkumar37

It works for left joins but doesn't seem to work with full outer join.
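One possible workaround for the full outer join case, sketched under the assumption that the same `employees` and `sales` tables are used: filter `sales` in a subquery first, so the `day_id` predicate is applied directly to the partitioned table before the join. The aliases `e` and `s` are introduced here for illustration only.

```sql
-- Sketch: prune sales partitions in a subquery, then full outer join the result.
-- Assumes sales is partitioned by day_id and has employee_id and sales columns.
select
e.id,
s.sales
from employees e
full outer join (
  select employee_id, sales
  from sales
  where day_id between '2014-03-01' and '2014-05-31'
) s on (e.id = s.employee_id);
```

This does not return the same rows as putting the date filter in the `on` clause of a full outer join. Here, sales rows outside the date range are excluded entirely. With the filter in the `on` clause, they would still appear, paired with a NULL employee.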
