Created
June 12, 2014 16:03
-
-
Save randyzwitch/858de8f765d5908f78d6 to your computer and use it in GitHub Desktop.
Hive Percentile using Cross Join
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--Hive expects that you want to calculate your percentiles by account_number and sales | |
--This code will generate an error about a missing GROUP BY statement | |
select | |
account_number, | |
sales, | |
CASE WHEN sales > a.sales_90th_percentile from sales THEN 1 ELSE 0 END as top10pct_sales | |
from sales | |
cross join (select percentile_approx(sales, .9) as sales_90th_percentile from sales) a; |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi,
I am looking for 25th percentile of sales column. Below is the query i wrote
select MAX(a.sales_25th_percentile)
FROM sales
CROSS JOIN
(SELECT percentile_approx(sales, 0.75) as sales_25th_percentile FROM sales) a
;
Please comment or suggest alternative.