@neilkod
Created August 23, 2010 18:26
-- load the raw data (tab-delimited employee records)
raw = load 'emp.txt' using PigStorage('\t') as (empno:int,ename:chararray,job:chararray,sal:int,deptno:int);
-- group the raw data by deptno; there are only 3 departments (10, 20, 30)
grpd = group raw by deptno;
-- for each deptno group in grpd, sort its rows by sal in descending order,
-- keep the first 3 rows, and flatten them into the output
top3sal = foreach grpd {
    ordered = order raw by sal desc;
    limited = limit ordered 3;
    generate flatten(limited.(sal, ename, empno, deptno));
}
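-- One way to inspect the result (added for illustration, not part of the original gist):
-- describe prints the schema of top3sal; dump runs the job and prints
-- up to three (sal, ename, empno, deptno) rows per department.
describe top3sal;
dump top3sal;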