Created October 6, 2011 14:48
My first Pig script - filter input by UUID, then sort by score
-- Filter rows in $inputFile by UUIDs present in $filterByFile, then sort by score
-- Load (uuid, score) rows; the alias avoids "input", which is a reserved word in Pig Latin
scores = LOAD '$inputFile' USING PigStorage('\t') AS (uuid:chararray, score:double);
filter_by = LOAD '$filterByFile' USING PigStorage('\t') AS (uuid:chararray);
-- Inner join keeps only the rows whose uuid appears in the filter list
filtered = JOIN scores BY uuid, filter_by BY uuid;
-- Keep only the columns from the scores side, dropping the duplicate join key
wanted_fields = FOREACH filtered GENERATE scores::uuid AS uuid, scores::score AS score;
ordered = ORDER wanted_fields BY score DESC;
STORE ordered INTO '$outputDir' USING PigStorage('\t');
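
Because the filter is an inner JOIN, a UUID that appears more than once in the filter file yields one output row per occurrence. The variant below is a minimal sketch of one way to avoid that, assuming duplicate filter entries are possible and unwanted. The aliases `scores` and `filter_ids` are names chosen here for illustration. The three parameters are the ones the script already expects.

```pig
-- Sketch only: same pipeline, but the filter list is de-duplicated first so a
-- UUID listed twice in the filter file does not produce duplicate output rows.
scores = LOAD '$inputFile' USING PigStorage('\t') AS (uuid:chararray, score:double);
filter_by = LOAD '$filterByFile' USING PigStorage('\t') AS (uuid:chararray);

-- Collapse repeated UUIDs into a single row each
filter_ids = DISTINCT filter_by;

-- Inner join: only scores rows with a matching uuid survive
filtered = JOIN scores BY uuid, filter_ids BY uuid;

-- Project the columns from the scores side of the join
wanted_fields = FOREACH filtered GENERATE scores::uuid AS uuid, scores::score AS score;

ordered = ORDER wanted_fields BY score DESC;
STORE ordered INTO '$outputDir' USING PigStorage('\t');
```

Either version can be run by passing the three parameters on the command line, for example `pig -x local -param inputFile=scores.tsv -param filterByFile=wanted_uuids.tsv -param outputDir=out filter_and_sort.pig`. The file names in that command are hypothetical. If the filter list is small enough to fit in memory, Pig's fragment-replicate join (`USING 'replicated'`, with the small relation listed last) may be worth trying.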