Created
October 3, 2012 20:35
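-- Count requests per HTTP status (200, 302, 404) in space-delimited request logs.
-- $input and $output are Pig parameters: pig -param input=<path> -param output=<path> <script>
LOG_FIELDS = LOAD '$input' USING PigStorage(' ') AS (hostname:chararray, udplog_sequence:chararray, timestamp:chararray, request_time:chararray, remote_addr:chararray, http_status:chararray, bytes_sent:chararray, request_method:chararray, uri:chararray, proxy_host:chararray, content_type:chararray, referer:chararray, x_forwarded_for:chararray, user_agent);
-- Keep only the status column.
STATUS = FOREACH LOG_FIELDS GENERATE http_status;
-- 'matches' must cover the whole string, hence the .* on both sides of the alternation.
FILTERED_STATUS = FILTER STATUS BY ($0 matches '.*(404|200|302).*');
-- After GROUP, $0 is the group key (the status) and $1 is the bag of grouped rows.
STATUS_COUNT = FOREACH (GROUP FILTERED_STATUS BY $0 PARALLEL 28) GENERATE $0, COUNT($1) AS num;
STATUS_COUNT_SORTED = ORDER STATUS_COUNT BY num DESC;
STORE STATUS_COUNT_SORTED INTO '$output';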