Created
October 10, 2012 19:56
Gist ottomata/3868011
Group by referer and count requests, filtered to URIs matching BannerController
-- Load space-delimited log lines; $input and $output are Pig parameters supplied at run time.
LOG_FIELDS = LOAD '$input' USING PigStorage(' ') AS (hostname:chararray, udplog_sequence:chararray, timestamp:chararray, request_time:chararray, remote_addr:chararray, http_status:chararray, bytes_sent:chararray, request_method:chararray, uri:chararray, proxy_host:chararray, content_type:chararray, referer:chararray, x_forwarded_for:chararray, user_agent:chararray);
-- Keep only requests whose URI mentions BannerController.
LOG_FIELDS = FILTER LOG_FIELDS BY (uri matches '.*BannerController.*');
REFERER = FOREACH LOG_FIELDS GENERATE referer;
-- Count requests per referer, then sort by count, highest first.
REFERER_COUNT = FOREACH (GROUP REFERER BY referer PARALLEL 7) GENERATE group AS referer, COUNT(REFERER) AS num;
COUNT_SORTED = ORDER REFERER_COUNT BY num DESC;
DUMP COUNT_SORTED;
STORE COUNT_SORTED INTO '$output';
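For checking the logic on a small log sample without a cluster, the same filter, group, count and sort steps can be sketched in plain Python. This is not part of the gist: `count_referers` and the two index constants are hypothetical names, with the field positions taken from the LOAD schema above. Unlike Pig, which fills missing fields with nulls, this sketch simply skips lines that are too short to contain a referer.

```python
from collections import Counter

# 0-based positions of `uri` and `referer` in the LOAD schema above.
URI_INDEX = 8
REFERER_INDEX = 11


def count_referers(lines):
    """Return (referer, count) pairs for BannerController requests, highest count first."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(" ")
        if len(fields) <= REFERER_INDEX:
            # Assumption: drop truncated lines rather than model Pig's null handling.
            continue
        if "BannerController" in fields[URI_INDEX]:
            counts[fields[REFERER_INDEX]] += 1
    # most_common() sorts by count descending, like ORDER ... BY num DESC.
    return counts.most_common()
```

Passing an open file object (or any iterable of lines) to `count_referers` should give counts comparable to the DUMP output on the same input, although the order of referers with equal counts may differ.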