@neilkod
Created September 9, 2011 14:53
python_from_pig.py
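#!/opt/cnet-python/default-2.6/bin/python
# Embedded Pig: this script runs under Pig's Jython engine, not plain CPython.
from org.apache.pig.scripting import *

# Pig Latin template: load the input, group everything into one bag,
# COUNT it, and store the result. $input_file and $output_dir are parameters.
P = Pig.compile("""
raw = load '$input_file' using PigStorage();
grpd = GROUP raw ALL;
cntd = FOREACH grpd GENERATE COUNT(raw);
store cntd INTO '$output_dir' USING PigStorage();
""")

# Values for the $input_file and $output_dir parameters above.
input_file='data/warehouse/facts/page_events/day=2011-07-31/part-00087.gz'
output_dir='python_cntd4'

# bind() with no arguments fills each $parameter from the Python variable
# of the same name; runSingle() runs the job and returns its statistics.
stats = P.bind().runSingle()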