@t3rmin4t0r
Last active August 29, 2015 14:16
AM history parser for reducer skew checks
import sys
import re


def Counter(name):
    """Return a function that extracts the value of `name=value` from a line."""
    pattern = re.compile("%s=([^,]*)" % re.escape(name))

    # `get` closes over `pattern`, so the regex is compiled once per counter.
    def get(l):
        m = pattern.search(l)
        if m:
            return m.group(1)
        return None
    return get


# Build the extractors once, not once per input line.
groups = Counter("REDUCE_INPUT_GROUPS")
records = Counter("REDUCE_INPUT_RECORDS")
time = Counter("timeTaken")
node = Counter("nodeId")
vertex = Counter("vertexName")
attempt = Counter("taskAttemptId")
start = Counter("startTime")

# Print the CSV header once, before any rows.
print("vertex, attempt, start, time, groups, records")
for l in sys.stdin:
    if "TASK_ATTEMPT_FINISHED" in l:
        print(",".join(map(str, [vertex(l), attempt(l), start(l), time(l), groups(l), records(l)])))
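
The parser only emits one CSV row per finished task attempt; it does not compute skew itself. One possible follow-up, sketched below for Python 3, is to compare the largest reducer input against the median for each vertex. The function name `skew_by_vertex`, the max/median ratio as a skew measure, and the sample rows are all assumptions made for illustration, not part of the original gist.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def skew_by_vertex(csv_text):
    """Return {vertex: max_records / median_records} (hypothetical helper)."""
    per_vertex = defaultdict(list)
    # skipinitialspace handles the ", "-separated header the parser prints.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        # Attempts whose line lacks the counter are printed as "None".
        if row["records"] in ("None", "", None):
            continue
        per_vertex[row["vertex"]].append(int(row["records"]))
    return {v: max(r) / median(r) for v, r in per_vertex.items() if median(r) > 0}


# Invented sample output; vertex names, attempt ids and numbers are made up.
sample = """vertex, attempt, start, time, groups, records
Reducer 2,attempt_1,100,50,10,1000
Reducer 2,attempt_2,100,55,12,1200
Reducer 2,attempt_3,100,400,11,9000
Map 1,attempt_4,90,30,None,None
"""

ratios = skew_by_vertex(sample)
print(ratios)
```

A ratio close to 1 suggests evenly sized reducer inputs, while a large ratio points at one attempt receiving far more records than its peers. The threshold that counts as "skewed" is left to the reader.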