@t3rmin4t0r
Created July 2, 2014 17:37
ORC stripe verifier
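import re
import sys

# Matches the stripe summary lines of an ORC file dump piped in on stdin,
# capturing the stripe's offset, data length, and row count.
S_RE = re.compile(r'Stripe: offset: ([0-9]+) data: ([0-9]+) rows: ([0-9]+)')
matches = [m for m in (S_RE.search(l) for l in sys.stdin) if m]
parsed = [tuple(int(g) for g in m.groups()) for m in matches]
stripe_size = 256*1024*1024  # 256 MiB; boundaries are multiples of this size
for (start, length, rows) in parsed:
    # Flag a stripe whose start and end offsets land in different blocks.
    if (start // stripe_size) != ((start + length) // stripe_size):
        print(start + length, "overflows", start, "block")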
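
The same check can be wrapped in a function so it runs against any iterable of lines instead of stdin. This is only a sketch: the name `find_overflows`, the `block_size` parameter, and the sample lines are illustrative assumptions and are not part of the original gist.

```python
import re

# Pattern modeled on the script's regex: offset, data length, row count.
STRIPE_RE = re.compile(r'Stripe: offset: ([0-9]+) data: ([0-9]+) rows: ([0-9]+)')


def find_overflows(lines, block_size=256 * 1024 * 1024):
    """Return (start, end) pairs for stripes whose start and end offsets
    fall in different blocks of block_size bytes (hypothetical helper)."""
    overflows = []
    for line in lines:
        m = STRIPE_RE.search(line)
        if not m:
            continue  # skip lines that are not stripe summaries
        start, length, _rows = (int(g) for g in m.groups())
        end = start + length
        if start // block_size != end // block_size:
            overflows.append((start, end))
    return overflows


# Made-up sample lines for illustration only (not real dump output).
sample = [
    "Stripe: offset: 3 data: 1000 rows: 10",
    "Stripe: offset: 268435000 data: 1000 rows: 10",
    "not a stripe line",
]
print(find_overflows(sample))  # → [(268435000, 268436000)]
```

Returning the pairs instead of printing them makes it easy to count the overflowing stripes or to try a different block size.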