@u8sand
Last active February 18, 2019 16:38
Online Office at my work likes to rewrite URLs in raw text into Proofpoint links. This script reverses the process.
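#!/usr/bin/env python3
# Usage:
#   no-proofpoint < broken_file.txt > fixed_file.txt
import re
import sys

# Matches a rewritten link and captures the encoded original URL from the `u` parameter.
proofpoint_re = re.compile(r'https?://.+?/url\?u=(.+?)&d=.+?&e=')
# Inside the encoded URL, `-XX` is a hex-encoded character (e.g. `-3A` is `:`).
# Only hex digits are accepted so that int(..., base=16) cannot raise ValueError.
encode_re = re.compile(r'-([0-9A-Fa-f]{2})')

stdin = sys.stdin.read()
prev = 0
for match in proofpoint_re.finditer(stdin):
    # Copy the text between the previous link and this one unchanged.
    sys.stdout.write(stdin[prev:match.start()])
    # `_` stands for `/` in the encoded URL.
    url = match.group(1).replace('_', '/')
    url_prev = 0
    for m in encode_re.finditer(url):
        sys.stdout.write(url[url_prev:m.start()])
        sys.stdout.write(chr(int(m.group(1), base=16)))
        url_prev = m.end()
    sys.stdout.write(url[url_prev:])
    prev = match.end()
# Copy whatever follows the last link.
sys.stdout.write(stdin[prev:])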
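For use as a library rather than a stdin filter, the same decoding can be sketched with `re.sub`. This is only an illustration: the function names `decode_url` and `fix_text` and the sample link (including its host name) are made up for this example and are not part of the script above. The escape pattern is narrowed to hex digits here.

```python
import re

# Same link pattern as the script; the escape pattern accepts hex digits only.
PROOFPOINT_RE = re.compile(r'https?://.+?/url\?u=(.+?)&d=.+?&e=')
ENCODE_RE = re.compile(r'-([0-9A-Fa-f]{2})')


def decode_url(encoded):
    """Decode one captured `u` value: `_` becomes `/`, `-XX` becomes chr(0xXX)."""
    url = encoded.replace('_', '/')
    return ENCODE_RE.sub(lambda m: chr(int(m.group(1), 16)), url)


def fix_text(text):
    """Replace every rewritten link in `text` with its decoded original URL."""
    return PROOFPOINT_RE.sub(lambda m: decode_url(m.group(1)), text)


# Hypothetical sample link, not taken from a real message.
sample = ('see https://urldefense.example.com/v2/url'
          '?u=https-3A__example.com_a-2Db&d=X&c=Y&e= now')
print(fix_text(sample))  # prints: see https://example.com/a-b now
```

Text without any rewritten link passes through `fix_text` unchanged, so it should be safe to run over whole files.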