Last active
February 18, 2019 16:38
Online Office at my work likes to rewrite URLs in raw text. This script reverses the process.
#!/usr/bin/env python3
# Usage:
#   no-proofpoint < broken_file.txt > fixed_file.txt
import re
import sys

# A rewritten link looks like <scheme>://<host>/.../url?u=<encoded>&d=...&e=
proofpoint_re = re.compile(r'https?://.+?/url\?u=(.+?)&d=.+?&e=')
# "-XX" escapes; restricted to hex digits so that a stray "-zz" cannot
# make int(..., base=16) raise ValueError
encode_re = re.compile(r'-([0-9A-Fa-f]{2})')

stdin = sys.stdin.read()
prev = 0
for match in proofpoint_re.finditer(stdin):
    # Copy the untouched text before this rewritten URL
    sys.stdout.write(stdin[prev:match.start()])
    # "_" stands in for "/" in the encoded URL
    url = match.group(1).replace('_', '/')
    url_prev = 0
    for m in encode_re.finditer(url):
        sys.stdout.write(url[url_prev:m.start()])
        # Turn "-XX" back into the character with hex code XX
        sys.stdout.write(chr(int(m.group(1), base=16)))
        url_prev = m.end()
    sys.stdout.write(url[url_prev:])
    prev = match.end()
# Copy whatever follows the last rewritten URL
sys.stdout.write(stdin[prev:])
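
For testing outside a shell pipeline, the same decoding can be sketched as a function built on `re.sub` with a callback. This is only an illustration, not part of the original script: the names `unwrap_urls`, `WRAPPED_RE`, and `ESCAPE_RE` are hypothetical, and the sample link (including its host and its `d`/`c` values) is made up to match the pattern the script expects.

```python
import re

# Same wrapper shape the script matches: .../url?u=<encoded>&d=...&e=
WRAPPED_RE = re.compile(r'https?://.+?/url\?u=(.+?)&d=.+?&e=')
# "-XX" escapes, restricted to hex digits (an assumption of this sketch)
ESCAPE_RE = re.compile(r'-([0-9A-Fa-f]{2})')


def unwrap_urls(text):
    """Replace every rewritten URL in text with its decoded original."""
    def decode(match):
        # "_" stands in for "/"; "-XX" is the character with hex code XX
        url = match.group(1).replace('_', '/')
        return ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), url)
    return WRAPPED_RE.sub(decode, text)


# Hypothetical rewritten link, invented for illustration only.
sample = ('see https://urldefense.example.com/v2/url'
          '?u=https-3A__example.com_a-2Db&d=X&c=Y&e= now')
print(unwrap_urls(sample))
```

Text without a rewritten link passes through unchanged, so the function can be applied to a whole file's contents in one call.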