@aboettger
Last active August 29, 2015 14:03
Find locomotive classes (Baureihen) in a PDF
# Needs pdftotext (poppler-utils), GNU sed and GNU grep (-P).
pdftotext "input.pdf" - |
  sed -e 's/\xC2\xA0/ /g' |   # UTF-8 no-break space -> space
  tr '\n' ' ' |
  sed -e 's/[[:space:]]\{2,\}/ /g' |
  grep -oaP '(die|Baureihe|VT|VB|E|V|BR|DB)[[:space:]]*[0-9]{2,3}[[:space:]]*[0-9]*' |
  sed -e 's/^ *//' -e 's/ *$//' -e 's/^Baureihe/BR/' -e 's/^die/BR/' |   # trim, normalise prefix to BR
  sort -b | uniq |
  paste -sd, -   # comma-separated, no trailing comma
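
To try the matching stage without a PDF, the text-processing part of the pipeline can be wrapped in a function and fed a sample string. This is only a sketch: the function name `extract_classes` and the sample sentence are made up for illustration. It uses `grep -E` instead of `-P` (the pattern needs no Perl-only syntax) and joins the result with `paste -sd, -` so that no trailing comma is left.

```shell
#!/bin/sh
# Hypothetical sample text standing in for the output of pdftotext.
sample='Die Baureihe 103 und die 218 fuhren neben der V 200; die BR 218 und die E 10 folgten.'

# extract_classes: read text on stdin, print the class designations found,
# normalised, de-duplicated and comma-separated (illustrative helper, not part of the gist).
extract_classes() {
  tr '\n' ' ' |
    sed -e 's/[[:space:]]\{2,\}/ /g' |
    grep -oaE '(die|Baureihe|VT|VB|E|V|BR|DB)[[:space:]]*[0-9]{2,3}[[:space:]]*[0-9]*' |
    sed -e 's/^ *//' -e 's/ *$//' -e 's/^Baureihe/BR/' -e 's/^die/BR/' |
    sort -b | uniq |
    paste -sd, -
}

result=$(printf '%s\n' "$sample" | extract_classes)
echo "$result"
```

Note that the prefix list is matched without word boundaries and "die" is rewritten to "BR", so ordinary phrases such as "die 20 Wagen" would also be reported as a class; the output is best treated as a candidate list to review by hand.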