@oleksii-demedetskyi
Created March 25, 2015 17:06
Parse the HTML from a Google Doc public link into plain text.
grep '<tbody>' "$1" | # select the line that contains the table
sed 's/^.*<tbody>//' | # drop the HTML before the table
sed 's/<\/tbody>.*$//' | # drop the HTML after the table
sed 's/<\/tr>/<\/tr>\'$'\n/g' | # put each table row on its own line
sed 's/<[^>]*>/|/g' | # replace every HTML tag with '|'
sed 's/^|\{3\}[0-9]*|\{3\}//' | # drop the leading line number
sed 's/||/'$'\t/g' # replace each '||' with a tab
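
To see what the pipeline does end to end, here is a minimal sketch that wraps the same steps in a function and runs them on a made-up sample. The function name `gdoc_table_to_tsv` and the sample markup are assumptions for illustration only; real Google Docs HTML will differ in detail. The sketch uses `printf` and a literal newline instead of `$'...'` quoting so that it also runs under a plain POSIX `sh`.

```shell
#!/bin/sh
# Sketch only: gdoc_table_to_tsv is a hypothetical name, and the sample
# markup below is invented, not real Google Docs output.

gdoc_table_to_tsv() {
    tab=$(printf '\t')   # a literal tab, without relying on $'\t'
    grep '<tbody>' "$1" |
    sed 's/^.*<tbody>//' |
    sed 's/<\/tbody>.*$//' |
    sed 's/<\/tr>/<\/tr>\
/g' |
    sed 's/<[^>]*>/|/g' |
    sed 's/^|\{3\}[0-9]*|\{3\}//' |
    sed "s/||/${tab}/g"
}

# Assumed layout: the line-number cell is wrapped in three tags on each side,
# which is what the '|||<digits>|||' pattern expects.
sample=$(mktemp)
printf '%s\n' '<html><table><tbody><tr><td><p>1</p></td><td>foo</td><td>bar</td></tr><tr><td><p>2</p></td><td>baz</td><td>qux</td></tr></tbody></table></html>' > "$sample"

gdoc_table_to_tsv "$sample"
rm -f "$sample"
```

On this sample the function prints one line per table row with the cells separated by tabs. Each row keeps a trailing tab, and the output ends with an empty line, because the closing tags of the last cell and the final `</tr>` split are not trimmed.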