@tobyontour
Created February 12, 2014 16:49
Simple Python script to validate HTML
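import sys
from urllib.request import urlopen  # Python 3 replacement for urllib2

import html5lib

# Fetch the page named on the command line; the with-block closes the connection.
with urlopen(sys.argv[1]) as f:
    data = f.read()

# Parsing fills parser.errors with (position, error code, format variables) tuples.
parser = html5lib.HTMLParser()
document = parser.parse(data)

errList = []
for pos, errorcode, datavars in parser.errors:
    # Look up the message template for this error code and fill in its variables.
    message = html5lib.constants.E.get(errorcode, 'Unknown error "%s"' % errorcode) % datavars
    errList.append("Line %i Col %i" % pos + " " + message)

sys.stdout.write("\nValidation errors:\n" + "\n".join(errList) + "\n")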
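
The formatting step in the loop above can be tried without a network connection or html5lib installed. The following is a minimal sketch, not part of the original script: `format_errors` is a hypothetical helper, `sample_templates` is a hand-written dict standing in for `html5lib.constants.E`, and `sample_errors` only mimics the shape of `parser.errors`.

```python
def format_errors(errors, templates):
    """Turn (position, code, variables) tuples into readable lines.

    `errors` mimics the shape of parser.errors; `templates` stands in
    for html5lib.constants.E (both are illustrative assumptions here).
    """
    lines = []
    for (line, col), code, variables in errors:
        template = templates.get(code)
        if template is None:
            # Unknown code: report it as-is, without %-formatting it again.
            message = 'Unknown error "%s"' % code
        else:
            # Known code: fill the template with its named variables.
            message = template % variables
        lines.append("Line %i Col %i %s" % (line, col, message))
    return lines


# Hypothetical sample data, not real html5lib output.
sample_templates = {"unexpected-end-tag": "Unexpected end tag (%(name)s)."}
sample_errors = [
    ((3, 14), "unexpected-end-tag", {"name": "div"}),
    ((7, 1), "made-up-code", {}),
]

for entry in format_errors(sample_errors, sample_templates):
    print(entry)
```

Handling the unknown-code case separately avoids running a second `%` substitution over a message that already embeds the raw error code, which could misbehave if the code ever contained a `%` character.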