@kirsn
Last active August 29, 2015 14:14
A quick test of the tld module: https://github.com/barseghyanartur/tld
"""
A test to see whether the tld module recognizes all the valid domains.
"""
import requests  # http://docs.python-requests.org/en/latest/
import tld  # https://pypi.python.org/pypi/tld

# Mozilla-curated list of public suffixes. Used by the tld module.
tlds = requests.get("https://publicsuffix.org/list/effective_tld_names.dat")
valid_urls = 0
invalid_urls = 0
invalid_domains = ""
print("<html><body>")
print("<h2>%s</h2>" %
      "Domains NOT recognized by tld module, though the domain is VALID")
for line in tlds.text.split('\n'):
    # Ignore comment lines and blank lines
    if not line.startswith("//") and len(line.strip()) > 0:
        url = "http://test." + line.strip()
        try:
            tld.get_tld(url)
            valid_urls += 1
        except tld.exceptions.TldDomainNotFound:
            invalid_urls += 1
            # Escape non-ASCII characters as XML character references,
            # then decode back to text so no bytes repr ends up in the HTML
            invalid_domains += "%s, " % line.strip().encode(
                'ascii', 'xmlcharrefreplace').decode('ascii')
print(invalid_domains)
print("<h2>Some numbers</h2>")
print("<b>Invalid domains according to the <i>tld module</i>: %d</b>" %
      invalid_urls)
print("Valid domains according to the <i>tld module</i>: %d" %
      valid_urls)
print("</body></html>")
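The line-filtering step in the script above can be pulled out and checked without network access or the tld module. The following is only a minimal sketch: the helper name `candidate_urls` and the sample text are hypothetical and not part of the original gist.

```python
def candidate_urls(suffix_list_text):
    """Yield (suffix, url) pairs for each rule line in a public suffix list.

    Hypothetical helper: mirrors the filtering done in the script above
    (skip "//" comment lines and blank lines, prefix "http://test.").
    """
    for raw in suffix_list_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue
        yield line, "http://test." + line


# Made-up sample in the same format as effective_tld_names.dat
sample = "// a comment line\n\ncom\nco.uk\n"
pairs = list(candidate_urls(sample))
for suffix, url in pairs:
    print(suffix, url)
```

Each yielded URL could then be passed to `tld.get_tld` exactly as the script does. Note that, like the original, this sketch does not treat wildcard (`*.`) or exception (`!`) rules in the list specially.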
@barseghyanartur

Hey, thanks!

I'll check it as soon as I get some free time.

Best regards,

Artur
