Last active
October 31, 2020 19:28
A util function to check whether a given URL exists or not
import urllib.error
import urllib.request

# NOTE: Python 3.6+ installations on macOS require an extra step to work with
# https links due to certificate access issues.
# Here is a snippet from the release notes.
#   This package includes its own private copy of OpenSSL 1.1.1.
#   The trust certificates in system and user keychains managed by the Keychain Access application and the security
#   command line utility are not used as defaults by the Python ssl module. A sample command script is included
#   in /Applications/Python 3.9 to install a curated bundle of default root certificates from the third-party
#   certifi package (https://pypi.org/project/certifi/). Double-click on Install Certificates to run it.
# See https://www.python.org/downloads/release/python-360/


def check_url_exists(url: str) -> bool:
    """
    Returns True if the given link exists and is accessible, False otherwise.
    """
    # We only need to know whether the URL exists.
    # The default method is GET, which also fetches the data associated
    # with the URL. HEAD fetches only the headers, which is enough
    # for this function, so the method is set to HEAD in the request object.
    # See https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods
    request = urllib.request.Request(url, method='HEAD')
    try:
        with urllib.request.urlopen(request) as response:
            # Some helpful information to remember:
            # response.info() returns an http.client.HTTPMessage instance.
            # For an HTTP response, this carries the header information.
            # print(response.info())
            # response.read() returns the response body, which is empty
            # for a HEAD request.
            # print(response.read())
            return True
    except urllib.error.HTTPError:
        # Checking for HTTPError only.
        # A different error (e.g. urllib.error.URLError or a socket error)
        # is raised if there is a network error.
        return False


# Tested with Python 3.6+ versions
# print(check_url_exists("https://unsplash.com/photos/-zvx4EoPRDw"))
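
The function above catches only HTTPError, so a DNS failure, a refused connection, or a malformed URL still raises an exception. The following is a minimal sketch of a variant that reports those cases as False too. The name `check_url_reachable` and the `timeout` parameter are hypothetical additions for illustration and are not part of the original gist.

```python
import urllib.error
import urllib.request


def check_url_reachable(url: str, timeout: float = 5.0) -> bool:
    """
    Hypothetical variant of check_url_exists: returns False instead of
    raising when the URL is malformed or the network request fails.
    """
    try:
        # Request() raises ValueError for a URL without a scheme,
        # so it is constructed inside the try block.
        request = urllib.request.Request(url, method='HEAD')
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status (4xx or 5xx).
        return False
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses,
        # so this covers DNS failures, refused connections and timeouts.
        return False
    except ValueError:
        # Malformed URL, e.g. "unknown url type".
        return False
```

Whether a network failure should count as "does not exist" is a design choice. Callers that need to tell an unreachable host apart from a missing page may prefer the original behavior, where only HTTPError is caught and other errors propagate.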