@dehowell
Created March 23, 2011 22:49
Python function to test if a file at a URL exists.
import urllib2

def file_exists(location):
    request = urllib2.Request(location)
    request.get_method = lambda: 'HEAD'
    try:
        response = urllib2.urlopen(request)
        return True
    except urllib2.HTTPError:
        return False
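
A quick usage sketch (Python 2; example.com is only an illustrative URL, not from the gist):

# Hypothetical usage; any URL works the same way.
print file_exists('http://example.com/')             # True if the server answers the HEAD request
print file_exists('http://example.com/missing.txt')  # False when the server replies with an HTTP error such as 404
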
@cbdelavenne

I rewrote this function for Python 3:

import urllib.request

def url_is_alive(url):
    """
    Checks that a given URL is reachable.
    :param url: A URL
    :rtype: bool
    """
    request = urllib.request.Request(url)
    request.get_method = lambda: 'HEAD'

    try:
        urllib.request.urlopen(request)
        return True
    except urllib.request.HTTPError:
        return False

For it to be compatible with both Python 2 and 3, I think you could just import urllib from six: from six.moves import urllib
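
A minimal sketch of that six-based idea, assuming six is installed; note that under six.moves the exception class lives in urllib.error:

from six.moves import urllib  # pip install six; maps to urllib2 on Python 2 and urllib.request on Python 3

def url_is_alive(url):
    """Check that a given URL answers a HEAD request (Python 2 and 3)."""
    request = urllib.request.Request(url)
    request.get_method = lambda: 'HEAD'
    try:
        urllib.request.urlopen(request)
        return True
    except urllib.error.HTTPError:
        return False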

@Magalame

Thank you very much for porting it

@pohvak

pohvak commented Jul 3, 2018

I'm testing this with the file at http://refriauto.com.mx/2015/img/Articulos/b613ae3efdf96ddd7e1a01a11ab0c3112513fd93.jpg
It returns True, but the file doesn't exist (I deleted it). What am I doing wrong?

[Three screenshots taken 2018-07-03 showing the test and its output]

@dsmurf

dsmurf commented Sep 21, 2018

I get no return. I use the Python 3 script. I set url = https://some.url.file.zip but get neither True nor False.

(quoting @cbdelavenne's Python 3 rewrite above)

@nahidalam

(quoting @cbdelavenne's Python 3 rewrite above)

I tried it with the URL https://images.lululemon.com/is/image/lululemon/LW9BAWR_0919_1

The program hangs and the operation times out.

@loethen

loethen commented Dec 25, 2020

Hi guys, I rewrote this function:

import urllib.request

def url_is_alive(url):
    """
    Checks that a given URL is reachable.
    :param url: A URL
    :rtype: bool
    """
    try:
        response = urllib.request.urlopen(url)
        status_code = response.getcode()
        return status_code == 200
    except urllib.request.HTTPError:
        return False

@glenn-jocher

glenn-jocher commented May 12, 2022

Updated version I wrote:

import urllib.request

def is_urlfile(url):
    # Check if online file exists
    try:
        r = urllib.request.urlopen(url)  # response
        return r.getcode() == 200
    except urllib.request.HTTPError:
        return False

@meryemCH

Hello,
I want to fetch a file from a server that requires authentication. I tried this:

values = {"username": "user", "password": "password"}
try:
    r = urllib.request.urlopen(url, values)  # response
    return r.getcode() == 200
except urllib.request.HTTPError:
    return False

but that doesn't work. Can you help me please?
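
One likely reason that snippet fails is that the second argument to urlopen is a request body (bytes), not credentials. A minimal sketch of one way to do it, assuming the server uses HTTP Basic authentication; the helper name and the hard-coded credentials are placeholders, and other auth schemes need a different approach:

import base64
import urllib.error
import urllib.request

def url_is_alive_basic_auth(url, username, password):
    """Check a URL protected by HTTP Basic auth (hypothetical helper, not from the gist)."""
    request = urllib.request.Request(url)
    # Basic auth sends "username:password" base64-encoded in the Authorization header.
    token = base64.b64encode('{}:{}'.format(username, password).encode('utf-8')).decode('ascii')
    request.add_header('Authorization', 'Basic ' + token)
    try:
        urllib.request.urlopen(request)
        return True
    except urllib.error.HTTPError:
        return False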

@HGStyle

HGStyle commented Oct 29, 2024

For all the people saying it "doesn't return anything": make sure to include a timeout in the function. I don't really know how to do it using urllib, but I know it is possible using Requests:

import requests

def is_online(url: str) -> bool:
    """
    Checks if the document at the given URL is online.
    """
    try:
        return requests.head(url, timeout=3).status_code // 100 == 2
    except Exception:
        return False

If it still doesn't work, it might be the web server: sometimes servers simply reject HEAD requests because they are mostly used by bots. In that case, use a GET request instead (replace "requests.head" with "requests.get" in the code I wrote), but note that it will also download the response body. That shouldn't be a problem in most cases, but if it's a file to download then you're kinda screwed, because Python will have to load that file into memory, which can take a lot of space (and that's why HEAD requests exist, even though they're sometimes rejected, as I said).
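
For reference, a timeout can also be set with plain urllib, since urlopen accepts a timeout argument in seconds. A minimal sketch along the lines of the HEAD-based versions above (the 3-second default is only an example):

import socket
import urllib.error
import urllib.request

def url_is_alive(url, timeout=3):
    """HEAD-check a URL, giving up after `timeout` seconds instead of hanging."""
    request = urllib.request.Request(url, method='HEAD')  # method= is available on Python 3.3+
    try:
        urllib.request.urlopen(request, timeout=timeout)
        return True
    except (urllib.error.URLError, socket.timeout):
        # URLError also covers HTTPError; socket.timeout covers reads that exceed the timeout.
        return False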
