@kasramp
Last active April 19, 2016 01:19
Extract all images of a webpage, together with their hyperlinks, using Python
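import re
import urllib.request

# Fetch the page and decode it so the regex runs on text (Python 3).
source = urllib.request.urlopen('http://www.cbssports.com/nba/draft/mock-draft').read().decode('utf-8', errors='replace')
actually_download = True  # set to False to only record the links in out.txt
with open('out.txt', 'w') as f:
    # Match the 30x30 team logo URLs; the dots are escaped so they match literally.
    for link in re.findall(r'http://sports\.cbsimg\.net/images/nba/logos/30x30/[A-Z]*\.png', source):
        f.write(link + '\n')
        if actually_download:
            # Save each image under the last segment of its URL.
            filename = link.split('/')[-1]
            urllib.request.urlretrieve(link, filename)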
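The snippet above only matches one hard-coded logo URL pattern. For pages where the image URLs do not follow a known pattern, a rough sketch using the standard-library `html.parser` could collect every `<img src>` instead. This is not part of the original gist: `ImageLinkParser`, `extract_image_urls`, and the sample HTML are hypothetical names and data chosen for illustration, and the sketch assumes the page's images are present in the static HTML rather than loaded by JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkParser(HTMLParser):
    """Collects the absolute URL of every <img src="..."> it sees (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != 'img':
            return
        src = dict(attrs).get('src')
        if src:
            # Resolve relative paths against the URL of the page itself.
            self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return all image URLs found in an HTML string, in document order."""
    parser = ImageLinkParser(base_url)
    parser.feed(html)
    return parser.image_urls


# Made-up inline sample so the sketch runs without network access.
sample_html = '<p><img src="/logos/a.png"><img src="http://example.com/b.png"></p>'
image_urls = extract_image_urls(sample_html, 'http://example.com/page')
```

To use it on a live page, pass the decoded `source` string and the page URL to `extract_image_urls`, then write or download each result as in the loop above.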