@meskarune
Created April 29, 2016 19:48
Concurrent operations in Python
#!/usr/bin/env python3
import concurrent.futures
import urllib.error
import urllib.request

from bs4 import BeautifulSoup

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://google.com/',
        'http://www.bbc.co.uk/',
        'http://www.pinterest.com/',
        'http://tumblr.com/',
        'http://reddit.com',
        'http://imgur.com/']

def get_data(url):
    """Fetch a URL, returning the response object (or the error on failure)."""
    try:
        # URLError is the parent of HTTPError, so this catches both
        # HTTP error statuses and connection-level failures
        data = urllib.request.urlopen(url, timeout=30)
    except urllib.error.URLError as e:
        data = e
    return data

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    # map() runs get_data on each URL across the worker threads and
    # yields the results in the same order as URLS
    responses = executor.map(get_data, URLS)
    for response in responses:
        try:
            title = BeautifulSoup(response.read(),
                                  "html.parser").title.string.replace('\n', ' ').strip()
            print(title)
        except Exception as e:
            print("error:", e)
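A note on `executor.map`: it yields results in input order, so one slow site can delay everything queued after it. When order doesn't matter, `executor.submit` plus `concurrent.futures.as_completed` hands back each future as soon as it finishes. A minimal sketch of that pattern, using a hypothetical stand-in function (`get_title_length`) in place of the network fetch so it runs without connectivity:

```python
import concurrent.futures

def get_title_length(url):
    # Hypothetical stand-in for get_data(): deterministic, no network needed
    return len(url)

URLS = ['http://example.com/', 'http://example.org/']

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    # submit() returns a Future per task; the dict maps each future
    # back to the URL that produced it
    futures = {executor.submit(get_title_length, url): url for url in URLS}
    # as_completed() yields futures in completion order, not input order,
    # so fast results are not blocked behind slow ones
    for future in concurrent.futures.as_completed(futures):
        url = futures[future]
        try:
            print(url, future.result())
        except Exception as exc:
            # result() re-raises any exception from the worker thread
            print(url, "raised", exc)
```

Calling `future.result()` inside a try block also gives per-task error handling for free, which replaces the bare `except` in the original loop.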