@MarounMaroun
Last active December 18, 2017 21:45
AWS instance types extractor
from urllib.request import urlopen

from lxml import etree

# Fetch the EC2 instance types page and parse it as HTML.
with urlopen("https://aws.amazon.com/ec2/instance-types") as web:
    html = etree.HTML(web.read())

# Rows of the instance table. This absolute XPath depends on the page layout and breaks if the layout changes.
tr_nodes = html.xpath('//*[@id="aws-page-content"]/div/div/main/section/div[2]/div[43]/div/table/tbody/tr')

for tr_node in tr_nodes:
    # Cell text sits either directly in <td> or inside a nested <p>.
    cells = tr_node.xpath("td")
    if not (cells and cells[0].text):
        cells = tr_node.xpath("td/p")
    # Print the first three columns, tab-separated; skip rows without them.
    if len(cells) >= 3 and cells[0].text:
        print('\t'.join(cell.text or '' for cell in cells[:3]))
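
The row loop above reads cell text from `td` first and falls back to `td/p`. A minimal offline sketch of that fallback follows, assuming a small well-formed sample table. It swaps lxml for the standard library's `xml.etree.ElementTree` so it runs without a network connection or extra packages; `SAMPLE` and `extract_rows` are hypothetical names, not part of the gist, and the sample rows are illustrative rather than scraped.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample rows (illustrative, not scraped from the AWS page).
SAMPLE = """
<table>
  <tr><td>t2.nano</td><td>1</td><td>0.5</td></tr>
  <tr><td><p>m4.large</p></td><td><p>2</p></td><td><p>8</p></td></tr>
  <tr><td></td><td></td><td></td></tr>
</table>
"""


def extract_rows(table_xml):
    """Return the first three cell texts of each row, trying <td> and then <td>/<p>."""
    rows = []
    for tr in ET.fromstring(table_xml.strip()).findall("tr"):
        cells = tr.findall("td")
        # Fall back to nested <p> elements when the first <td> has no direct text.
        if not (cells and cells[0].text and cells[0].text.strip()):
            cells = tr.findall("td/p")
        # Skip rows that do not provide three usable cells.
        if len(cells) >= 3 and cells[0].text:
            rows.append([(c.text or "").strip() for c in cells[:3]])
    return rows


for row in extract_rows(SAMPLE):
    print("\t".join(row))
```

Returning the rows as lists instead of printing inside the loop keeps the parsing step separate from the output step, so the same function could feed a CSV writer or a test.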