@Alex-Huleatt
Last active April 18, 2018 21:52
'''
Google deprecated their search API.
Google results pages do not immediately contain result URLs (I checked), so this loads the page in a real browser and scrapes the rendered source.
Here is a really bad script that gets the first page of results, plus a bunch of other stupid irrelevant URLs.
You need selenium, lxml, and Firefox.
Works for me on OS X.
Suck it Google, you can't control me.
Note: This might be against some Google TOS, or get you blocked or banned or something, idk.
@AlexHuleatt
'''
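from urllib.parse import quote_plus

import lxml.html
from selenium import webdriver


def query(q):
    driver = webdriver.Firefox()
    try:
        # URL-encode the query so spaces and special characters survive.
        driver.get("https://www.google.com/search?q=" + quote_plus(q))
        src = driver.page_source
    finally:
        # quit() shuts down the browser even if the page load raised.
        driver.quit()
    dom = lxml.html.fromstring(src)
    # Keep absolute links only; Google's own navigation links still get through.
    return [href for href in dom.xpath("//a/@href") if href.startswith("http")]


if __name__ == "__main__":
    print(query("llama"))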
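
The script returns every absolute link on the page, which is where the "stupid irrelevant urls" come from. One way to trim them, sketched here under the assumption that "irrelevant" means anything hosted on google.com or a subdomain of it, is a small post-filter on the returned list. The helper name `drop_google_links` and the sample URLs are made up for illustration and are not part of the original gist.

```python
from urllib.parse import urlparse


def drop_google_links(urls):
    """Keep only links whose host is not google.com or one of its subdomains."""
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        # Assumption: anything served from a Google host is noise, not a result.
        if host == "google.com" or host.endswith(".google.com"):
            continue
        kept.append(url)
    return kept


# Hypothetical sample input standing in for the list returned by query().
sample = [
    "https://www.google.com/preferences",
    "https://en.wikipedia.org/wiki/Llama",
    "https://accounts.google.com/ServiceLogin",
    "http://example.org/llama",
]
print(drop_google_links(sample))
```

With the real script this would be called as `drop_google_links(query("llama"))`. It only filters by host, so other Google-owned domains and ad links would still slip through.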