@korakot
Last active October 8, 2024 14:39
Use selenium in Colab
# install Google Chrome, a chromedriver, and selenium
!apt update
!apt install libu2f-udev libvulkan1
!wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
!dpkg -i google-chrome-stable_current_amd64.deb
!wget https://edgedl.me.gvt1.com/edgedl/chrome/chrome-for-testing/118.0.5993.70/linux64/chromedriver-linux64.zip
!unzip -j chromedriver-linux64.zip chromedriver-linux64/chromedriver -d /usr/local/bin/
!pip install selenium chromedriver_autoinstaller
# set options to be headless, ..
from selenium import webdriver
import chromedriver_autoinstaller
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
chromedriver_autoinstaller.install()
# open it, go to a website, and get results
wd = webdriver.Chrome(options=options)
wd.get("https://www.website.com")
print(wd.page_source) # results
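# from selenium.webdriver.common.by import By
# divs = wd.find_elements(By.CSS_SELECTOR, 'div')  # find_elements_by_css_selector was removed in Selenium 4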
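The install step above pins chromedriver 118.0.5993.70 while downloading whatever the current stable Chrome is, and ChromeDriver generally expects a browser with the same major version. The sketch below is one way to check that and to rebuild the download URL for a different version. The helper names (`parse_version`, `same_major`, `chromedriver_url`, `installed_version`) are hypothetical, the URL pattern is copied from the `wget` line above and may not hold for other releases, and the `google-chrome` command name is an assumption about what the .deb package installs.

```python
import re
import shutil
import subprocess

# URL pattern copied from the wget line in the gist; only the version varies.
# Assumption: other versions follow the same layout, which may not be true.
CFT_URL = (
    "https://edgedl.me.gvt1.com/edgedl/chrome/chrome-for-testing/"
    "{version}/linux64/chromedriver-linux64.zip"
)


def parse_version(text):
    """Return the first dotted version number found in text, or None."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    return match.group(0) if match else None


def same_major(version_a, version_b):
    """True if both dotted version strings share the same major number."""
    return version_a.split(".")[0] == version_b.split(".")[0]


def chromedriver_url(version):
    """Build a chromedriver download URL for the given version string."""
    return CFT_URL.format(version=version)


def installed_version(binary):
    """Run `<binary> --version`; return None if the binary is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return parse_version(result.stdout)
```

In a Colab cell, something like `same_major(installed_version("google-chrome"), installed_version("chromedriver"))` could be run after the install step, once both values are confirmed not to be `None`. The gist itself sidesteps a mismatch by calling `chromedriver_autoinstaller.install()`.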
# I created my own library to make it even easier
!pip install kora -q
from kora.selenium import wd
wd.get("https://www.website.com")
print(wd.page_source) # results
# I added a few helpers
divs = wd.select("div") # CSS selector
div = divs[0]
span = div.select1("span") # returns the first result
wd # screenshot
@Ezra-Cohen

I just want to say thank you. I was trying everything I could for a project I was working on, and nothing worked until I came across this. You are genuinely amazing for making this.

@korakot
Author

korakot commented Apr 28, 2022

For reference, this method was first discovered here in Dec 2018.

@GColab2023

I am getting the error below when using Selenium with Chrome in Google Colab.

Code:

# install chromium, its driver, and selenium
!apt update
!apt install chromium-chromedriver
!pip install selenium
# set options to be headless, ..
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
# create a webdriver instance, ready to use
wd = webdriver.Chrome('chromedriver',options=options)

Error

WebDriverException Traceback (most recent call last)
in
6 options.add_argument('--disable-dev-shm-usage')
7 # create a webdriver instance, ready to use
----> 8 wd = webdriver.Chrome('chromedriver',options=options)

3 frames
/usr/local/lib/python3.8/dist-packages/selenium/webdriver/common/service.py in assert_process_still_running(self)
115 return_code = self.process.poll()
116 if return_code:
--> 117 raise WebDriverException(f"Service {self.path} unexpectedly exited. Status code was: {return_code}")
118
119 def is_connectable(self) -> bool:

WebDriverException: Message: Service chromedriver unexpectedly exited. Status code was: 1

Can someone help with this?

@TomekGitHubPrivate

Getting the same error as mentioned above :/

@korakot
Author

korakot commented Mar 26, 2024

I have updated the solution, so it should work again now.
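For anyone adapting the failing snippet from the comment above: it passes the driver path as the first positional argument to `webdriver.Chrome`, which Selenium 4 replaced with a `Service` object. The sketch below shows that style, with a check that the driver binary is actually on `PATH` before Selenium tries to start it. The names `COLAB_FLAGS` and `make_driver` are hypothetical, and this is not necessarily the cause of the "unexpectedly exited" error reported above, which only says the driver process quit on startup.

```python
import shutil

# The same three flags the gist sets on ChromeOptions.
COLAB_FLAGS = ["--headless", "--no-sandbox", "--disable-dev-shm-usage"]


def make_driver(driver_binary="chromedriver"):
    """Create a headless Chrome driver using the Selenium 4 Service style."""
    # Fail early with a clear message if the driver is not installed.
    driver_path = shutil.which(driver_binary)
    if driver_path is None:
        raise FileNotFoundError(f"{driver_binary} was not found on PATH")

    # Imported here so the helper can be defined before selenium is installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for flag in COLAB_FLAGS:
        options.add_argument(flag)
    return webdriver.Chrome(service=Service(driver_path), options=options)
```

After the install cell has run, `wd = make_driver()` would stand in for the `webdriver.Chrome(...)` call.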
