How to install Chrome, ChromeDriver and Selenium on CentOS. Plus a sample scraping script.
#!/usr/bin/env bash
# https://developers.supportbee.com/blog/setting-up-cucumber-to-run-with-Chrome-on-Linux/
# https://gist.github.com/curtismcmullan/7be1a8c1c841a9d8db2c
# http://stackoverflow.com/questions/10792403/how-do-i-get-chrome-working-with-selenium-using-php-webdriver
# http://stackoverflow.com/questions/26133486/how-to-specify-binary-path-for-remote-chromedriver-in-codeception
# http://stackoverflow.com/questions/40262682/how-to-run-selenium-3-x-with-chrome-driver-through-terminal
# http://askubuntu.com/questions/760085/how-do-you-install-google-chrome-on-ubuntu-16-04
# Versions
CHROME_DRIVER_VERSION=$(curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE)
SELENIUM_STANDALONE_VERSION=3.8.1
SELENIUM_SUBDIR=$(echo "$SELENIUM_STANDALONE_VERSION" | cut -d"." -f-2)
# Write the Google Chrome repository definition so yum can find the package.
cat >/etc/yum.repos.d/google-chrome.repo <<EOL
[google-chrome]
name=google-chrome
baseurl=http://dl.google.com/linux/chrome/rpm/stable/x86_64
enabled=1
gpgcheck=1
gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub
EOL
# Remove existing downloads and binaries so we can start from scratch.
# -y skips the yum prompt; -f keeps rm quiet when a file does not exist yet.
yum remove -y google-chrome-stable
rm -f ~/selenium-server-standalone-*.jar
rm -f ~/chromedriver_linux64.zip
rm -f /usr/local/bin/chromedriver
rm -f /usr/local/bin/selenium-server-standalone.jar
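# Install dependencies.
# CentOS names the Java 8 runtime java-1.8.0-openjdk-headless, and the Chrome
# repo above ships the browser as google-chrome-stable. wget is used below.
yum update -y
yum install -y wget unzip java-1.8.0-openjdk-headless xorg-x11-server-Xvfb libXi-devel GConf2-devel google-chrome-stable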
# Install ChromeDriver.
# The download is pinned to 2.35; CHROME_DRIVER_VERSION (the latest release) is not used here.
wget -N https://chromedriver.storage.googleapis.com/2.35/chromedriver_linux64.zip -P ~/
unzip ~/chromedriver_linux64.zip -d ~/
rm ~/chromedriver_linux64.zip
mv -f ~/chromedriver /usr/local/bin/chromedriver
chown root:root /usr/local/bin/chromedriver
chmod 0755 /usr/local/bin/chromedriver
# Install Selenium.
wget -N "http://selenium-release.storage.googleapis.com/$SELENIUM_SUBDIR/selenium-server-standalone-$SELENIUM_STANDALONE_VERSION.jar" -P ~/
mv -f ~/"selenium-server-standalone-$SELENIUM_STANDALONE_VERSION.jar" /usr/local/bin/selenium-server-standalone.jar
chown root:root /usr/local/bin/selenium-server-standalone.jar
chmod 0755 /usr/local/bin/selenium-server-standalone.jar
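
The Selenium download path in the install script depends on the major.minor prefix that `cut -d"." -f-2` extracts from the version string. Here is a minimal Python sketch of that derivation, plus a check for the two files the script installs. The helper names (`selenium_subdir`, `selenium_jar_url`, `missing_files`) are made up for illustration and are not part of the gist.

```python
import os

# Base URL taken from the wget call in the install script.
SELENIUM_BASE_URL = "http://selenium-release.storage.googleapis.com"


def selenium_subdir(version):
    """Return the major.minor prefix, mirroring `cut -d"." -f-2`."""
    return ".".join(version.split(".")[:2])


def selenium_jar_url(version):
    """Build the standalone-server download URL for a given version."""
    return "{}/{}/selenium-server-standalone-{}.jar".format(
        SELENIUM_BASE_URL, selenium_subdir(version), version
    )


def missing_files(paths):
    """Return the paths that do not exist as regular files."""
    return [path for path in paths if not os.path.isfile(path)]


if __name__ == "__main__":
    print(selenium_jar_url("3.8.1"))
    # Install locations used by the script.
    installed = [
        "/usr/local/bin/chromedriver",
        "/usr/local/bin/selenium-server-standalone.jar",
    ]
    for path in missing_files(installed):
        print("missing:", path)
```

Running it after the install script should print only the URL; any `missing:` line means that step of the install did not complete.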
import os
import time

from pyvirtualdisplay import Display
from selenium import webdriver

# Written against the Selenium 3 Python bindings. Selenium 4 removed the
# find_element_by_* helpers and the chrome_options keyword used below.

# Run Chrome inside a virtual X display (Xvfb), since the server has no screen.
display = Display(visible=0, size=(800, 800))
display.start()

chrome_options = webdriver.ChromeOptions()
# Needed when running as root: Chrome will not start as root with the sandbox enabled.
chrome_options.add_argument('--no-sandbox')
# Save downloaded files to the current working directory.
prefs = {'download.default_directory': os.getcwd()}
chrome_options.add_experimental_option('prefs', prefs)

# The driver path is optional; if it is omitted, Selenium searches PATH.
driver = webdriver.Chrome('/usr/local/bin/chromedriver', chrome_options=chrome_options)

try:
    # Scraping steps
    driver.get("http://pypi.python.org/pypi/selenium")
    time.sleep(3)
    driver.find_element_by_css_selector("#introduction table tbody tr:nth-child(3) td:nth-child(2) a").click()
    time.sleep(3)
    print(' [*] Finished!')
finally:
    # Always release the browser and the virtual display, even on errors.
    driver.quit()
    display.stop()
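
The sample script waits a fixed three seconds after clicking the download link. One possible alternative, sketched here with the standard library only, is to poll the download directory until a new file shows up. The helper name `wait_for_download` and its parameters are hypothetical, and the sketch assumes Chrome's habit of writing in-progress downloads with a `.crdownload` suffix.

```python
import os
import time


def wait_for_download(directory, known_files, timeout=30.0, poll_interval=0.5):
    """Poll `directory` until a finished file appears that is not in `known_files`.

    Returns the sorted list of new file names, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = set(os.listdir(directory))
        # Assumption: Chrome marks partial downloads with a .crdownload suffix.
        in_progress = any(name.endswith(".crdownload") for name in current)
        new_files = sorted(
            name
            for name in current - set(known_files)
            if not name.endswith(".crdownload")
        )
        if new_files and not in_progress:
            return new_files
        if time.monotonic() >= deadline:
            raise TimeoutError("no finished download in {!r}".format(directory))
        time.sleep(poll_interval)
```

In the scraping script, `known_files` would be `set(os.listdir(os.getcwd()))` captured just before the `click()` call, and the second `time.sleep(3)` would become `wait_for_download(os.getcwd(), known_files)`.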