Stéphane GRILLON sgrillon14

### 1. Clone your fork:
git clone git@github.com:YOUR-USERNAME/YOUR-FORKED-REPO.git
### 2. Add the original repository as a remote in your fork:
cd into/cloned/fork-repo
git remote add upstream https://github.com/ORIGINAL-DEV-USERNAME/REPO-YOU-FORKED-FROM.git -t BRANCH-NAME
git fetch upstream
sgrillon14 / web-scraping-java-jsoup-htmlunit-jaunt-uij-selenium-phantomjs.md
Last active March 3, 2018 22:33
Web Scraping with Java: JSoup - HtmlUnit - Jaunt - ui4j - Selenium - PhantomJS

JSoup

JSoup is an HTML parser: it cannot control a web page, only parse its content. It supports CSS selectors only. You select elements with jQuery-like CSS selectors, and a slick API lets you traverse the HTML DOM tree to reach the elements of interest. This DOM traversal is JSoup's major strength. It can be used in web applications.

HtmlUnit

HtmlUnit is a "GUI-Less browser for Java programs". It can simulate the behaviour of Chrome, Firefox or Internet Explorer. It is a lightweight solution without many dependencies. It generally supports JavaScript and cookies, although it may fail in some cases. HtmlUnit is used for testing and web scraping, and is the basis for other tools. You can simulate pretty much anything a browser can do, such as click events and submit events. It is much more than an HTML parser, and is ideal for automated unit testing of web applications. It supports XPath, but the problem starts when you try to extrac

import java.awt.AWTException;
import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
sgrillon14 / gist:18bcb15cb045c557b0e01c81faf6ceb4
Last active May 27, 2020 07:42
download old stable builds of Chromium
Look up the version number (for example "44.0.2403.157") in the Position Lookup.
In this case it returns a base position of "330231". This is the commit at which the 44 release was branched, back in May 2015.
Open the continuous builds archive.
Click through to your platform (Linux/Mac/Win).
Paste "330231" into the filter field at the top and wait for all the results to load via XHR.
Eventually I get a perfect hit: https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Mac/330231/
Sometimes there is no exact hit, and you may have to decrement the commit number until you find a build.
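The lookup steps above can be sketched in Java as a small helper that builds the archive URL for a base position and lists the next lower positions to try. This is only an illustration: the class and method names (`ChromiumSnapshotUrls`, `snapshotUrl`, `candidateUrls`) and the `maxAttempts` limit are hypothetical, the platform folder name is passed through unchanged, and the code does not check whether a build actually exists at any of the URLs.

```java
import java.util.ArrayList;
import java.util.List;

public class ChromiumSnapshotUrls {
    // Index page of the continuous builds archive, taken from the example link above.
    static final String ARCHIVE_INDEX =
        "https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html";

    /** Builds the archive listing URL for one platform folder and one base position. */
    public static String snapshotUrl(String platform, int position) {
        return ARCHIVE_INDEX + "?prefix=" + platform + "/" + position + "/";
    }

    /**
     * Lists candidate URLs to try in order: the base position first, then each
     * lower position, up to maxAttempts entries (hypothetical limit) and never below 0.
     */
    public static List<String> candidateUrls(String platform, int basePosition, int maxAttempts) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < maxAttempts && basePosition - i >= 0; i++) {
            urls.add(snapshotUrl(platform, basePosition - i));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Prints the URLs for positions 330231, 330230 and 330229 on Mac.
        for (String url : candidateUrls("Mac", 330231, 3)) {
            System.out.println(url);
        }
    }
}
```

Open each printed URL in turn until one lists a build. The first one reproduces the "perfect hit" link from the steps above.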
## gitlab-ci.yml: