Will Fong will-fong

@conormm
conormm / r-to-python-data-wrangling-basics.md
Last active May 3, 2025 19:21
R to Python: Data wrangling with dplyr and pandas

R to python data wrangling snippets

The dplyr package in R makes data wrangling significantly easier. The beauty of dplyr is that, by design, the options available are limited. Specifically, a set of key verbs forms the core of the package. Using these verbs, you can solve a wide range of data problems effectively and in less time. Whilst transitioning to Python, I have greatly missed the ease with which I can think through and solve problems using dplyr in R. The purpose of this document is to demonstrate how to execute the key dplyr verbs when manipulating data in Python (with the pandas package).
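As a rough orientation, here is a minimal sketch of how the commonly used dplyr verbs (`filter`, `select`, `arrange`, `mutate`, `group_by` and `summarise`) might map onto pandas. The data frame and its column names are made up for illustration, and each verb usually has more than one reasonable pandas equivalent, so treat this as one possible mapping rather than the canonical one.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "length": [1.0, 3.0, 2.0, 4.0],
    "width": [0.5, 0.7, 0.6, 0.8],
})

# filter(): keep rows that satisfy a condition
filtered = df.query("length > 1.5")

# select(): keep a subset of columns
selected = df[["species", "length"]]

# arrange(): sort rows, here in descending order
arranged = df.sort_values("length", ascending=False)

# mutate(): add a derived column without modifying df
mutated = df.assign(area=lambda d: d["length"] * d["width"])

# group_by() + summarise(): one row per group
summary = (
    df.groupby("species", as_index=False)
      .agg(mean_length=("length", "mean"))
)
```

Each step returns a new data frame and leaves `df` untouched, which keeps the calls chainable in much the same way as a dplyr pipeline.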

dplyr is organised around six key verbs:

@yossorion
yossorion / what-i-wish-id-known-about-equity-before-joining-a-unicorn.md
Last active April 15, 2025 22:49
What I Wish I'd Known About Equity Before Joining A Unicorn


Disclaimer: This piece is written anonymously. The names of a few particular companies are mentioned, but as common examples only.

This is a short write-up on things that I wish I'd known and considered before joining a private company (aka startup, aka unicorn in some cases). I'm not trying to make the case that you should never join a private company, but the power imbalance between founder and employee is extreme, and that potential candidates would

@CharlieScarver
CharlieScarver / AdventureTime.csv
Last active May 19, 2025 16:07 — forked from austinpray/AdventureTime.csv
List of important adventure time episodes
Season,Episode,Title,Reason
1,5,The Enchiridion,"A good intro to the series, plus introduces the important Enchiridion"
1,2,Trouble in Lumpy Space*,Introduces LSP (episode out of order)
1,3,Prisoners of Love,Introduces Ice King and his obsession (episode out of order)
1,7,Ricardio the Heart Guy,"Finn and PB development, Sets a returning plot"
1,8,Business Time*,First mention of Ooo being post-apocalyptic
1,9,My Two Favorite People,Intros the Jake and Lady Rainicorn plotline
1,10,Memories of Boom Boom Mountain,A look at how Finn was adopted into Jake's Family
1,12,Evicted!,Intros Marceline
Sparkline Line =
// Static line color - use %23 instead of # for Firefox compatibility
VAR LineColor = "%2301B8AA"
// "Date" field used in this example along the X axis
VAR XMinDate = MIN('Table'[Date])
VAR XMaxDate = MAX('Table'[Date])
// Obtain overall min and overall max measure values when evaluated for each date
@sarthology
sarthology / regexCheatsheet.js
Created January 10, 2019 07:54
A regex cheatsheet 👩🏻‍💻 (by Catherine)
let regex;
/* matching a specific string */
regex = /hello/; // looks for the string between the forward slashes (case-sensitive)... matches "hello", "hello123", "123hello123", "123hello"; doesn't match for "hell0", "Hello"
regex = /hello/i; // looks for the string between the forward slashes (case-insensitive)... matches "hello", "HelLo", "123HelLO"
regex = /hello/g; // looks for multiple occurrences of string between the forward slashes...
/* wildcards */
regex = /h.llo/; // the "." matches any one character other than a new line character... matches "hello", "hallo" but not "h\nllo"
regex = /h.*llo/; // ".*" matches any character (other than a new line character) zero or more times... matches "hello", "heeeeeello", "hllo", "hwarwareallo"
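For readers following along in Python rather than JavaScript, the patterns above could be tried with the standard `re` module roughly as follows. This is only a sketch: `matches` is a hypothetical helper name introduced for illustration, and `re.findall` is used as an approximate stand-in for the `g` flag.

```python
import re


def matches(pattern, text, flags=0):
    """Return True if the pattern is found anywhere in text."""
    return re.search(pattern, text, flags) is not None


# Literal match, case-sensitive by default (like /hello/)
literal = matches(r"hello", "123hello123")
wrong_case = matches(r"hello", "Hello")

# Case-insensitive match (like /hello/i)
ignore_case = matches(r"hello", "123HelLO", re.IGNORECASE)

# All occurrences (roughly what /hello/g is used for)
all_hits = re.findall(r"hello", "hello hello")

# "." matches any single character except a newline
dot = matches(r"h.llo", "hallo")
dot_newline = matches(r"h.llo", "h\nllo")

# ".*" matches zero or more of any character except a newline
dot_star_empty = matches(r"h.*llo", "hllo")
dot_star_long = matches(r"h.*llo", "hwarwareallo")
```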
@alirezamika
alirezamika / autoscraper-examples.md
Last active April 30, 2025 22:28
AutoScraper Examples

Grouping results and removing unwanted ones

Here we want to scrape the product name, price, and rating from eBay product pages:

from autoscraper import AutoScraper

url = 'https://www.ebay.com/itm/Sony-PlayStation-4-PS4-Pro-1TB-4K-Console-Black/203084236670'

wanted_list = ['Sony PlayStation 4 PS4 Pro 1TB 4K Console - Black', 'US $349.99', '4.8']

scraper = AutoScraper()
scraper.build(url, wanted_list)
@nervuzz
nervuzz / happy_airflow_wsl2.md
Last active February 8, 2022 06:06
[DE Zoomcamp] Airflow & WSL2: no-frills or no-thrills

-- Read about DataTalks.Club Data Engineering Zoomcamp --

Airflow & WSL2: no-frills or no-thrills

The second week of the Data Engineering Zoomcamp by DataTalks.Club introduced a new tool, one of the most popular data pipeline platforms: Apache Airflow. So we are going to create some workflows!

Intro

First, we have to run the Docker Compose installation of Airflow in the environment of our choice, which can be macOS, Linux, a GCP VM, or the very popular WSL, among others. What's more, we also need the Google Cloud SDK installed in our Airflow environment in order to connect to the Cloud Storage bucket and create tables in BigQuery. That means we cannot just use the official docker-compose.yaml referenced in the Airflow docs; we have to build a custom Dockerfile that extends the apache/airflow image with our additional dependencies. Then we can incorporate it into docker-compose.yaml 🙌