Matthew Hawkes_ONS mh0w

mh0w / Note on Dawarich.rb
Created November 25, 2024 19:06
Note on Dawarich
# The Ruby development.rb is located within the Docker container at
# /var/app/config/environments/development.rb
# It's here that the user can configure e.g.
Rails.application.configure do
  config.hosts << "my-site-here.app"
end
mh0w / Add git bash to Windows 11 terminal.md
Created October 30, 2024 14:32
Add git bash to Windows 11 terminal

Make sure the git command runs successfully in Command Prompt. It needs to be in the PATH env var.

Update the settings file (settings.json; profiles.json in older versions): open Settings by pressing Ctrl+, in Windows Terminal, click Open JSON file in the sidebar, and add the following snippet inside the profiles section:

        { 
            "tabTitle": "Git Bash",
            "acrylicOpacity" : 0.75, 
            "closeOnExit" : true, 
mh0w / Identifying large files in a git repo.sh
Created July 8, 2024 10:30
Identifying large files in a git repo
git rev-list --objects --all |
git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
sed -n 's/^blob //p' |
awk '$2 >= 2^20' |
sort --numeric-sort --key=2 |
cut -c 1-12,41- |
$(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
git rev-list HEAD | nl | xargs -n 2 -P 8 sh -c 'git ls-tree -rl "$1" | perl -p -e "\$_ =~ s/[^ ]*+ [^ ]*+ ([^ ]*+) ++([^\t]*+)\t.*+/\1 \2/" | sort > logfile-$0' ; sort -m -u logfile-* | awk '{ sum += $2 } END { print sum }'
mh0w / Add directory to PATH.py
Last active July 8, 2024 15:57
Add directory to PATH
import sys
import os
# Define the path to add to sys.path, such as the current working directory
print(os.getcwd())
path_to_add = os.getcwd()
# Add the path_to_add to the system path
if path_to_add not in sys.path:
    sys.path.append(path_to_add)
mh0w / Extract table schemas from an EPIDD.py
Last active January 27, 2025 16:16
Extract table schemas from an EPIDD
"""
Extract table schemas from an EPIDD.
Requirements: pip install --upgrade pandas xlsxwriter python-docx
"""
import pandas as pd
from docx.api import Document
doc_path = "C:/path/to/my.docx"
mh0w / BOTO3 basics.py
Last active July 11, 2024 14:08
BOTO3 basics
#################################################################
# Reading and writing files Sparklessly from S3 with BOTO3 #
#################################################################
import boto3
import raz_client
import pandas as pd
import io
my_bucket = "bucket_name_goes_here"
mh0w / Basic unix.sh
Last active June 20, 2025 16:01
Basic unix
# Resources:
# https://mally.stanford.edu/~sr/computing/basic-unix.html
# https://mally.stanford.edu/~sr/computing/more-unix.html
# https://devhints.io/bash
###############
# Console use #
###############
mh0w / Basic intro to Object Oriented Programming (OOP), classes, and objects in Python.md
Last active June 5, 2024 09:00
Basic intro to Object Oriented Programming (OOP), classes, and objects in Python

There are objects in the real world (phone, microphone, popcorn). Objects have attributes (what they are or have; properties) and methods (what they can do). In Python, a class is used to create an object. A class is a blueprint or template that describes which attributes and methods an object has.

Class: a data type, such as int, str (string), function, or a user-defined Car.

Object: an instance of a class (an instantiated class).

Method: a function encapsulated within a class, such as the .upper() method of the str class.

Encapsulation: data in a class are hidden from other classes and accessible only via the class’s methods (‘data-hiding’). Objects (e.g., Cat) manage their own state/attributes (e.g., energy, age), have private methods (e.g., .meow(), .sleep()) that they can call whenever they want, and can be changed by other classes only via public methods (e.g., .feed()). In Python this privacy is a convention (a leading underscore in the name) rather than something the language enforces.
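
The terms above can be illustrated with a minimal sketch. The `Cat` class, its `_energy` attribute, and the `feed()` and `_meow()` methods are hypothetical names chosen to mirror the examples in the text, not code from the original note. The leading underscore marks a name as internal by convention only.

```python
class Cat:
    """Hypothetical class: a blueprint describing attributes and methods."""

    def __init__(self, name, age):
        self.name = name      # public attribute
        self.age = age        # public attribute
        self._energy = 5      # internal state (underscore = private by convention)

    def _meow(self):
        # Internal method: intended to be called only from inside the class.
        return f"{self.name} says meow"

    def feed(self, amount=1):
        # Public method: the supported way for other code to change the state.
        self._energy += amount
        return self._meow()

    @property
    def energy(self):
        # Read-only view of the internal state.
        return self._energy


tom = Cat("Tom", 3)         # tom is an object: an instance of the Cat class
message = tom.feed(2)       # state changes only through the public method
print(message)              # Tom says meow
print(tom.energy)           # 7
print("hello".upper())      # HELLO (a method of the built-in str class)
```

Because `energy` is exposed as a property without a setter, an assignment such as `tom.energy = 10` raises an `AttributeError`, so other code is steered towards `feed()`.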

mh0w / Using spark locally.md
Last active March 6, 2025 12:05
Using spark locally
mh0w / intro.sql
Last active May 6, 2025 14:34
SQL intro
-- SQL: Structured Query Language for communicating with relational (tabular) databases
-- RDBMS: Relational Database Management Systems
-- CRUD: Create, Read, Update, and Delete
-- Databases contain tables, plus potentially Views and Queries too
-- Once connected to the server database, the following example snippets work
-- These snippets were written primarily with Hive and PySpark SQL in mind
------------------