Dominik Weckmüller do-me
@do-me
do-me / duckdb_boundary.sh
Last active November 11, 2025 15:20
Filter the OSM US Layercake boundaries.parquet remotely with DuckDB and export a region to a file, for example Parquet or JSON.
#!/usr/bin/env bash
#
# duckdb_boundary.sh — run DuckDB spatial queries by region
#
# Usage:
# ./duckdb_boundary.sh --query "Sicilia" [--format PARQUET|CSV|JSON]
#
# Description:
# - Ensures DuckDB 'spatial' extension is installed and loaded
# - Reads OpenStreetMap boundary data (remote parquet)
do-me / extract.sql
Created November 10, 2025 13:02
Extract regional boundaries from OSM US Layercake with DuckDB (duckdb -ui)
COPY (
SELECT *
FROM 'https://data.openstreetmap.us/layercake/boundaries.parquet'
WHERE list_contains("tags"['name'], 'Sicilia')
OR list_contains("tags"['names']['en'], 'Sicily')
) TO 'sicily_boundaries.parquet' (FORMAT 'PARQUET');
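The COPY statement above hardcodes both the region names and the output file. As a rough sketch of how it could be parametrized, in the spirit of the --query and --format options that duckdb_boundary.sh describes, the hypothetical helper below only builds the SQL string. The function names, parameters, and quoting approach are assumptions and not part of the original gist; the tag lookups are copied unchanged from the query above.

```python
BOUNDARIES_URL = "https://data.openstreetmap.us/layercake/boundaries.parquet"


def sql_quote(value: str) -> str:
    """Quote a Python string as a SQL string literal (doubles single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def build_boundary_copy(local_name: str, english_name: str,
                        out_path: str, fmt: str = "PARQUET") -> str:
    """Build a COPY statement mirroring the query above (hypothetical helper)."""
    fmt = fmt.upper()
    if fmt not in {"PARQUET", "CSV", "JSON"}:
        raise ValueError(f"unsupported format: {fmt}")
    return (
        "COPY (\n"
        "  SELECT *\n"
        f"  FROM {sql_quote(BOUNDARIES_URL)}\n"
        f"  WHERE list_contains(\"tags\"['name'], {sql_quote(local_name)})\n"
        f"     OR list_contains(\"tags\"['names']['en'], {sql_quote(english_name)})\n"
        f") TO {sql_quote(out_path)} (FORMAT {sql_quote(fmt)});"
    )


if __name__ == "__main__":
    # Reproduces the Sicily export from the gist.
    print(build_boundary_copy("Sicilia", "Sicily", "sicily_boundaries.parquet"))
```

The generated statement can then be handed to DuckDB by whatever means is convenient; running it is deliberately left out of the sketch.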
do-me / script.sh
Created November 10, 2025 09:44
OSM: From .osm.pbf to parquet file with all streets for a region
osmium extract -b 12.2,36.5,15.8,38.8 italy.osm.pbf -o sicily.osm.pbf --overwrite;
osmium tags-filter sicily.osm.pbf w/highway -o sicily_streets.osm.pbf --overwrite;
ogr2ogr -f Parquet -oo PRELUDE_STATEMENTS="INSTALL spatial; LOAD spatial;" \
sicily_streets.parquet sicily_streets.osm.pbf lines;
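The -b option of osmium extract takes the bounding box as LEFT,BOTTOM,RIGHT,TOP, that is minimum longitude, minimum latitude, maximum longitude, maximum latitude. A minimal sketch for sanity-checking such a string before starting a long extract follows; the helper names and the two sample coordinates (roughly Palermo and Rome) are illustrative assumptions, not part of the gist.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon: float, lat: float) -> bool:
        """Return True if the point lies inside the box (edges included)."""
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


def parse_bbox(text: str) -> BBox:
    """Parse 'LEFT,BOTTOM,RIGHT,TOP' as passed to `osmium extract -b`."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 4:
        raise ValueError("expected LEFT,BOTTOM,RIGHT,TOP")
    bbox = BBox(*parts)
    if bbox.min_lon >= bbox.max_lon or bbox.min_lat >= bbox.max_lat:
        raise ValueError("bounding box is empty or has swapped corners")
    return bbox


sicily = parse_bbox("12.2,36.5,15.8,38.8")
print(sicily.contains(13.36, 38.12))  # approx. Palermo -> True
print(sicily.contains(12.50, 41.90))  # approx. Rome -> False
```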
do-me / style.json
Created November 7, 2025 08:09
Mapterhorn 3D DEM + ArcGIS Online World Imagery Map Style
{
"version": 8,
"sources": {
"arcgisonline": {
"type": "raster",
"tiles": [
"https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}"
],
"tileSize": 256,
"maxzoom": 18,
do-me / style.json
Created November 7, 2025 07:32
Mapterhorn Terrain 3D Style
{
"version": 8,
"sources": {
"osm": {
"type": "raster",
"tiles": ["https://a.tile.openstreetmap.org/{z}/{x}/{y}.png"],
"tileSize": 256,
"attribution": "<a href=\"https://www.openstreetmap.org/copyright\">&copy; OpenStreetMap Contributors</a>",
"maxzoom": 19
},
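The "tiles" entries in these styles are URL templates whose {z}, {x} and {y} placeholders the map client fills per tile. As a sketch of what that substitution amounts to, the snippet below uses the standard Web Mercator (slippy map) tile formula; the function names are assumptions made for illustration. Note that the ArcGIS template in the imagery style orders the placeholders as {z}/{y}/{x}, which keyword-based formatting handles without special cases.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Convert WGS84 lon/lat to slippy-map tile indices at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still map to a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_url(template: str, lon: float, lat: float, zoom: int) -> str:
    """Fill a {z}/{x}/{y}-style template for the tile covering a point."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    return template.format(z=zoom, x=x, y=y)


OSM = "https://a.tile.openstreetmap.org/{z}/{x}/{y}.png"
ARCGIS = ("https://server.arcgisonline.com/ArcGIS/rest/services/"
          "World_Imagery/MapServer/tile/{z}/{y}/{x}")

print(tile_url(OSM, 0.0, 0.0, 1))
print(tile_url(ARCGIS, -180.0, 0.0, 1))
```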
do-me / style.json
Last active November 5, 2025 08:20
Mapterhorn Style
{
"version": 8,
"sources": {
"hillshadeSource": {
"type": "raster-dem",
"tiles": ["https://tiles.mapterhorn.com/{z}/{x}/{y}.webp"],
"encoding": "terrarium",
"tileSize": 512,
"attribution": "<a href=\"https://mapterhorn.com/attribution\">© Mapterhorn</a>"
}
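The "encoding": "terrarium" setting tells the renderer how to turn each pixel's RGB value into an elevation. In the Terrarium scheme the height in meters is (R * 256 + G + B / 256) - 32768. A minimal sketch of that decoding step, with a hypothetical function name, is:

```python
def terrarium_to_meters(r: int, g: int, b: int) -> float:
    """Decode one Terrarium-encoded RGB pixel to an elevation in meters."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("channel values must be in 0..255")
    return (r * 256 + g + b / 256) - 32768


print(terrarium_to_meters(128, 0, 0))    # sea level -> 0.0
print(terrarium_to_meters(128, 1, 128))  # -> 1.5
```

The blue channel carries the fractional part, so the encoding resolves heights in steps of 1/256 m.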
do-me / extract.sh
Last active October 30, 2025 14:54
Extract Bundesländer polygons from germany-latest.osm.pbf to bundeslaender.parquet fast with osmium and ogr2ogr
osmium tags-filter germany-latest.osm.pbf r/boundary=administrative -o temp_boundaries.osm.pbf --overwrite
osmium tags-filter temp_boundaries.osm.pbf r/admin_level=4 -o bundeslaender.osm.pbf --overwrite
time ogr2ogr -f Parquet bundeslaender.parquet bundeslaender.osm.pbf multipolygons
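The two osmium tags-filter calls are chained because a single call matches objects that satisfy at least one of its expressions, whereas here both boundary=administrative and admin_level=4 must hold. The toy sketch below mimics that AND-by-chaining on in-memory tag dictionaries; the sample relations are made up for illustration and the tags_filter function is a hypothetical stand-in, not osmium's API.

```python
def tags_filter(relations: list[dict], key: str, value: str) -> list[dict]:
    """Keep only the relations whose tag `key` equals `value`."""
    return [rel for rel in relations if rel.get(key) == value]


# Made-up sample relations, for illustration only.
relations = [
    {"name": "Bayern", "boundary": "administrative", "admin_level": "4"},
    {"name": "Landkreis Example", "boundary": "administrative", "admin_level": "6"},
    {"name": "Some Route", "type": "route", "admin_level": "4"},
]

# Step 1 mirrors r/boundary=administrative, step 2 mirrors r/admin_level=4.
administrative = tags_filter(relations, "boundary", "administrative")
states = tags_filter(administrative, "admin_level", "4")

print([rel["name"] for rel in states])  # -> ['Bayern']
```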
do-me / benchmark.py
Created October 21, 2025 12:04
Benchmark for Mac M3 Max (128 GB) and mlx-community/gemma-3-270m-it-4bit with mlx-lm
from mlx_lm import batch_generate, load
model, tokenizer = load("mlx-community/gemma-3-270m-it-4bit")
# load a pandas df here, df has a text column
import pandas as pd
df = pd.read_parquet("2000_benchmark_texts_BAAI.parquet")
# Apply the chat template and encode to tokens
prompts = [i + "--------\nSummarize this article in one sentence" for i in df.text.to_list()]
prompts = [
do-me / copygit.sh
Created October 9, 2025 09:19
A shell script that clones a GitHub repo, runs yek to copy all text files to the clipboard, then deletes the repo, for easy pasting into Gemini or other long-context models
#!/bin/bash
# Usage: clone_yek_copy <github_repo_url>
# Description: Clones a GitHub repo, runs `yek | pbcopy`, then deletes the repo.
set -e
# --- 1️⃣ Check for URL argument ---
if [ -z "$1" ]; then
echo "Usage: $0 <github_repo_url>"
exit 1
do-me / prompt.py
Last active October 7, 2025 03:42
A single line to try out mlx-community/Qwen3-Next-80B-A3B-Instruct-8bit on macOS with mlx
import argparse
from mlx_lm import load, generate
# Parse CLI arguments
parser = argparse.ArgumentParser()
parser.add_argument("--prompt", type=str, default="hello", help="Custom prompt text")
parser.add_argument("--max-tokens", type=int, default=1024, help="Maximum number of tokens to generate")
args = parser.parse_args()
# Load model