@JohannesBuchner
JohannesBuchner / mplrecorder.py
Last active April 21, 2022 19:19
Intercept all matplotlib calls and store each figure's data into json files, with labels as keys.
"""
Intercept all matplotlib calls and store each figure's data
into json files, with labels as keys.
Usage:
Just replace:
import matplotlib.pyplot as plt
with:
from mplrecorder import plt
JohannesBuchner / logdeduplicator.py
Created April 6, 2022 10:11
Strips duplicated and repeated lines from stdin (such as log output). Can also handle multiline repeats, up to a configurable memory limit.
import os
import sys
max_memory = int(os.environ.get('MAX_MEMORY', '10'))
recent_lines = []
for line in sys.stdin:
    if line not in recent_lines:
        sys.stdout.write(line)
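The preview stops before the memory of recent lines is updated. As a minimal sketch of how bounded-memory deduplication might work, assuming the memory is a sliding window over the most recently emitted lines (the helper name `dedup_lines` and the windowing rule are assumptions, not taken from the gist):

```python
from collections import deque


def dedup_lines(lines, max_memory=10):
    """Yield each line unless it is among the last `max_memory` emitted lines."""
    # deque(maxlen=...) silently drops the oldest entry once the window is full.
    recent = deque(maxlen=max_memory)
    for line in lines:
        if line not in recent:
            yield line
            recent.append(line)
```

With this rule, a block of several lines repeated back to back is printed once as long as the block fits in the window; a line that recurs after more than `max_memory` other distinct lines is printed again.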
JohannesBuchner / shrink-folder.sh
Created March 18, 2022 23:05
Delete files until folder is smaller than 1GB
find "$FOLDER" -maxdepth 1 -type f -printf '%s\t%p\n' |
{ S=0; while read -r s l; do
    ((S+=s)); [[ $S -gt 1000000000 ]] && rm -v "$l";
done; }
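The selection logic of the pipeline above can be sketched in Python as a dry run that only reports which files would be removed. The function names `list_files` and `files_to_delete` are hypothetical; as in the shell version, files are visited in directory order, and every file seen after the running total passes the limit is selected, including the one that crosses it.

```python
import os


def list_files(folder):
    """Yield (size, path) for regular files directly inside `folder`."""
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                yield entry.stat(follow_symlinks=False).st_size, entry.path


def files_to_delete(sized_paths, limit_bytes=1_000_000_000):
    """Return the paths seen once the cumulative size exceeds `limit_bytes`."""
    total = 0
    doomed = []
    for size, path in sized_paths:
        total += size
        if total > limit_bytes:
            doomed.append(path)
    return doomed
```

Calling `files_to_delete(list_files(folder))` lists the candidates without deleting anything, which makes it easy to check the result before running the real command.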
JohannesBuchner / convert_setup_py_to_pyproject_toml.py
Created March 17, 2022 16:42
Convert setup.py to pyproject.toml (WIP)
# Help create a pyproject.toml from a setup.py file
#
# USAGE:
# 1)
# replace "from [a-z.]* import setup" in your setup.py
# with "from convert_setup_py_to_pyproject_toml import setup"
# 2)
# run the resulting script with python, with this script in the PYTHONPATH
#
# The above can be achieved on Linux, for example, with:
JohannesBuchner / supertar.sh
Created September 22, 2021 12:15
Better tar file compression by sorting similar files together
# Compression can be improved when files with the same or similar content
# are next to each other in the file list.
#
# This command sorts by reversed filenames, which places files
# together by file extension, filename and path, in that order.
# identify all files
find mypath/ -type f |
rev | sort | rev |
tar --no-recursion --files-from=- -cvzf myarchive.tar.gz
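To see what the `rev | sort | rev` step does to a file list, here is a small Python sketch. The paths are made up, and Python compares by code point, whereas the shell `sort` follows the current locale, so the order can differ for non-ASCII names.

```python
def sort_by_reversed_name(paths):
    """Sort paths by their reversed string, so the extension is compared first."""
    return sorted(paths, key=lambda p: p[::-1])


# Hypothetical file list for illustration.
paths = ["a/x.txt", "b/y.py", "a/z.txt", "b/x.txt"]
ordered = sort_by_reversed_name(paths)
# ordered == ["a/x.txt", "b/x.txt", "a/z.txt", "b/y.py"]
```

The two `x.txt` files from different directories end up adjacent, which is what gives the compressor a chance to find shared content.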
JohannesBuchner / cmdcache.py
Created March 1, 2021 10:26
Cache/Memoize any command line program. Keeps stdout, stderr and exit code, env aware.
import sys, os
import joblib
import subprocess
mem = joblib.Memory('.', verbose=False)
@mem.cache
def run_cmd(args, env):
    process = subprocess.run(args, capture_output=True, text=True)
    return process.stdout, process.stderr, process.returncode
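The idea of keying the cache on both the command line and the environment can be illustrated without `joblib`. The following is a standard-library sketch, not the gist's implementation: the names `cache_key` and `run_cached`, the `.cmdcache` directory, and the JSON storage format are all assumptions.

```python
import hashlib
import json
import os
import subprocess


def cache_key(args, env):
    """Hash the command line together with the (sorted) environment."""
    payload = json.dumps([list(args), sorted(env.items())])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(args, env, cache_dir=".cmdcache"):
    """Run `args` once per (args, env) pair and replay the stored result afterwards."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, cache_key(args, env) + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return tuple(json.load(f))
    process = subprocess.run(args, env=env, capture_output=True, text=True)
    result = (process.stdout, process.stderr, process.returncode)
    with open(path, "w") as f:
        json.dump(result, f)
    return result
```

Because the whole environment is part of the key, changing any variable causes a fresh run; a stricter or looser key (for example, only selected variables) is a design choice that depends on the command being cached.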
JohannesBuchner / joss_make_latex.sh
Created January 23, 2021 22:30
Make LaTeX and PDF from JOSS markdown papers
#!/bin/bash
# you need to install:
# pip install openbases
# sudo apt install texlive-xetex pandoc pandoc-citeproc
PDF_INFILE=paper.md
PDF_LOGO=logo.png
PDF_OUTFILE=paper.pdf
TEX_OUTFILE=paper.tex
JohannesBuchner / xray_opt_gif.sh
Created December 18, 2020 12:14
Make a gif flipping between an X-ray and optical image at some coordinate
#!/bin/bash
# example usage:
# bash xray_opt_gif.sh 155.87737 +19.86508
RA=$1
DEC=$2
wget -nc "https://alasky.unistra.fr/hips-thumbnails/thumbnail?ra=${RA}&dec=${DEC}&fov=0.21750486127986932&width=500&height=500&hips_kw=CDS%2FP%2FSDSS9%2Fcolor" -O opt.jpg
wget -nc "https://alasky.unistra.fr/hips-thumbnails/thumbnail?ra=${RA}&dec=${DEC}&fov=0.21750486127986932&width=500&height=500&hips_kw=xcatdb%2FP%2FXMM%2FPN%2Fcolor" -O xmm.jpg
JohannesBuchner / benchmark.sh
Last active January 1, 2021 07:54
awk solutions for simple groupby in https://h2oai.github.io/db-benchmark/
# columns: id1,id2,id3,id4,id5,id6,v1,v2,v3
f=G1_1e7_1e2_0_0.csv
awk="time mawk"
# groupby simple
$awk -F, 'NR>1 { a[$1] += $7 } END {for (i in a) print i, a[i]}' $f >/dev/null
$awk -F, 'NR>1 { a[$1,$2] += $7 } END { for (comb in a) { split(comb,sep,SUBSEP); print sep[1], sep[2], a[sep[1],sep[2]]; }}' $f >/dev/null
$awk -F, 'NR>1 { a[$3] += $7; n[$3]++; b[$3] += $9; } END {for (i in a) print i, a[i], b[i]/n[i];}' $f >/dev/null
$awk -F, 'NR>1 { a[$4] += $7; n[$4]++; b[$4] += $8; } END {for (i in a) print i, a[i]/n[i], b[i]/n[i];}' $f >/dev/null
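For readers less familiar with awk's associative arrays, the first query (sum of `v1` per `id1`) can be sketched in Python. The sample rows are made up and only follow the column layout given above; `sum_by` is a hypothetical helper, not part of the benchmark.

```python
import csv
import io
from collections import defaultdict


def sum_by(rows, key_col, value_col):
    """Sum `value_col` per distinct `key_col`, like `a[$1] += $7` in awk."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_col]] += float(row[value_col])
    return dict(totals)


# Tiny made-up sample with the benchmark's column names.
sample = io.StringIO(
    "id1,id2,id3,id4,id5,id6,v1,v2,v3\n"
    "id001,id001,x,1,1,1,1,2,0.5\n"
    "id002,id001,y,1,1,1,3,4,1.5\n"
    "id001,id002,z,1,1,1,5,6,2.5\n"
)
totals = sum_by(csv.DictReader(sample), "id1", "v1")
```

`csv.DictReader` consumes the header row itself, which plays the role of the `NR>1` guard in the awk commands.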
JohannesBuchner / cachestan.py
Created May 18, 2020 11:32
Build and cache Stan models smartly (ignoring changes in comments and white spaces)
import re
import pystan
import hashlib
import pickle
import os
def build_model(code):
    lines = code.split("\n")
    lines = [re.sub('//.*$', '', line).strip() for line in lines]
    lines = [line.replace('  ', ' ').replace('  ', ' ').replace('  ', ' ')
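The preview ends before the hashing step. As a rough sketch of the idea (normalize the model code, then hash it so that cosmetic edits map to the same cache entry), assuming SHA-1 as the digest and using the hypothetical helpers `normalize_code` and `code_hash`:

```python
import hashlib
import re


def normalize_code(code):
    """Drop `//` comments, collapse runs of whitespace, and remove empty lines."""
    lines = [re.sub(r"//.*$", "", line).strip() for line in code.split("\n")]
    lines = [re.sub(r"\s+", " ", line) for line in lines]
    return "\n".join(line for line in lines if line)


def code_hash(code):
    """Hash of the normalized code, usable as a cache file name."""
    return hashlib.sha1(normalize_code(code).encode("utf-8")).hexdigest()
```

This sketch does not handle `/* ... */` block comments, and it would also strip a `//` that appears inside a string literal, so it is only safe for models that avoid both.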