
Daniel J. B. Clarke u8sand

@u8sand
u8sand / rdmt.rs
Last active November 29, 2020 22:03
Rust Disk Matrix Transpose: for transposing large matrix data-frames without lots of memory
// #!/usr/bin/env run-cargo-script
//! Uncomment the first line & remove the extension to use this as a cargo script; left as .rs for syntax highlighting on GitHub.
//!
//! Transpose large matrix data-frames in csv/tsv format with
//! extremely low memory usage. It does this by reading
//! the matrix top-down.
//!
//! For usage instructions see rdmt --help
//!
//! ```cargo
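The low-memory idea behind rdmt can be sketched in Python: transpose a large CSV without ever holding the whole matrix by making repeated top-down passes, buffering only a band of columns per pass. This is a minimal illustration of the approach under that assumption; the gist's actual Rust implementation may work differently.

```python
import csv


def transpose_csv(src, dst, band=256):
    """Transpose a rectangular CSV by repeated top-down passes.

    Each pass streams every row of `src` but keeps only `band` columns
    in memory, so peak memory is O(rows * band) cells rather than the
    whole matrix. Trades extra read passes for memory.
    """
    # one pass just to learn the column count
    with open(src, newline='') as f:
        ncols = len(next(csv.reader(f)))
    with open(dst, 'w', newline='') as out:
        writer = csv.writer(out)
        for start in range(0, ncols, band):
            # accumulate this band of columns across a full pass
            cols = [[] for _ in range(min(band, ncols - start))]
            with open(src, newline='') as f:
                for row in csv.reader(f):
                    for i, col in enumerate(cols):
                        col.append(row[start + i])
            # each accumulated column becomes an output row
            writer.writerows(cols)
```

With `band=1` this degenerates to one pass per output row (minimal memory, maximal I/O); larger bands trade memory for fewer passes.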
@u8sand
u8sand / merge_on_set.py
Created October 10, 2020 21:22
A pandas-style merge that works efficiently on a join field consisting of sets (via an inverted index) -- relevant for things like matching on synonyms
def build_inverse_dict(items):
  idict = {}
  for k, V in items:
    for v in V:
      idict[v] = idict.get(v, set()) | {k}
  return idict

def pd_merge_on_set(left=None, left_on=None, right=None, right_on=None):
  ''' Merge on a one to many relationship.
  ```raw
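The preview cuts off before the merge itself, but the core idea can be sketched without pandas: build the inverted index over the set-valued field, then expand each right-hand row through it. The function and column shapes below are illustrative assumptions, not the gist's actual code.

```python
def build_inverse_dict(items):
    # from the gist: invert (key, set_of_values) pairs into {value: set_of_keys}
    idict = {}
    for k, V in items:
        for v in V:
            idict[v] = idict.get(v, set()) | {k}
    return idict


def merge_on_set(left, right):
    """Join left rows (id, synonym_set) against right rows (name, payload)
    via the inverted index -- a pandas-free sketch of the same idea."""
    idx = build_inverse_dict(left)
    return [
        (left_id, name, payload)
        for name, payload in right
        for left_id in sorted(idx.get(name, ()))
    ]
```

Because lookups go through the index, each right-hand row costs O(1) regardless of how many synonyms the left-hand sets contain, which is the efficiency win over a naive set-membership scan.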
@u8sand
u8sand / pacman-transaction-rollback.py
Last active July 2, 2023 17:03
Revert to a previous system package state by identifying transactions in pacman.log
#!/usr/bin/env python3
# If, for example, a recent upgrade broke a lot of things by breaking something that everything else
# depends on, like the linux kernel, libc, qt, or another fundamental component, this script makes it
# possible to revert to a previous system package install state. By walking through your pacman.log,
# the script groups pacman upgrades into individual transactions (as reported by ALPM); we can then
# step through these transactions to determine the set of packages necessary to downgrade. Naturally
# this only works if you haven't removed the packages from your package cache yet.
#
# WARNING: Use at your own risk. This has been minimally tested so far, but it invokes pacman
# anyway, which will prompt you to review any changes before they are applied.
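The transaction-grouping step described above can be sketched as follows. The `[ALPM] transaction started/completed` and `upgraded pkg (old -> new)` line shapes are assumptions based on typical pacman.log output, not code copied from the gist.

```python
import re

# assumed pacman.log line shapes, e.g.:
# [2023-07-02T12:00:00+0000] [ALPM] transaction started
# [2023-07-02T12:00:01+0000] [ALPM] upgraded linux (6.3.1 -> 6.4.1)
# [2023-07-02T12:00:02+0000] [ALPM] transaction completed
UPGRADE = re.compile(r'\[ALPM\] upgraded (\S+) \((\S+) -> (\S+)\)')


def group_transactions(log_lines):
    """Group upgrade events between 'transaction started/completed' markers.

    Returns a list of transactions, each a list of (pkg, old, new) tuples;
    stepping backwards through this list yields the downgrades needed to
    reach an earlier system state.
    """
    transactions, current = [], None
    for line in log_lines:
        if '[ALPM] transaction started' in line:
            current = []
        elif '[ALPM] transaction completed' in line:
            if current:
                transactions.append(current)
            current = None
        elif current is not None:
            m = UPGRADE.search(line)
            if m:
                current.append(m.groups())
    return transactions
```

To revert transaction N, one would then install `pkg-old` for each `(pkg, old, new)` in it from the package cache, e.g. via `pacman -U /var/cache/pacman/pkg/...`.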
@u8sand
u8sand / docker-patch
Last active December 6, 2022 20:04
A helper script for patching docker images
#!/bin/bash
# Dependencies:
# - docker (obviously)
# - jq (json parsing)
docker_patch_usage() {
  echo 'Usage: docker-patch'
  echo ' CONTAINER=$(docker-patch start your/tag)'
  echo ' # apply patch to $CONTAINER (docker container)'
  echo ' docker-patch commit ${CONTAINER} your/patched-tag'
@u8sand
u8sand / proxychrome
Last active August 17, 2022 15:31
A wrapper script for spawning a blank chromium browser profile with a given proxy uri, similar to proxychain
#!/bin/python
import sys
import click
import socket
import contextlib
import tempfile
import traceback
from subprocess import Popen
@u8sand
u8sand / proxychain
Last active August 17, 2022 15:32
A wrapper script for generating proxychains config and running proxychains with a sane command-line interface
#!/bin/python
import sys
import click
import socket
import contextlib
import urllib.parse
import tempfile
from subprocess import Popen
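The preview shows only the imports, but the core of such a wrapper is parsing the proxy URI and rendering a temporary proxychains config. A minimal sketch, assuming the standard proxychains.conf layout (`strict_chain`, a `[ProxyList]` section with `type host port` entries) rather than the gist's exact output:

```python
import urllib.parse


def proxychains_conf(proxy_uri):
    """Render a minimal proxychains config for a proxy URI like
    socks5://127.0.0.1:9050."""
    u = urllib.parse.urlparse(proxy_uri)
    return '\n'.join([
        'strict_chain',
        'proxy_dns',
        '[ProxyList]',
        # proxychains entries are 'type host port'
        f'{u.scheme} {u.hostname} {u.port}',
    ]) + '\n'
```

The full script would presumably write this to a tempfile and exec the target command under proxychains pointed at that config (proxychains-ng accepts `-f <config>`).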
@u8sand
u8sand / dash_in_jupyter.py
Created July 1, 2020 16:55
A mechanism to display dash components (that have UMD set up) in jupyter
# Usage:
#
# import your_dash_component_module
# your_dash_component_module_display = dash_to_ipython_HTML_template(your_dash_component_module)
# your_dash_component_module_display(
#   your_dash_component_module.your_dash_component(
#     your_props='as_kwargs'
#   )
# )
@u8sand
u8sand / Important File Recovery.md
Last active June 6, 2020 19:15
A gist describing a simple way to recover lost files from the contents of your disk with standard unix tools, when all else fails.

NOTE: The script and some information in this tutorial may be inaccurate; I'm under the impression my files were found because they were cached by vscode. Nonetheless, parts of the tutorial will work, specifically if you know some of the contents of the file (strings, grep, dd).

Recovering important lost files from disk with unix tools

This is a last resort after trying testdisk's un-delete feature, which I find works quite well. In this situation, however, testdisk was unable to identify the files in the directory I was looking for. It helps to know a unique string in the file, but this is not strictly necessary.

If the file is really important, stop writing to that disk. If it's only kind of important, then depending on how full your disk is, it may not matter if you're super lazy (I recovered this file a few days after the fact while still using my disk as normal, but you should not do this). You should also probably at least work on a different drive during the process, saving these dumps and such to a different drive.
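The strings/grep/dd approach boils down to scanning the raw device for a known marker string and then cutting out the bytes around each hit. A pure-Python sketch of that scan (chunked, with overlap so matches spanning chunk boundaries aren't missed); it is demonstrated here against an ordinary file, but the same code works on a block device like /dev/sda opened read-only:

```python
def find_marker(path, marker, chunk_size=1 << 20):
    """Return byte offsets where `marker` occurs in a raw file or device.

    Reads in chunks, keeping len(marker)-1 trailing bytes as overlap so
    matches straddling a chunk boundary are still found.
    """
    hits, overlap, base = [], b'', 0
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = overlap + chunk
            start = 0
            while (i := buf.find(marker, start)) != -1:
                # absolute offset: bytes consumed before this chunk,
                # minus the carried-over overlap, plus position in buf
                hits.append(base - len(overlap) + i)
                start = i + 1
            overlap = buf[-(len(marker) - 1):] if len(marker) > 1 else b''
            base += len(chunk)
    return hits
```

Once an offset is known, the surrounding region can be extracted with dd (`skip=` in bytes via `iflag=skip_bytes`) or a seek-and-read in the same script.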

@u8sand
u8sand / fetch_save_read.py
Created April 9, 2020 16:55
A simple pandas helper for cached reading of tabular files from urls
import os
import pandas as pd
def fetch_save_read(url, file, reader=pd.read_csv, sep=',', **kwargs):
  ''' Download file from {url}, save it to {file}, and subsequently read it with {reader} using pandas options on {**kwargs}.
  '''
  if not os.path.exists(file):
    df = reader(url, sep=sep, index_col=None)
    df.to_csv(file, sep=sep, index=False)
  return pd.read_csv(file, sep=sep, **kwargs)
@u8sand
u8sand / read_gmt.py
Created April 9, 2020 16:53
Read a .gmt as dictionary or pandas dataframe
import re
import json
def _try_json_loads(s):
  try:
    return json.loads(s)
  except json.JSONDecodeError:
    return s
_gene_parser = re.compile(r'^([\w\d]+)[^\w\d]+(.+)$')
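The preview ends at the regex, but the rest of read_gmt presumably splits each line into a term, an optional description, and the gene list (the standard .gmt layout is tab-separated: term, description, then one gene per field). A hedged reconstruction of the dictionary form; the gist's own parsing, e.g. how it applies the regex above or builds the dataframe, may differ:

```python
def read_gmt_dict(lines):
    """Parse .gmt lines (term <tab> description <tab> gene ...) into
    {term: [genes]}. A sketch of the assumed format, not the gist's code."""
    gmt = {}
    for line in lines:
        line = line.rstrip('\n')
        if not line:
            continue
        term, _description, *genes = line.split('\t')
        gmt[term] = genes
    return gmt
```

The pandas variant mentioned in the description would then just be a matter of wrapping this dict, e.g. as a boolean membership matrix of terms by genes.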