Daniel J. B. Clarke u8sand

@u8sand
u8sand / pd_merge.py
Created April 9, 2020 16:47
Pandas merges simplified
import pandas as pd
def merge(left, *rights, **kwargs):
    ''' Helper function for many trivial (index based) joins
    Usage:
    merge(frame1, frame2, frame3, how='inner')
    '''
    merged = left
    for right in rights:
        merged = pd.merge(left=merged, left_index=True, right=right, right_index=True, **kwargs)
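The preview above is cut off before the function returns anything. A minimal sketch of how the helper might be completed and used, assuming the missing tail is simply `return merged` (the sample frames `a`, `b`, and `c` are made up for illustration):

```python
import pandas as pd

def merge(left, *rights, **kwargs):
    ''' Fold any number of frames together with index-based joins. '''
    merged = left
    for right in rights:
        merged = pd.merge(left=merged, right=right,
                          left_index=True, right_index=True, **kwargs)
    # Assumption: the truncated original ends by returning the result.
    return merged

# Hypothetical sample frames sharing part of an index.
a = pd.DataFrame({'a': [1, 2, 3]}, index=['x', 'y', 'z'])
b = pd.DataFrame({'b': [4, 5]}, index=['x', 'y'])
c = pd.DataFrame({'c': [6, 7]}, index=['y', 'z'])

inner = merge(a, b, c, how='inner')  # keeps only index labels present in all frames
outer = merge(a, b, c, how='outer')  # keeps every index label, filling gaps with NaN
```

Because every keyword argument is forwarded to `pd.merge`, options such as `how` or `suffixes` apply to each pairwise join in turn.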
u8sand / quantile_normalize.py
Last active April 9, 2020 16:43
Quantile normalization in Python
import pandas as pd
import numpy as np
def quantileNormalize_np(mat):
    # sort vector in np (reuse in np)
    sorted_vec = np.sort(mat, axis=0)
    # rank vector in np (no dict necessary)
    rank = sorted_vec.mean(axis=1)
    # construct quantile normalized matrix
    return np.array([
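The preview stops in the middle of the return expression, so the author's construction of the normalized matrix is not visible. A sketch of one common way to finish quantile normalization in NumPy follows; the name `quantile_normalize_sketch` and the double-`argsort` ranking are assumptions, not the gist's code, and ties are broken by position rather than averaged:

```python
import numpy as np

def quantile_normalize_sketch(mat):
    mat = np.asarray(mat, dtype=float)
    # Sort each column independently.
    sorted_mat = np.sort(mat, axis=0)
    # Mean across columns at each rank position: the reference distribution.
    rank_means = sorted_mat.mean(axis=1)
    # Rank of every entry within its column (0 = smallest); ties broken by position.
    order = mat.argsort(axis=0, kind='stable')
    ranks = order.argsort(axis=0, kind='stable')
    # Replace each entry with the reference value for its rank.
    return rank_means[ranks]

example = np.array([[5, 4, 3],
                    [2, 1, 4],
                    [3, 4, 6],
                    [4, 2, 8]])
normalized = quantile_normalize_sketch(example)
```

After normalization every column holds the same set of values, only arranged according to that column's original ordering.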
u8sand / priority_cache.ts
Last active March 25, 2020 16:16
TypeScript priority cache I made for a project but never used
export class PriorityCache<V = any> {
  store: {
    [key: string]: {
      value: V,
      start: number,
      cost: number,
      demand: number,
      size: number,
    }
  } = {}
u8sand / dezoomify.py
Last active November 5, 2022 20:52
Convert zoomify pages into giant images
#!/bin/python
# pip install click Pillow tqdm
# dezoomify.py will infer the shape while downloading all the tiles and then re-assemble a complete image.
# Usage:
# python dezoomify.py -v -Z 5 -t tmp -o output.jpg https://baseurl/zoomify/id/TileGroup{0,1,2,3}
# -Z n: zoom level (if not specified, will identify the best zoom level and download that one)
# -v: verbosity
# -t [dir]: temporary directory (can be omitted, but recommended in case it needs to be re-run)
# -o [file.jpg]: output file
# [url...]: the urls to download the tiles
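The header comment says the script infers the image shape from the downloaded tiles before re-assembling them. A rough sketch of that inference step is below; it is not the gist's code. The function name `layout_tiles`, the `(col, row) -> (width, height)` mapping, and the 256-pixel default tile size are all assumptions made for illustration:

```python
def layout_tiles(tile_sizes, tile=256):
    """Infer the canvas size and paste offsets from a grid of tiles.

    tile_sizes maps (col, row) -> (width, height) in pixels. Assumes every
    tile is `tile` pixels square except the last column and last row,
    which may be smaller.
    """
    cols = max(col for col, _ in tile_sizes) + 1
    rows = max(row for _, row in tile_sizes) + 1
    # Full tiles plus the (possibly narrower) last column / shorter last row.
    width = (cols - 1) * tile + tile_sizes[(cols - 1, 0)][0]
    height = (rows - 1) * tile + tile_sizes[(0, rows - 1)][1]
    # Top-left corner at which each tile would be pasted.
    offsets = {(col, row): (col * tile, row * tile) for (col, row) in tile_sizes}
    return (width, height), offsets

# Hypothetical 3x2 grid whose right column is 100px wide and bottom row 50px tall.
sizes = {
    (col, row): (100 if col == 2 else 256, 50 if row == 1 else 256)
    for col in range(3) for row in range(2)
}
canvas, offsets = layout_tiles(sizes)
```

With the offsets in hand, an imaging library could create a blank canvas of size `canvas` and paste each tile at its offset.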
u8sand / ncbi_lookup.py
Last active January 7, 2020 21:56
A convenient snippet for ncbi-driven gene synonym symbol mapping
import pandas as pd
ncbi = pd.read_csv('ftp://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO/Mammalia/Homo_sapiens.gene_info.gz', sep='\t')
# Ensure nulls are treated as such ('-' marks a missing value in gene_info)
ncbi = ncbi.replace('-', float('nan'))
# Break up lists
split_list = lambda v: v.split('|') if isinstance(v, str) else []
ncbi['dbXrefs'] = ncbi['dbXrefs'].apply(split_list)
ncbi['Synonyms'] = ncbi['Synonyms'].apply(split_list)
ncbi['LocusTag'] = ncbi['LocusTag'].apply(split_list)
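The preview ends before the synonym mapping itself is built. A small sketch of how such a lookup could be derived is shown here on a made-up two-row frame instead of the real download; the `Symbol` and `Synonyms` column names follow the gene_info file, while the `synonym_to_symbol` name and the `explode`-based approach are assumptions:

```python
import pandas as pd

# Hypothetical stand-in for the downloaded gene_info table.
genes = pd.DataFrame({
    'Symbol': ['TP53', 'BRCA1'],
    'Synonyms': ['P53|LFS1', '-'],
})
# Same cleaning steps as above: '-' becomes NaN, pipe-separated strings become lists.
genes['Synonyms'] = genes['Synonyms'].replace('-', float('nan'))
genes['Synonyms'] = genes['Synonyms'].apply(
    lambda v: v.split('|') if isinstance(v, str) else []
)
# One row per synonym; rows with an empty list explode to NaN and are dropped.
exploded = genes.explode('Synonyms').dropna(subset=['Synonyms'])
synonym_to_symbol = dict(zip(exploded['Synonyms'], exploded['Symbol']))
```

On real data a synonym can belong to more than one gene, in which case a plain dict keeps only the last match; grouping into lists would preserve the ambiguity.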
u8sand / keybindings.json
Last active March 26, 2021 17:04
Visual Studio Code keybindings that make shift+enter run the selection in the terminal whenever a terminal panel is open (without interfering with the Python interactive terminal)
// Place your key bindings in this file to overwrite the defaults
[
  {
    "key": "shift+enter",
    "command": "workbench.action.terminal.runSelectedText",
    "when": "editorTextFocus && terminalIsOpen"
  },
  {
    "key": "shift+enter",
    "command": "markdown-preview-enhanced.runCodeChunk",
u8sand / parsed_url.py
Created November 19, 2019 19:27
A convenient mutable Python urlparse for parsing and manipulating urls
import urllib.parse as urllib_parse
class ParsedUrl:
    ''' A convenient object for manipulating urls -- works
    like urlparse + parse_qs but mutable
    Example:
    url = ParsedUrl('http://google.com/?q=hello+world&a=this#anchor')
    del url.fragment
    del url.query['a']
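The class body is not visible in the preview, so the following is only a sketch of how a mutable wrapper with the behaviour described in the docstring could be built on `urllib.parse`. The class name `MutableUrl` and its `__delattr__` handling are assumptions, not the gist's implementation:

```python
import urllib.parse

class MutableUrl:
    ''' Hypothetical stand-in for ParsedUrl: urlparse + parse_qs, but mutable. '''

    def __init__(self, url):
        parts = urllib.parse.urlparse(url)
        self.scheme = parts.scheme
        self.netloc = parts.netloc
        self.path = parts.path
        self.params = parts.params
        # parse_qs returns a plain dict of lists, which can be edited in place.
        self.query = urllib.parse.parse_qs(parts.query)
        self.fragment = parts.fragment

    def __delattr__(self, name):
        # Deleting a component resets it to empty instead of removing the attribute.
        object.__setattr__(self, name, {} if name == 'query' else '')

    def __str__(self):
        return urllib.parse.urlunparse((
            self.scheme, self.netloc, self.path, self.params,
            urllib.parse.urlencode(self.query, doseq=True),
            self.fragment,
        ))

url = MutableUrl('http://google.com/?q=hello+world&a=this#anchor')
del url.fragment
del url.query['a']
result = str(url)
```

Storing the query as a dict of lists and re-encoding with `doseq=True` keeps repeated parameters intact when the URL is rebuilt.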
u8sand / cannonical_uuid.py
Created October 4, 2019 16:07
A simple reproducible way of generating consistent UUIDs based on an object
import uuid
U = uuid.UUID('00000000-0000-0000-0000-000000000000')
def canonical_uuid(obj):
    return str(uuid.uuid5(U, str(obj)))
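A short usage sketch follows. Since the UUID is derived from `str(obj)`, two dicts with the same items in a different insertion order produce different UUIDs. The `canonical_uuid_json` variant is a hypothetical addition, not part of the gist, showing one way to make the result order-insensitive for JSON-serializable objects:

```python
import json
import uuid

U = uuid.UUID('00000000-0000-0000-0000-000000000000')

def canonical_uuid(obj):
    return str(uuid.uuid5(U, str(obj)))

def canonical_uuid_json(obj):
    # Hypothetical variant: serialize with sorted keys so dict order does not matter.
    return canonical_uuid(json.dumps(obj, sort_keys=True))

# Same input, same UUID on every run.
same = canonical_uuid({'a': 1, 'b': 2}) == canonical_uuid({'a': 1, 'b': 2})
# str() reflects insertion order, so reordered dicts hash differently.
order_sensitive = canonical_uuid({'a': 1, 'b': 2}) != canonical_uuid({'b': 2, 'a': 1})
# Sorting keys before hashing removes that dependence.
order_insensitive = canonical_uuid_json({'a': 1, 'b': 2}) == canonical_uuid_json({'b': 2, 'a': 1})
```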
u8sand / deep_typeof.py
Created September 29, 2019 20:31
A convenient type function that reports the set of types in dictionaries/other iterables up to a desired depth
''' Usage:
val = {1: [{'hello': 'world', 2: 'okay'}], 2: []}
typeof(val) == dict[int: list,list[dict[int,str: str]]]
typeof(val, 2) == dict[int: list]
i.e. it will look (as deep as you want) into your object and let you know what it is
'''
import functools
def isIterable(v):
u8sand / defaultdict.py
Created September 29, 2019 20:30
A simple implementation of defaultdict which works better for nested defaultdicts
''' Motivation: standard Python collections.defaultdict doesn't work too well with
deep defaultdicts because it only actually creates the object on __setitem__; this
means the following occurs:
```
d = collections.defaultdict(lambda: collections.defaultdict(lambda: []))
d[1][2].append(3)
3 in d[1][2] == False # you would have expected it to be True
```
The above situation works for the below implementation. The trade-off is potential