Steven Song StevenSong
#!/usr/bin/env python
# eog not bundled with your distro?
# no sudo access?
# do it all with python!
# easiest to use with miniconda env with matplotlib
import sys
import matplotlib.pyplot as plt
from PIL import Image
import os
import re
import gzip
import hashlib
from tqdm import tqdm
from bs4 import BeautifulSoup, SoupStrainer
fnames = []
for fname in os.listdir('PubMed'):
    if fname.endswith('.xml'):
StevenSong / fp.py
Created August 12, 2022 16:27
A gist demonstrating floating point errors, mostly as a convincing argument why learning systems is important
import tensorflow as tf
b = tf.constant(21306806, dtype=tf.float32)
print((1+b)-b)
print(1+(b-b))
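The two prints above differ because float32 addition is not associative. The following is a minimal sketch of the same arithmetic without TensorFlow. It assumes that round-tripping a value through `struct` with the `"f"` format reproduces IEEE 754 single-precision rounding; the helper name `to_float32` is made up for illustration and is not part of the gist.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 21306806 is above 2**24, where neighbouring float32 values are 2 apart,
# so the odd number 21306807 cannot be stored exactly.
b = to_float32(21306806.0)

# (1 + b) - b: the intermediate sum is rounded to float32 before subtracting.
left = to_float32(to_float32(1.0 + b) - b)

# 1 + (b - b): b - b is exactly 0, so no rounding error is introduced.
right = to_float32(1.0 + to_float32(b - b))

print(left, right)
```

Under these assumptions the first expression yields 2.0 and the second yields 1.0, because the intermediate value 21306807 lies exactly halfway between two representable float32 values and is rounded to 21306808.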
# Launch jupyter notebook in a new tmux session for Colab to connect to
# https://research.google.com/colaboratory/local-runtimes.html
SESSION=colab
tmux new -d -s $SESSION
tmux send-keys -t $SESSION "jupyter notebook \
--NotebookApp.allow_origin='https://colab.research.google.com' \
--port=8888 \
--NotebookApp.port_retries=0 \
StevenSong / auc
Last active September 8, 2020 13:59
#!/bin/bash
# Log files are stored in subdirectories of the current directory: ./*/log*
# Log files contain output like: ROC AUC * {label} * = 0.NNN
# Where the AUC for a label on the test set is the last occurrence of the above fragment in the file
#
# For example, in an experiment to predict "death", our results are in a
# folder "experiment" with 10 subfolders "trial_N", where N is the trial number.
# Within each "trial_N" subfolder, there is a file "log" with output lines:
#
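The extraction rule described in the comments above (take the last `ROC AUC * {label} * = 0.NNN` fragment in a log file) can be sketched in Python as follows. This is an illustration, not the gist's bash implementation: the function name `last_auc` and the sample log lines are invented, and each `*` in the fragment is assumed to mean "any characters on the same line".

```python
import re
from typing import Optional


def last_auc(log_text: str, label: str) -> Optional[float]:
    """Return the AUC from the last 'ROC AUC ... label ... = 0.NNN' line, if any."""
    # '.' does not match newlines, so each match stays within a single line.
    pattern = re.compile(r"ROC AUC.*" + re.escape(label) + r".*= (0\.\d+)")
    matches = pattern.findall(log_text)
    return float(matches[-1]) if matches else None


# Hypothetical log content, made up purely to exercise the function.
sample_log = (
    "ROC AUC for death on validation = 0.701\n"
    "ROC AUC for death on validation = 0.745\n"
    "ROC AUC for death on test = 0.732\n"
)

print(last_auc(sample_log, "death"))
```

Taking `matches[-1]` mirrors the "last occurrence" rule: earlier matches are treated as intermediate output and only the final one is reported.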
StevenSong / dispatch.py
Last active June 23, 2020 20:39
dispatch machine learning tasks across multi gpu machines using run scripts
#!/bin/python
#
# usage:
#   python dispatch.py \
#       --gpus 0-3 \
#       --bootstraps 0-9 \
#       --scripts \
#           train-simple.sh \
#           train-varied.sh \
#           train-deeper.sh
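The body of dispatch.py is not shown here, so the following is only a sketch of how the command line in the usage comment could be parsed with `argparse`. The helper `parse_range` and the reading of `0-3` as an inclusive range of integers are assumptions, not the author's code.

```python
import argparse
from typing import List


def parse_range(text: str) -> List[int]:
    """Turn '0-3' into [0, 1, 2, 3]; a bare '5' becomes [5] (assumed semantics)."""
    start, _, end = text.partition("-")
    return list(range(int(start), int(end or start) + 1))


parser = argparse.ArgumentParser()
parser.add_argument("--gpus", type=parse_range)
parser.add_argument("--bootstraps", type=parse_range)
parser.add_argument("--scripts", nargs="+")

# Parse the same arguments as in the usage comment instead of sys.argv.
args = parser.parse_args([
    "--gpus", "0-3",
    "--bootstraps", "0-9",
    "--scripts", "train-simple.sh", "train-varied.sh", "train-deeper.sh",
])

print(args.gpus, len(args.bootstraps), args.scripts)
```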