Hugh Brown hughdbrown

@hughdbrown
hughdbrown / fibonacci.py
Created May 17, 2017 04:36
Function to calculate Fibonacci numbers
import numpy as np
m = np.matrix([[1, 1], [1, 0]])
def fib(n):
    return (m ** n)[0, 1]
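NumPy integer matrices use fixed-width integers (typically 64-bit), so the listing above can be expected to wrap around once the result no longer fits, which for signed 64-bit values happens beyond n = 92. The following is a minimal sketch of the same matrix-power idea using plain Python integers, which do not overflow. The names `mat_mult` and `fib_exact` are hypothetical and not part of the original gist.

```python
def mat_mult(a, b):
    """Multiply two 2x2 matrices given as nested tuples of Python ints."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def fib_exact(n):
    """Return the n-th Fibonacci number via [[1, 1], [1, 0]] ** n."""
    result = ((1, 0), (0, 1))   # identity matrix
    base = ((1, 1), (1, 0))
    # Exponentiation by squaring: O(log n) matrix multiplications.
    while n > 0:
        if n & 1:
            result = mat_mult(result, base)
        base = mat_mult(base, base)
        n >>= 1
    # Same element the original reads: row 0, column 1.
    return result[0][1]
```

With this sketch, `fib_exact(10)` gives 55, matching the matrix identity the gist relies on.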
hughdbrown / fixer.py
Created May 1, 2017 04:49
Python code to remove duplicate files with soft links. Good for video courses that put the same course into multiple directories.
#!/usr/bin/env python
from __future__ import print_function
from hashlib import sha1
from collections import defaultdict
import os.path
import os
def calc_sha(filename, size=256 * 1000 * 1000):
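The preview above is cut off before the body of `calc_sha`, so the author's actual logic is not shown. As a rough sketch of the approach the description suggests (hash each file, group identical hashes, replace duplicates with soft links), something like the following could work. All function names here (`file_sha1`, `find_duplicates`, `link_duplicates`) and the 1 MiB chunk size are assumptions for illustration, not the gist's own code.

```python
import os
from collections import defaultdict
from hashlib import sha1


def file_sha1(filename, chunk_size=1024 * 1024):
    """Hash a file in chunks so large videos are never fully in memory."""
    digest = sha1()
    with open(filename, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(start_dir="."):
    """Return lists of paths whose contents hash to the same SHA-1."""
    by_hash = defaultdict(list)
    for root, _dirs, files in os.walk(start_dir):
        for name in files:
            path = os.path.join(root, name)
            if not os.path.islink(path):   # skip files already linked
                by_hash[file_sha1(path)].append(path)
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]


def link_duplicates(paths):
    """Keep the first path and replace the others with symlinks to it."""
    keep = os.path.abspath(paths[0])
    for dup in paths[1:]:
        os.remove(dup)
        os.symlink(keep, dup)
```

Separating detection from linking makes it possible to print the duplicate groups and review them before anything is deleted.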
hughdbrown / find_empty_folders.py
Created April 3, 2017 20:41
List directories that are empty
#!/usr/bin/env python
from __future__ import print_function
import os
import os.path
def find_empty_folders(start_dir='.'):
    for root, dirs, files in os.walk(start_dir):
        for d in dirs:
            fulldir = os.path.join(root, d)
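The preview stops before the emptiness check. One way the loop could be completed, assuming "empty" means a directory with no entries at all, is sketched below. The generator name `list_empty_folders` is hypothetical, and the `os.listdir` test is an assumption about what the rest of the gist does.

```python
import os


def list_empty_folders(start_dir="."):
    """Yield every directory under start_dir that has no entries."""
    for root, dirs, _files in os.walk(start_dir):
        for d in dirs:
            fulldir = os.path.join(root, d)
            # An empty listing means no files and no subdirectories.
            if not os.listdir(fulldir):
                yield fulldir
```

Calling `sorted(list_empty_folders("."))` would give a stable list to print or to pass to a cleanup step.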
hughdbrown / mobility.py
Last active April 2, 2017 21:27
How correlated is rank of most mobile cities versus rank of most foreign-born cities?
#!/usr/bin/env python
from __future__ import print_function
from re import compile
from abc import abstractmethod
from scipy.stats import spearmanr, pearsonr, linregress
import numpy as np
import matplotlib.pyplot as plt
hughdbrown / bash-prefix.sh
Created March 23, 2017 14:44
How bash scripts should start
set -o errexit
set -o nounset # same as 'set -u'
set -o pipefail
# See notes in:
# http://www.davidpashley.com/articles/writing-robust-shell-scripts/
hughdbrown / progressbar-test.py
Last active February 16, 2017 17:24
Use of python progressbar
# pip install progressbar2
import time
from progressbar import (
    ProgressBar,
    Percentage, Bar, ETA,
)
values = list(range(1, 10 + 1))
with ProgressBar(widgets=[Percentage(), Bar(), ETA()], max_value=len(values)) as pbar:
    for i in values:
hughdbrown / caesar-cipher-solve.py
Created November 11, 2016 18:55
Produce candidates for solution of a Caesar cipher using a dictionary
from os.path import expanduser
def match_word(pattern, wordfile=None):
    # expanduser is imported directly, so call it without the os.path prefix
    wordfile = wordfile or expanduser("~/Downloads/sowpods.txt")
    with open(wordfile) as handle:
        for line in handle:
            word = line.rstrip().lower()
            # Map each pattern letter to the word letter at the same position
            lookup = {c: d for c, d in zip(pattern, word)}
            if "".join(lookup.get(c, '!') for c in pattern) == word:
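The preview ends at the `if`, so what the gist does with a match is not shown. The sketch below applies the same letter-mapping test to an in-memory word list instead of the SOWPODS file, assuming matches are simply yielded. The name `match_words` and the `words` parameter are hypothetical.

```python
def match_words(pattern, words):
    """Yield words whose letter-repetition structure fits the pattern."""
    pattern = pattern.lower()
    for word in words:
        word = word.strip().lower()
        if len(word) != len(pattern):
            continue
        # A repeated pattern letter keeps only its last mapping, so
        # rebuilding the word from the pattern fails unless every
        # occurrence of that letter maps to the same word letter.
        lookup = dict(zip(pattern, word))
        if "".join(lookup[c] for c in pattern) == word:
            yield word
```

For example, the pattern `abba` accepts `noon`, `deed`, and `toot` but rejects `moon` and `tree`. Like the original test, this sketch does not require distinct pattern letters to map to distinct word letters.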
hughdbrown / sync-files.py
Created September 27, 2016 19:27
Move updated files from src directory to dst directory
#!/usr/bin/env python
from __future__ import print_function
import os
import sys
from collections import defaultdict
from fnmatch import fnmatch
from pprint import pprint
from hashlib import sha1
hughdbrown / graphlab-create 1.8.3.txt
Created September 26, 2016 02:07
Coursera course requires version 1.8.3, which is wrong
(data2) C:\Users\hughdbrown>pip install graphlab-create==1.8
Collecting graphlab-create==1.8
Could not find a version that satisfies the requirement graphlab-create==1.8 (from versions: 2.1)
No matching distribution found for graphlab-create==1.8
(data2) C:\Users\hughdbrown>pip install graphlab-create==1.8.3
Collecting graphlab-create==1.8.3
Could not find a version that satisfies the requirement graphlab-create==1.8.3 (from versions: 2.1)
No matching distribution found for graphlab-create==1.8.3
hughdbrown / docker-run-pipeline.sh
Last active September 21, 2016 17:21
Shell script to run docker image for O'Reilly Kafka-Cassandra-Spark course
#!/usr/bin/env bash
# On Linux, env receives "bash -e" as a single argument, so set the option here instead.
set -e
export image_version="2.0.1"
export image_name="datafellas/distributed-pipeline-quotes:${image_version}"
sudo docker pull "${image_name}"
sudo docker run --rm -it \
    --memory=8g \
    --cpuset-cpus="0-3" \