@sangheestyle
sangheestyle / howto.txt
Last active August 29, 2015 14:07
How to install Debian and some Python data science tools.
OK, here is what you should do to use Python pandas on Debian Wheezy:
Step 1: Make a bootable USB key
a. Get a USB key
b. Download a Debian live ISO image (the Xfce image is recommended)
c. Write the ISO image to the USB key (dd is recommended)
Step 2: Install & Boot
d. Boot your machine from the Debian USB key (UEFI is recommended)
e. Configure the system
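Step c recommends dd for writing the image. Here is a minimal sketch of what that command might look like, built in Python so it can be inspected before anything is run. The ISO file name `debian-live.iso`, the device `/dev/sdX`, and the `4M` block size are placeholders, not values from the notes above. Double-check the device name, because dd overwrites the target without asking.

```python
import shlex


def build_dd_command(iso_path, device, block_size="4M"):
    """Return the dd argument list that would copy iso_path onto device."""
    return ["dd", f"if={iso_path}", f"of={device}", f"bs={block_size}"]


# Placeholder names: replace them with the real ISO file and USB device.
cmd = build_dd_command("debian-live.iso", "/dev/sdX")

# Only print the command; running it is left to the reader.
print("sudo " + " ".join(shlex.quote(part) for part in cmd))
```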
@sangheestyle
sangheestyle / medici.py
Last active August 29, 2015 14:06
ps2-6.py
# Reference:
# http://networkx.github.io/documentation/networkx-1.9.1/
# reference/algorithms.centrality.html
import operator
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
import itertools
@sangheestyle
sangheestyle / scratch
Created September 8, 2014 07:01
Scratch
nothing
@sangheestyle
sangheestyle / grand_sum.py
Last active August 29, 2015 14:06
Results
import os
import json
summary_folder = "summary"
summary = {}
for path in os.listdir(summary_folder):
    path_summary = json.load(open(os.path.join(summary_folder, path), "r"))
    if not len(summary):
        summary = path_summary
    else:
@sangheestyle
sangheestyle / summary.py
Last active August 29, 2015 14:06
Summary sp
import sys
import json
from os import listdir
from os.path import isfile, join, basename
def get_file_paths(root):
    file_paths = []
    for f in listdir(root):
        if isfile(join(root, f)):
@sangheestyle
sangheestyle / run_sssp.sh
Last active August 29, 2015 14:06
Run SSSP
#!/bin/bash
mkdir sssp
cd sssp
curl "https://dl.dropboxusercontent.com/u/7571776/facebook100_txt_only.zip" -o "facebook100_txt_only.zip"
unzip facebook100_txt_only.zip
curl https://gist.githubusercontent.com/sangheestyle/6335b209b4779dddc5ac/raw/47a803d6ccfa727efb80d5c32f5e7361d32c861a/ps16c.py > ps16c.py
mkdir out
python ps16c.py facebook100txt $1 $2
cd ..
import sys
from os import listdir, stat
from os.path import isfile, join, basename
import time
import multiprocessing
import json
import networkx as nx
from sklearn.utils.graph import graph_shortest_path as gsp
def get_file_paths(root):
@sangheestyle
sangheestyle / install_pandas_linux.md
Created September 3, 2014 20:16
Install Pandas on Debian from scratch

Install Pandas on Debian from scratch

a. Install requirements

$ sudo apt-get install python-numpy python-scipy
$ sudo apt-get install libopenblas-base libatlas3-base

b. Set the symbolic links for libblas.so.3 and liblapack.so.3. This step affects the performance of NumPy, SciPy, and pandas, so check which implementation performs better. Note that it can also break installing pandas with pip.
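Before changing the links, it helps to see where they currently point. This is a small sketch using only the standard library. The two paths are assumptions (the location differs between Debian releases and architectures), and `resolve_library` is a helper name made up for this example.

```python
import os


def resolve_library(path):
    """Follow symbolic links and return the file that path finally points to."""
    return os.path.realpath(path)


# Assumed locations; adjust them to match the system being checked.
for candidate in ("/usr/lib/libblas.so.3", "/usr/lib/liblapack.so.3"):
    if os.path.lexists(candidate):
        print(candidate, "->", resolve_library(candidate))
    else:
        print(candidate, "not found")
```

If the resolved file name mentions openblas or atlas, that is most likely the implementation NumPy and SciPy will load through these links.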

@sangheestyle
sangheestyle / percent
Created August 29, 2014 16:11
print with percent
$ git log --grep="fix" --no-merges --format="%ae" | awk '{a[$0]++} END{for (i in a) if (a[i]>1) printf "%5.2f%%\t%s\n", 100*a[i]/NR, i}'| sort -nr
50.00% [email protected]
21.43% [email protected]
14.29% [email protected]
7.14% [email protected]
$ git log --no-merges --format="%ae" | awk '{a[$0]++} END{for (i in a) if (a[i]>1) printf "%5.2f%%\t%s\n", 100*a[i]/NR, i}'| sort -nr
40.78% [email protected]
22.57% [email protected]
6.53% [email protected]
2.52% [email protected]
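The awk one-liner above counts commits per author e-mail, keeps authors with more than one commit, and prints each author's share of all commits. A rough Python equivalent is sketched below, assuming the e-mails are already in a list. The sample addresses and the name `author_shares` are made up for illustration.

```python
from collections import Counter


def author_shares(emails, min_commits=2):
    """Return (percent, email) pairs for authors with at least min_commits commits."""
    counts = Counter(emails)
    total = len(emails)
    shares = [(100.0 * n / total, email)
              for email, n in counts.items() if n >= min_commits]
    # Highest share first, like `sort -nr`.
    return sorted(shares, reverse=True)


# Hypothetical input: one author e-mail per commit, as `git log --format="%ae"` prints them.
sample = ["alice@example.com", "alice@example.com", "bob@example.com", "carol@example.com"]
for percent, email in author_shares(sample):
    print("%5.2f%%\t%s" % (percent, email))
```

As in the awk version, the percentage is taken over all commits, including those by authors who are filtered out.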
@sangheestyle
sangheestyle / developers_relationship.md
Last active August 29, 2015 14:01
Analyzing developers' commit relationship

This scratch analyzes relationships between developers by examining their commits, for example modifications to the same chunk.
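One possible way to turn "same chunk modification" into a number is to count, for every pair of developers, how many chunks both of them have changed. The sketch below assumes the chunk-to-developers mapping has already been extracted from the commit history. The input format, the sample data, and the function name are all hypothetical and not taken from the original note.

```python
from collections import Counter
from itertools import combinations


def co_modification_counts(chunk_authors):
    """Count, for each pair of developers, how many chunks both have modified."""
    pairs = Counter()
    for authors in chunk_authors.values():
        # Sort so each pair always appears in the same order.
        for pair in combinations(sorted(set(authors)), 2):
            pairs[pair] += 1
    return pairs


# Hypothetical input: (file, chunk id) -> developers who changed that chunk.
chunk_authors = {
    ("core.py", 1): ["alice", "bob"],
    ("core.py", 2): ["alice", "bob", "carol"],
    ("util.py", 1): ["carol"],
}
for (dev_a, dev_b), shared in co_modification_counts(chunk_authors).most_common():
    print(dev_a, dev_b, shared)
```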