@sangheestyle
sangheestyle / repo.py
Last active August 29, 2015 13:58
Fetch repository information from GitHub based on a query
import os
import json
import sys
from multiprocessing import Pool
from subprocess import call, check_output
from datetime import datetime, date, timedelta
from time import sleep
import github3 as github
"""
@sangheestyle
sangheestyle / analyze_gutenberg.py
Last active October 19, 2017 18:39
Analyzing books in gutenberg with LDA
import logging
import os
import zipfile
import multiprocessing
from subprocess import call
from gensim.corpora.textcorpus import TextCorpus
from gensim.corpora import Dictionary, MmCorpus
from gensim.models import TfidfModel
from gensim import utils
@sangheestyle
sangheestyle / scratchpad
Created April 20, 2014 23:00
scratchpad
@sangheestyle
sangheestyle / repo.py
Last active August 29, 2015 14:00
git filter among repos
import os
from subprocess import check_output, CalledProcessError
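def get_repo_paths(root=None):
    # Collect the working-tree path of every git repository under root.
    repo_paths = []
    for dirpath, _, _ in os.walk(root):
        if dirpath.endswith('/.git'):
            # Strip the trailing '/.git' (5 characters) to get the repo path.
            repo_paths.append(dirpath[:-5])
    return repo_paths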
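The preview above stops at repository discovery. As a rough Python 3 sketch of how the imported `check_output` and `CalledProcessError` might then be used to filter commits across repositories: the names `find_repo_paths` and `grep_commits` are hypothetical, not taken from the gist, and the choice of `git log --oneline --grep` is an assumption about what "git filter" means here.

```python
import os
from subprocess import check_output, CalledProcessError


def find_repo_paths(root):
    """Return working-tree paths of all git repositories under root."""
    repo_paths = []
    for dirpath, dirnames, _ in os.walk(root):
        if '.git' in dirnames:
            repo_paths.append(dirpath)
            dirnames.remove('.git')  # do not descend into git internals
    return sorted(repo_paths)


def grep_commits(repo_path, pattern):
    """Return one-line summaries of commits whose message matches pattern."""
    try:
        out = check_output(
            ['git', 'log', '--oneline', '--grep=' + pattern],
            cwd=repo_path,
        )
    except (CalledProcessError, OSError):
        return []  # not a usable repository, or git is not installed
    return out.decode('utf-8', 'replace').splitlines()
```

Calling `grep_commits(path, 'fix')` for each path from `find_repo_paths(root)` would list matching commits per repository; repositories where `git log` fails are treated as having no matches.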
@sangheestyle
sangheestyle / MyEncoder.py
Last active August 29, 2015 14:01
Python: convert a class instance to JSON without None values
from json import JSONEncoder
class MyEncoder(JSONEncoder):
    def default(self, o):
        temp = o.__dict__
        a = dict()
        for k, v in temp.iteritems():
            if v is not None:
                if type(v) is list:
                    v = [x for x in v if x is not None]
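The preview is cut off after the list filter, so the rest of the original is unknown. A minimal Python 3 sketch of the same idea, assuming the filtered values are simply collected into a dict and returned, could look like this. `NoneFreeEncoder` and `Book` are hypothetical names used only for illustration.

```python
import json


class NoneFreeEncoder(json.JSONEncoder):
    """Serialize plain objects via __dict__, skipping attributes set to None."""

    def default(self, o):
        cleaned = {}
        for key, value in vars(o).items():
            if value is None:
                continue
            if isinstance(value, list):
                # Drop None entries inside lists as well.
                value = [item for item in value if item is not None]
            cleaned[key] = value
        return cleaned


class Book:  # hypothetical example class
    def __init__(self, title, author=None, tags=None):
        self.title = title
        self.author = author
        self.tags = tags


book = Book('Emma', tags=['novel', None])
print(json.dumps(book, cls=NoneFreeEncoder, sort_keys=True))
# prints {"tags": ["novel"], "title": "Emma"}
```

`default` is only called for objects the encoder cannot serialize on its own, so built-in types pass through unchanged.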
@sangheestyle
sangheestyle / how_to_install.md
Last active August 29, 2015 14:01
While installing Debian Wheezy on recent hardware, I needed a few extra steps to make the system boot correctly without hanging.

Install NVIDIA driver

While installing Debian, disable CSM in the BIOS menu. After the installation finishes, press e on the GRUB entry (recovery mode), modify the line described below, and then press F10.

Go to the line that starts with linux and append the following value:

nouveau.modeset=0
@sangheestyle
sangheestyle / developers_relationship.md
Last active August 29, 2015 14:01
Analyzing developers' commit relationship

This scratchpad is for analyzing relationships between developers through their commits, for example modifications to the same chunk of code.

@sangheestyle
sangheestyle / percent
Created August 29, 2014 16:11
print with percent
$ git log --grep="fix" --no-merges --format="%ae" | awk '{a[$0]++} END{for (i in a) if (a[i]>1) printf "%5.2f%%\t%s\n", 100*a[i]/NR, i}'| sort -nr
50.00% [email protected]
21.43% [email protected]
14.29% [email protected]
7.14% [email protected]
$ git log --no-merges --format="%ae" | awk '{a[$0]++} END{for (i in a) if (a[i]>1) printf "%5.2f%%\t%s\n", 100*a[i]/NR, i}'| sort -nr
40.78% [email protected]
22.57% [email protected]
6.53% [email protected]
2.52% [email protected]
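For readers less familiar with awk, the counting step in the pipelines above can be sketched in Python. The function name `author_shares` and the sample addresses are hypothetical; the input stands in for the lines printed by `git log --format="%ae"`.

```python
from collections import Counter


def author_shares(emails, min_commits=2):
    """Return (percent, email) pairs, largest share first.

    Mirrors the awk filter a[i] > 1: authors with fewer than
    min_commits commits are dropped, but still count toward the total.
    """
    counts = Counter(emails)
    total = len(emails)
    shares = [
        (100.0 * n / total, email)
        for email, n in counts.items()
        if n >= min_commits
    ]
    return sorted(shares, reverse=True)


# Hypothetical addresses standing in for real `git log` output.
sample = ['a@example.com'] * 3 + ['b@example.com'] * 2 + ['c@example.com']
for percent, email in author_shares(sample):
    print('%5.2f%%\t%s' % (percent, email))
```

As in the awk version, the percentages are taken over all commits, so they do not sum to 100 when single-commit authors are filtered out.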
@sangheestyle
sangheestyle / install_pandas_linux.md
Created September 3, 2014 20:16
Install Pandas on Debian from scratch

a. install requirements

$ sudo apt-get install python-numpy python-scipy
$ sudo apt-get install libopenblas-base libatlas3-base

b. Set the symbolic links for libblas.so.3 and liblapack.so.3. This step affects the performance of NumPy, SciPy, and pandas, so check which implementation performs better. Note that it can also break installing pandas with pip.
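Before changing the links, it can help to see which implementation they currently resolve to. The following is a small sketch using only the standard library. The `/usr/lib` directory is an assumption and differs between Debian releases and architectures, and `resolve_library` is a hypothetical helper name.

```python
import os


def resolve_library(path):
    """Return the real file that a (possibly chained) symlink points to."""
    return os.path.realpath(path)


# Assumed locations; adjust the directory for your release and architecture.
for name in ('libblas.so.3', 'liblapack.so.3'):
    candidate = os.path.join('/usr/lib', name)
    if os.path.lexists(candidate):
        print(candidate, '->', resolve_library(candidate))
    else:
        print(candidate, 'not found')
```

The resolved path typically names the providing package's directory, which indicates whether the reference, ATLAS, or OpenBLAS build is active.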

from os import listdir, stat, sys
from os.path import isfile, join, basename
import time
import multiprocessing
import json
import networkx as nx
from sklearn.utils.graph import graph_shortest_path as gsp
def get_file_paths(root):