GILLES Armand armgilles

🎯
Focusing
View GitHub Profile
@armgilles
armgilles / code_scrap.csv
Created December 14, 2015 09:17
For scraping the 2015 regional elections
code_dep code_scrap
08 44
10 44
51 44
52 44
54 44
55 44
57 44
67 44
68 44
armgilles / vote_nb.csv
Created December 14, 2015 08:36
Number of votes in the first round of the regional elections
code_postal,Commune,departement,region,statut,altitude,superficie,population,code_departement,code_region,LAUT,LCMD,LCOM,LCOP,LDIV,LDVD,LDVG,LEXD,LEXG,LFG,LFN,LMAJ,LMMD,LREG,LSOC,LUD,LUG,LVEC,abstentions,blanc,code_insee,exprime,inscrit,nul,url,votant
39270,ALIEZE,JURA,FRANCHE-COMTE,Commune simple,587.0,582.0,200.0,39,43,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,3.0,23.0,0.0,0.0,0.0,0.0,16.0,11.0,1.0,65.0,1.0,39007,59.0,126.0,1.0,http://elections.interieur.gouv.fr/regionales-2015/27/2739/2739007.html,61.0
88600,CHAMP-LE-DUC,VOSGES,LORRAINE,Commune simple,454.0,396.0,500.0,88,41,0.0,0.0,0.0,0.0,4.0,0.0,0.0,0.0,6.0,11.0,109.0,0.0,0.0,1.0,0.0,63.0,56.0,16.0,192.0,10.0,88086,273.0,480.0,5.0,http://elections.interieur.gouv.fr/regionales-2015/44/4488/4488086.html,288.0
62490,NOYELLES-SOUS-BELLONNE,PAS-DE-CALAIS,NORD-PAS-DE-CALAIS,Commune simple,49.0,426.0,800.0,62,31,0.0,0.0,19.0,0.0,2.0,4.0,0.0,0.0,2.0,0.0,163.0,0.0,0.0,0.0,0.0,86.0,62.0,0.0,189.0,7.0,62627,373.0,572.0,3.0,http://elections.interieur.gouv.fr/regionales-2
armgilles / Data by commune.md
Last active November 9, 2015 10:30
List of all the data on HackAllocs

Commune-level file:

armgilles / code_naf.csv
Last active September 29, 2015 21:37
NAF code nomenclature
code_naf libelle_naf
0111Z Culture de céréales (à l'exception du riz), de légumineuses et de graines oléagineuses
0112Z Culture du riz
0113Z Culture de légumes, de melons, de racines et de tubercules
0114Z Culture de la canne à sucre
0115Z Culture du tabac
0116Z Culture de plantes à fibres
0119Z Autres cultures non permanentes
0121Z Culture de la vigne
0122Z Culture de fruits tropicaux et subtropicaux
armgilles / neuronral_network_pattern.md
Last active August 29, 2015 14:24
List of different patterns, how to use them, and why

List of patterns for neural networks:

  1. Autoencoders are the simplest of the three. They are intuitive, easy to implement, and easy to reason about (e.g. it is much easier to find good meta-parameters for them than for RBMs).
  2. RBMs are generative. That is, unlike autoencoders, which only discriminate some data vectors in favour of others, RBMs can also generate new data from a given joint distribution. They are also considered more feature-rich and flexible.
  3. CNNs are a very specific model, mostly used for one very specific (though very popular) task. Most of the top-level algorithms in image recognition today are based on CNNs in some way, but outside that niche they are hardly applicable (e.g. what is the reason to use convolution for film review analysis?).

Autoencoder

An autoencoder is a simple 3-layer neural network where the output units are directly connected back to the input units. E.g. in a network like this

armgilles / looking_best_preference_AffinityPropagation.py
Last active April 5, 2016 15:10
Search for the number of clusters found by the AffinityPropagation algorithm as the preference value varies (checking whether it converges and how many iterations it takes)
from sklearn.cluster import AffinityPropagation
import pandas as pd
import sys
import cStringIO
# You already have your features in X
aff_eps = []
for i in range(-50, 0, 5):
    # Capture the verbose output
    tdout_ = sys.stdout  # Keep track of the previous value.
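The preview stops right after the current `sys.stdout` is saved. As a minimal Python 3 sketch of the same capture idea (not the gist's own code), the standard library's `contextlib.redirect_stdout` can collect printed output in an `io.StringIO` buffer and restore `sys.stdout` automatically. `noisy_fit` and the message it prints are hypothetical stand-ins for a verbose `fit` call.

```python
import contextlib
import io


def noisy_fit(preference):
    # Hypothetical stand-in for a verbose fit call: it only prints a message.
    print("preference=%d: finished (made-up message)" % preference)


captured = {}
for preference in range(-50, 0, 5):
    buffer = io.StringIO()
    # Everything printed inside the block goes to `buffer`;
    # sys.stdout is restored when the block exits.
    with contextlib.redirect_stdout(buffer):
        noisy_fit(preference)
    captured[preference] = buffer.getvalue().strip()
```

Each captured string could then be parsed for whatever the verbose output reports, one entry per preference value.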
armgilles / looking_best_eps_dbscan.py
Last active August 29, 2015 14:21
Search for the number of clusters found by the DBSCAN algorithm as the eps value varies
from sklearn.cluster import DBSCAN
import pandas as pd
# You already have your features in X
dbscan_eps = []
for i in [x / 10.0 for x in range(1, 20, 1)]:
    db = DBSCAN(eps=i).fit(X)
    # Count the clusters, ignoring the noise label (-1)
    n_clusters_ = len(set(db.labels_)) - (1 if -1 in db.labels_ else 0)
    print("eps =  " + str(i) + "  cluster = " + str(n_clusters_))
    dbscan_eps.append({'eps' : i,
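The cluster count in the loop relies on scikit-learn's DBSCAN marking noise points with the label `-1`, which should not count as a cluster. A small standalone sketch of that counting step, using a hand-written label list rather than real DBSCAN output (`count_clusters` and `example_labels` are illustrative names, not part of the gist):

```python
def count_clusters(labels):
    """Number of distinct cluster labels, not counting the noise label -1."""
    unique = set(labels)
    unique.discard(-1)  # no error if -1 is absent
    return len(unique)


# Hand-written labels for illustration: three clusters (0, 1, 2) plus noise.
example_labels = [0, 0, 1, -1, 2, 1, -1]
n_clusters = count_clusters(example_labels)
```

Using `set.discard` gives the same result as subtracting 1 when `-1` is present, without the conditional.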
armgilles / optimal_bin_hist.md
Created May 12, 2015 18:25
Looking for the optimal number of bins for a histogram

from numpy import log, log2, sqrt
from scipy.stats import kurtosis
sturges = lambda n: int(log2(n) + 1)
square_root = lambda n: int(sqrt(n))
doanes = lambda data: int(1 + log(len(data)) + log(1 + kurtosis(data) * (len(data) / 6.) ** 0.5))

n = len(titanic)
sturges(n), square_root(n), doanes(titanic.fare.dropna())

titanic.fare.hist(bins=doanes(titanic.fare.dropna()))
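As a rough check of what these rules return, here is a dependency-free Python sketch of the same three formulas. It assumes the population (biased) excess kurtosis, which is what `scipy.stats.kurtosis` computes by default. The function names and the sample data are illustrative, and `doanes_variant` mirrors the kurtosis-based expression above rather than the textbook Doane formula.

```python
import math


def sturges(n):
    return int(math.log2(n) + 1)


def square_root(n):
    return int(math.sqrt(n))


def excess_kurtosis(values):
    # Biased (population) excess kurtosis: m4 / m2**2 - 3.
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


def doanes_variant(values):
    # Same expression as the `doanes` lambda above; the inner log is only
    # defined when 1 + kurtosis * sqrt(n / 6) is positive.
    n = len(values)
    return int(1 + math.log(n) + math.log(1 + excess_kurtosis(values) * (n / 6.0) ** 0.5))


# Made-up, heavy-tailed sample: nine zeros and one large value.
sample = [0.0] * 9 + [10.0]
bins = (sturges(len(sample)), square_root(len(sample)), doanes_variant(sample))
```

For data with sufficiently negative excess kurtosis the inner logarithm is undefined, so this variant is only usable on heavy-tailed data such as the made-up sample here.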