@satomacoto
satomacoto / cca.py
Created April 7, 2013 06:34
Canonical correlation analysis (CCA)
#!/usr/bin/env python
# -*- coding:utf-8 -*-
'''
Canonical correlation analysis
cca.py
'''
import numpy as np
import scipy as sp
from scipy import linalg as LA
@satomacoto
satomacoto / distance_matrix.py
Created April 7, 2013 06:17
distance matrix with numpy/scipy
# http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html
from scipy.spatial.distance import squareform, pdist
def test_distance():
    X = [[2,3,4],[0,0,0],[1,1,1],[2,2,2]]
    print(pdist(X, 'euclidean'))
    print(squareform(pdist(X, 'euclidean')))
    # polynomial_kernel is not defined in this excerpt
    print(squareform(pdist(X, polynomial_kernel)))
test_distance()
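To make clear what the condensed and square forms above contain, here is a minimal pure-Python sketch of the same computation. It is not SciPy's implementation; the helper names `pairwise_condensed` and `to_square` are hypothetical and only mirror what `pdist(X, 'euclidean')` and `squareform` return for this input.

```python
import math
from itertools import combinations

def pairwise_condensed(points):
    """Euclidean distance for every pair i < j, in the pair order (0,1), (0,2), ..."""
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
        for p, q in combinations(points, 2)
    ]

def to_square(condensed, n):
    """Expand a condensed distance list into a symmetric n x n matrix."""
    square = [[0.0] * n for _ in range(n)]
    for (i, j), d in zip(combinations(range(n), 2), condensed):
        square[i][j] = square[j][i] = d
    return square

X = [[2, 3, 4], [0, 0, 0], [1, 1, 1], [2, 2, 2]]
condensed = pairwise_condensed(X)       # 6 values for 4 points
square = to_square(condensed, len(X))   # 4 x 4, zeros on the diagonal
print(condensed)
print(square)
```

A condensed list for n points has n * (n - 1) / 2 entries, which is why the square form is usually built only when a full matrix is needed.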
@satomacoto
satomacoto / create.py
Last active December 15, 2022 10:34
kNN on xvideos.com-db.csv
# -*- coding:utf-8 -*-
from pymongo import MongoClient
client = MongoClient()
db = client.xvideos
def create_db():
    f = open('xvideos.com-db.csv')
    for line in f:
@satomacoto
satomacoto / cooccur-top20.txt
Last active March 24, 2017 05:52
count tag occurrence and cooccurrence on xvideos.com-db.csv
('blowjob', 'hardcore') 594241
('blowjob', 'brunette') 336485
('blowjob', 'teen') 334746
('amateur', 'teen') 322670
('brunette', 'hardcore') 320892
('hardcore', 'teen') 302974
('blonde', 'brunette') 282348
('blonde', 'blowjob') 272567
('blowjob', 'oral') 269461
('blonde', 'hardcore') 252474
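Counts in this pair-and-number format can be produced in several ways; the gist's own script is not shown in this listing. As a rough sketch under that caveat, one option is `collections.Counter` over sorted tag pairs. The function name `count_tags` and the placeholder tags are hypothetical.

```python
from collections import Counter
from itertools import combinations

def count_tags(tag_lists):
    """Count how often each tag occurs and how often each tag pair co-occurs."""
    occur = Counter()
    cooccur = Counter()
    for tags in tag_lists:
        unique = sorted(set(tags))  # sorting makes every pair alphabetical
        occur.update(unique)
        cooccur.update(combinations(unique, 2))
    return occur, cooccur

# Placeholder rows; each inner list holds the tags of one record.
rows = [['x', 'y', 'z'], ['y', 'x'], ['z']]
occur, cooccur = count_tags(rows)
for pair, n in cooccur.most_common(20):
    print(pair, n)
```

Sorting each record's tags before pairing keeps `('x', 'y')` and `('y', 'x')` from being counted as different pairs.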
@satomacoto
satomacoto / get.py
Created March 31, 2013 02:38
Sort a dictionary by value
import random
def sort_test():
    a = [random.random() for i in range(100)]
    b = [random.random() for i in range(100)]
    c = dict(zip(a, b))
    # keys of c, ordered by their values
    return sorted(c, key=c.get)
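The `sorted(c, key=c.get)` idiom returns only the keys, ordered by their values. A small sketch with a fixed dictionary (the name `scores` and its contents are made up for illustration) shows that form alongside two common variants:

```python
scores = {'a': 3, 'b': 1, 'c': 2}

# Keys only, ascending by value.
keys_by_value = sorted(scores, key=scores.get)

# (key, value) pairs, ascending by value.
items_by_value = sorted(scores.items(), key=lambda kv: kv[1])

# Keys only, largest value first.
keys_descending = sorted(scores, key=scores.get, reverse=True)

print(keys_by_value)    # ['b', 'c', 'a']
print(items_by_value)   # [('b', 1), ('c', 2), ('a', 3)]
print(keys_descending)  # ['a', 'c', 'b']
```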
@satomacoto
satomacoto / genre_list.txt
Created March 30, 2013 07:45
LDA on tags in xvideos.com-db.csv
Amateur
Anal
Anime
Asian Woman
Ass
Ass to Mouths
BDSM
Big Ass
Big Cock
Big Tits
@satomacoto
satomacoto / README.md
Last active December 15, 2015 11:09
Authors Relationships based upon Not-kanji-hiragana Rubis

Visualize author relationships based on non-hiragana, non-kanji ruby annotations in Aozora Bunko.

@satomacoto
satomacoto / youtube_gdata_create_playlist.py
Last active December 15, 2015 10:29
Create a playlist from a webpage that has links to YouTube with YouTube Data API
#!/usr/bin/env python
# -*- coding:utf-8 -*-
'''
Save the YouTube videos linked from a URL (e.g. a roundup site) to a playlist
$ python youtube_gdata_playlist.py http://...
'''
# Settings required below
# cf. YouTube client
@satomacoto
satomacoto / params_test.py
Last active December 15, 2015 08:59
Test parameters
#!/usr/bin/env python
# -*- coding:utf-8 -*-
'''
try some parameters.
$ python params_test.py
[0, -1]
False
[0, 0]
False
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Viterbi algorithm
# http://en.wikipedia.org/wiki/Viterbi_algorithm
#
# > python viterbi.py
# 0 1 2
# Rainy: 0.06000 0.03840 0.01344
# Sunny: 0.24000 0.04320 0.00259
# (0.01344, ['Sunny', 'Rainy', 'Rainy'])
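The body of viterbi.py is not shown in this listing. The sketch below is one way to arrive at the table and result in the header comment, assuming the Rainy/Sunny weather example from the linked Wikipedia article (observations walk, shop, clean). The parameter values are that assumption, not something confirmed by the gist, and the function is an illustration rather than the author's code.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (probability, best state path, per-step table)."""
    # Step 0: start probability times emission of the first observation.
    table = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    path = {s: [s] for s in states}
    for o in obs[1:]:
        row, new_path = {}, {}
        for s in states:
            # Best predecessor for state s at this step.
            prob, prev = max((table[-1][p] * trans_p[p][s], p) for p in states)
            row[s] = prob * emit_p[s][o]
            new_path[s] = path[prev] + [s]
        table.append(row)
        path = new_path
    prob, last = max((table[-1][s], s) for s in states)
    return prob, path[last], table

# Assumed parameters (Wikipedia weather example).
states = ('Rainy', 'Sunny')
observations = ('walk', 'shop', 'clean')
start_p = {'Rainy': 0.6, 'Sunny': 0.4}
trans_p = {'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
           'Sunny': {'Rainy': 0.4, 'Sunny': 0.6}}
emit_p = {'Rainy': {'walk': 0.1, 'shop': 0.4, 'clean': 0.5},
          'Sunny': {'walk': 0.6, 'shop': 0.3, 'clean': 0.1}}

prob, best_path, table = viterbi(observations, states, start_p, trans_p, emit_p)
for s in states:
    print('%s: %s' % (s, ' '.join('%.5f' % row[s] for row in table)))
print('%.5f %s' % (prob, best_path))
```

With these parameters the per-step values match the table in the header comment (0.06000, 0.03840, 0.01344 for Rainy) and the best path is Sunny, Rainy, Rainy.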