Keiichi Kuroyanagi Keiku

🐢

Slowly but surely.

Artificial Intelligence Engineer / Lightning AI Ambassador

Keiku / tqdm.py

Created February 17, 2017 05:49

Print progress bar.

	import time
	from tqdm import tqdm

	pbar = tqdm(["1", "2", "3", "4", "5"])
	for char in pbar:
	    # Update the label shown to the left of the bar on each iteration.
	    pbar.set_description("Processing %s" % char)
	    time.sleep(1)
	# 0%| | 0/5 [00:00<?, ?it/s]
	# Processing 1: 20%|██████▏ | 1/5 [00:01<00:04, 1.00s/it]
	# Processing 2: 40%|████████████▍ | 2/5 [00:02<00:03, 1.00s/it]

Keiku / extract_subset.r

Last active February 20, 2017 07:12

Compute the intersection or union of multiple vectors.

	a <- c(1, 3, 5, 7, 9)
	b <- c(3, 6, 8, 9, 10)
	c <- c(2, 3, 4, 5, 7, 9)

	intersect_all <- function(...) Reduce(intersect, list(...))
	union_all <- function(...) Reduce(union, list(...))

	intersect_all(a, b, c)
	# [1] 3 9
	union_all(a, b, c)
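
For comparison, the same fold can be sketched in Python with `functools.reduce`. This is only an illustration, not part of the gist: the helper names simply mirror the R ones, and because Python sets are unordered the results are sorted before printing.

```python
from functools import reduce

a = {1, 3, 5, 7, 9}
b = {3, 6, 8, 9, 10}
c = {2, 3, 4, 5, 7, 9}


def intersect_all(*sets):
    """Fold set intersection over all arguments, like Reduce(intersect, ...)."""
    return reduce(set.intersection, sets)


def union_all(*sets):
    """Fold set union over all arguments, like Reduce(union, ...)."""
    return reduce(set.union, sets)


print(sorted(intersect_all(a, b, c)))  # [3, 9]
print(sorted(union_all(a, b, c)))      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```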

Keiku / dplyr_examples.r

Created February 23, 2017 02:12

Example code for the dplyr package.

	library(dplyr)

	iris_df <- as_data_frame(iris)
	iris_df %>% rename_(.dots = setNames(names(.), toupper(names(.)))) %>% head(2)
	# A tibble: 2 × 5
	#   SEPAL.LENGTH SEPAL.WIDTH PETAL.LENGTH PETAL.WIDTH SPECIES
	#          <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
	# 1          5.1         3.5          1.4         0.2  setosa
	# 2          4.9         3.0          1.4         0.2  setosa

Keiku / roc_auc.py

Last active October 5, 2022 01:52

Plot ROC curve.

	import matplotlib.pyplot as plt
	from sklearn.metrics import roc_curve, auc

	import seaborn as sns
	sns.set('talk', 'whitegrid', 'dark', font_scale=1.5, font='Ricty',
	        rc={"lines.linewidth": 2, 'grid.linestyle': '--'})

	fpr, tpr, _ = roc_curve([1, 0, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.7, 0.6, 0.5, 0.4])
	roc_auc = auc(fpr, tpr)
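
The preview stops after `auc(fpr, tpr)`. As a cross-check on what those two calls compute, here is a minimal standard-library sketch that builds the ROC points by sweeping every distinct score as a threshold and integrates them with the trapezoidal rule. The helper names `roc_points` and `trapezoid_auc` are hypothetical and not part of the gist. scikit-learn may drop collinear points from its output, so the point lists can differ while the area stays the same.

```python
y_true = [1, 0, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.7, 0.6, 0.5, 0.4]


def roc_points(y_true, y_score):
    """Return (fpr, tpr) lists with one point per distinct score threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    fpr, tpr = [0.0], [0.0]  # start at the origin (nothing predicted positive)
    for threshold in sorted(set(y_score), reverse=True):
        # Everything scoring at or above the threshold is predicted positive.
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear curve through the (fpr, tpr) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for x0, x1, y0, y1 in zip(fpr, fpr[1:], tpr, tpr[1:]))


manual_fpr, manual_tpr = roc_points(y_true, y_score)
manual_auc = trapezoid_auc(manual_fpr, manual_tpr)
print(round(manual_auc, 4))  # 0.7083
```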

Keiku / impute.py

Created March 10, 2017 01:48

Impute missing values in some columns with pandas.

	import pandas as pd

	df = pd.DataFrame({'A': ['A1', 'A2', 'A3'], 'B': [None, 'B2', None]})
	df
	# Out[51]:
	#     A     B
	# 0  A1  None
	# 1  A2    B2
	# 2  A3  None
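
The preview ends before the imputation step itself. One possible way to fill the gaps, sketched here under the assumption that a fixed placeholder is acceptable (the value `'B0'` is hypothetical and not taken from the gist), is `DataFrame.fillna` with a per-column mapping:

```python
import pandas as pd

df = pd.DataFrame({'A': ['A1', 'A2', 'A3'], 'B': [None, 'B2', None]})

# Assumption: missing entries in column B are replaced by a constant placeholder.
# Passing a dict restricts the fill to the listed columns.
df_filled = df.fillna({'B': 'B0'})

print(df_filled['B'].tolist())  # ['B0', 'B2', 'B0']
```

`fillna` returns a new frame here, so the original `df` keeps its missing values.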

Keiku / dplyr_se.r

Created March 10, 2017 11:07

Summarising by standard evaluation with dplyr.

Keiku / Modeling_GermanCredit.r

Created March 17, 2017 08:38

Source code for section 11-3, "Trying out machine learning with R", of データサイエンティスト養成読本 登竜門編.

	# Install the packages
	pkgs <- c("dplyr", "rpart", "rpart.plot", "rattle", "mlr", "evtree")
	install.packages(pkgs, quiet = TRUE)

	# Load the packages
	library("dplyr")
	library("rattle")
	library("mlr")
	library("evtree")

Keiku / extract_tfidf_vector.py

Last active April 11, 2017 07:40

Extract the tf-idf vector.

	text = ['This is a string', 'This is another string', 'TFIDF computation calculation', 'TfIDF is the product of TF and IDF']

	from sklearn.feature_extraction.text import TfidfVectorizer
	vectorizer = TfidfVectorizer(max_df=1.0, min_df=1, stop_words='english', norm=None)

	X = vectorizer.fit_transform(text)
	# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out().
	X_vocab = vectorizer.get_feature_names_out().tolist()
	# Out[1]: ['calculation', 'computation', 'idf', 'product', 'string', 'tf', 'tfidf']
	X_mat = X.todense()
	# Out[2]:
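
The matrix output is cut off in the preview. To show where its values come from, here is a minimal standard-library sketch of the weighting that `TfidfVectorizer` applies with `norm=None` and its default smoothing: raw term count times `ln((1 + n) / (1 + df)) + 1`. The `STOP_WORDS` set is an assumption, a hand-picked subset of the English stop-word list that happens to cover this corpus, and `tokenize` is a hypothetical helper.

```python
import math
import re
from collections import Counter

text = ['This is a string', 'This is another string',
        'TFIDF computation calculation', 'TfIDF is the product of TF and IDF']

# Assumption: only the stop words that actually occur in this corpus.
STOP_WORDS = {'this', 'is', 'a', 'another', 'the', 'of', 'and'}


def tokenize(doc):
    """Lowercase, keep tokens of two or more word characters, drop stop words."""
    return [t for t in re.findall(r'\b\w\w+\b', doc.lower()) if t not in STOP_WORDS]


docs = [tokenize(d) for d in text]
vocab = sorted({t for doc in docs for t in doc})
n_docs = len(docs)

# Document frequency: the number of documents that contain each term.
doc_freq = Counter(t for doc in docs for t in set(doc))

# Smoothed inverse document frequency.
idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}

# Unnormalised tf-idf: raw term count times idf, one row per document.
tfidf_rows = [[Counter(doc)[t] * idf[t] for t in vocab] for doc in docs]

print(vocab)
# ['calculation', 'computation', 'idf', 'product', 'string', 'tf', 'tfidf']
print([round(v, 4) for v in tfidf_rows[0]])
# [0.0, 0.0, 0.0, 0.0, 1.5108, 0.0, 0.0]
```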

Keiku / extract_onehot_vector.py

Created April 12, 2017 06:30

Extract the one-hot encoding vector.

	import numpy as np
	from sklearn.preprocessing import LabelEncoder, OneHotEncoder

	X_str = np.array([['a', 'dog', 'red'], ['b', 'cat', 'green']])
	# transform to integer
	X_int = LabelEncoder().fit_transform(X_str.ravel()).reshape(*X_str.shape)
	# transform to binary
	X_bin = OneHotEncoder().fit_transform(X_int).toarray()

	print(X_bin)
	# [[ 1. 0. 0. 1. 0. 1.]

Keiku / OrderedDict_sample.py

Last active April 13, 2017 03:35

Get keys/values from sorted OrderedDict.

	from collections import OrderedDict

	d = {'A': 3,
	     'B': 2,
	     'C': 1}

	OrderedDict(sorted(d.items(), key=lambda x: x[0])).values()
	# Out[1]: odict_values([3, 2, 1])
	OrderedDict(sorted(d.items(), key=lambda x: x[1])).values()
	# Out[2]: odict_values([1, 2, 3])
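
The preview shows only `.values()`. Since the description also mentions keys, a short sketch of reading both from the value-sorted mapping follows; the variable names `by_value` and `by_value_desc` are illustrative, not from the gist.

```python
from collections import OrderedDict

d = {'A': 3, 'B': 2, 'C': 1}

# Sort the (key, value) pairs by value, then keep that order in an OrderedDict.
by_value = OrderedDict(sorted(d.items(), key=lambda x: x[1]))
print(list(by_value.keys()))    # ['C', 'B', 'A']
print(list(by_value.values()))  # [1, 2, 3]

# reverse=True gives descending order by value.
by_value_desc = OrderedDict(sorted(d.items(), key=lambda x: x[1], reverse=True))
print(list(by_value_desc.items()))  # [('A', 3), ('B', 2), ('C', 1)]
```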