amorgun’s gists

amorgun / one-hot.py

Created March 4, 2016 13:35 — forked from ramhiser/one-hot.py

Apply one-hot encoding to a pandas DataFrame

	import pandas as pd
	import numpy as np
	from sklearn.feature_extraction import DictVectorizer

	def encode_onehot(df, cols):
	    """
	    One-hot encoding is applied to columns specified in a pandas DataFrame.

	    Modified from: https://gist.github.com/kljensen/5452382
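
The preview above stops inside the docstring, so the gist's actual body is not shown here. As a rough illustration of what one-hot encoding selected DataFrame columns produces, here is a minimal sketch that swaps in `pandas.get_dummies` instead of the `DictVectorizer` approach the gist imports. The function name `encode_onehot_sketch` is hypothetical.

```python
import pandas as pd


def encode_onehot_sketch(df, cols):
    """Return a copy of df with each column in cols replaced by 0/1 indicator columns.

    Hypothetical helper: uses pandas.get_dummies, not the gist's DictVectorizer.
    """
    # Each listed column becomes one indicator column per distinct value,
    # named "<column>_<value>"; the other columns are kept unchanged.
    return pd.get_dummies(df, columns=cols, dtype=int)
```

For a frame with a categorical `color` column and a numeric `size` column, `encode_onehot_sketch(df, ['color'])` keeps `size` and adds one `color_<value>` column per distinct color.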

amorgun / virtualenv.md

Last active March 31, 2016 07:22

How to create new virtualenv

virtualenv --no-site-packages --distribute -p `which python3` <env_name>

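or simply, with the built-in venv module (the pyvenv wrapper was deprecated in Python 3.6 and removed in 3.8)

python3 -m venv <env_name>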

amorgun / kill.sh

Created May 31, 2016 11:59

Kill all processes by name

sudo kill -9 $(ps aux | grep <name> | grep -v grep | awk '{print $2}')

amorgun / script.py

Last active February 14, 2017 12:09

Tkinter perspective experiment

	import tkinter
	import time

	master = tkinter.Tk()

	canvas = tkinter.Canvas(master, width=1000, height=1000)
	canvas.pack()

	def animation():
	    def draw_point(x, y, c='black'):

amorgun / sparse_hstack_external_memory.py

Last active March 13, 2018 14:54

Efficient sparse csr matrix hstack

	import numpy as np
	import scipy as sp
	import scipy.sparse
	import tempfile


	def hstack(parts):
	    with tempfile.TemporaryFile() as data_file, tempfile.TemporaryFile() as indices_file:
	        data = np.memmap(data_file,
	                         dtype=parts[0].dtype,
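
The preview is cut off before the stacking logic, so the following is only a sketch of the general idea behind a CSR hstack: preallocate the output `data`, `indices`, and `indptr` arrays, then copy each row of each part with its column indices shifted by the widths of the parts to its left. It is not the gist's implementation. It uses ordinary in-memory NumPy arrays where the gist appears to use `np.memmap` over temporary files, and the name `hstack_csr_sketch` is hypothetical.

```python
import numpy as np
import scipy.sparse


def hstack_csr_sketch(parts):
    """Horizontally stack sparse matrices by filling the CSR arrays directly.

    Hypothetical in-memory sketch; assumes all parts have the same row count.
    """
    parts = [p.tocsr() for p in parts]
    n_rows = parts[0].shape[0]
    n_cols = sum(p.shape[1] for p in parts)
    nnz = sum(p.nnz for p in parts)

    # Preallocate the three CSR arrays for the result.
    data = np.empty(nnz, dtype=np.result_type(*[p.dtype for p in parts]))
    indices = np.empty(nnz, dtype=np.int64)
    indptr = np.zeros(n_rows + 1, dtype=np.int64)

    pos = 0
    for row in range(n_rows):
        col_offset = 0
        for p in parts:
            start, end = p.indptr[row], p.indptr[row + 1]
            count = end - start
            data[pos:pos + count] = p.data[start:end]
            # Shift column indices past the parts already placed in this row.
            indices[pos:pos + count] = p.indices[start:end] + col_offset
            pos += count
            col_offset += p.shape[1]
        indptr[row + 1] = pos

    return scipy.sparse.csr_matrix((data, indices, indptr), shape=(n_rows, n_cols))
```

The Python-level row loop keeps the sketch readable rather than fast. The result should match `scipy.sparse.hstack(parts).tocsr()` for parts with equal row counts.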

amorgun / convert_svm.py

Created February 27, 2017 11:24

SVM hack

	from scipy.sparse import csr_matrix
	import numpy as np

	def make_sparse(clf):
	    """
	    Make an sklearn.svm.SVC trained on dense data work on sparse features without fitting it again.
	    """
	    clf._sparse = True
	    clf.support_vectors_ = csr_matrix(clf.support_vectors_)

amorgun / sample.py

Created November 10, 2017 10:51

SqlAlchemy postgres bulk upsert

	from typing import Any, Mapping, Sequence

	from sqlalchemy.dialects import postgresql
	from sqlalchemy.orm import Session

	# MyModel is assumed to be an ORM model defined elsewhere.
	def bulk_upsert(session: Session,
	                items: Sequence[Mapping[str, Any]]):
	    session.execute(
	        postgresql.insert(MyModel.__table__)
	        .values(items)
	        .on_conflict_do_update(
	            index_elements=[MyModel.id],
	            set_={MyModel.my_field.name: 'new_value'},
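
The preview ends before the statement is closed. To show the kind of SQL an `on_conflict_do_update` statement renders to, here is a small sketch that builds a similar statement and compiles it against the PostgreSQL dialect without connecting to a database. The table `my_table` is a hypothetical stand-in for `MyModel.__table__`. Unlike the preview, which sets a constant, this sketch updates the field from the incoming row through `excluded`, which is a common variant and not necessarily what the gist does.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.dialects import postgresql

# Hypothetical table standing in for MyModel.__table__.
my_table = Table(
    'my_model',
    MetaData(),
    Column('id', Integer, primary_key=True),
    Column('my_field', String),
)


def build_upsert(items):
    """Build an INSERT ... ON CONFLICT (id) DO UPDATE statement for the given rows."""
    insert_stmt = postgresql.insert(my_table).values(items)
    return insert_stmt.on_conflict_do_update(
        index_elements=[my_table.c.id],
        # "excluded" refers to the row that was proposed for insertion.
        set_={'my_field': insert_stmt.excluded.my_field},
    )


def render_sql(stmt):
    """Compile the statement to a PostgreSQL SQL string; no connection is needed."""
    return str(stmt.compile(dialect=postgresql.dialect()))
```

Calling `render_sql(build_upsert(rows))` is a quick way to inspect the generated `ON CONFLICT` clause before passing the statement to `session.execute`.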

amorgun / jupyter.service

Created June 19, 2018 06:15 — forked from whophil/jupyter.service

A systemd script for running a Jupyter notebook server.

	# As of Ubuntu 16.04, systemd is the default init system.
	# It is simpler than https://gist.github.com/Doowon/38910829898a6624ce4ed554f082c4dd

	[Unit]
	Description=Jupyter Notebook

	[Service]
	Type=simple
	PIDFile=/run/jupyter.pid
	ExecStart=/home/phil/Enthought/Canopy_64bit/User/bin/jupyter-notebook --config=/home/phil/.jupyter/jupyter_notebook_config.py

amorgun / svnmv.sh

Created November 1, 2018 13:00

Register moved file in SVN

	# register file moved without svn mv
	set -eu
	shopt -s expand_aliases
	. ~/.bash_aliases
	old=$1
	new=$2
	tmp=$(mktemp svnmove.XXXXXX)
	svn rm --keep-local "$new"
	mv "$new" "$tmp"
	svn revert "$old"

amorgun / code.py

Created July 10, 2019 16:57

Pretty print confusion matrix

	import pandas as pd
	from sklearn.metrics import accuracy_score, confusion_matrix

	def print_stats(y_true, y_pred):
	    print(f'Total accuracy: {accuracy_score(y_true, y_pred)}')
	    print()
	    # Rows are the true labels, columns are the predictions.
	    cm = pd.DataFrame(confusion_matrix(y_true, y_pred),
	                      columns=pd.MultiIndex.from_arrays([['not_fit', 'fit']], names=['My fit']),
	                      index=['not_fit', 'fit'],
	                      )
	    cm.index.name = 'Toloka fit'
	    print('Confusion matrix')
	    print(cm)

Alexander Morgun amorgun