Public gists by Olivier Grisel (ogrisel)
distributed.worker - WARNING - Worker at 72 percent memory usage. Trigger GC. Process memory: 723.75 MB -- Worker memory limit: 1000.00 MB
distributed.worker - WARNING - Worker at 66 percent memory usage. After GC. Process memory: 660.93 MB -- Worker memory limit: 1000.00 MB
distributed.worker - WARNING - Worker at 73 percent memory usage. Trigger GC. Process memory: 732.79 MB -- Worker memory limit: 1000.00 MB
distributed.worker - WARNING - Worker at 73 percent memory usage. After GC. Process memory: 732.79 MB -- Worker memory limit: 1000.00 MB
distributed.core - WARNING - Event loop was unresponsive for 1.01s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.worker - WARNING - Worker at 70 percent memory usage. Trigger GC. Process memory: 705.26 MB -- Worker memory limit: 1000.00 MB
distributed.worker - WARNING - Worker at 67 percent memory usage. After GC. Process memory: 670.00 MB -- Worker memory limit: 1000.00 MB
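The warnings above come from distributed's worker memory monitor: once process memory crosses a fraction of the configured worker memory limit, the worker triggers a garbage collection pass and logs usage before and after. A minimal sketch of that threshold check (the 70% fraction here is illustrative, not necessarily distributed's exact default):

```python
def should_trigger_gc(process_memory_mb, memory_limit_mb, threshold=0.70):
    """Return True when memory usage reaches the GC-trigger fraction."""
    return process_memory_mb / memory_limit_mb >= threshold

# Values taken from the log lines above.
print(should_trigger_gc(723.75, 1000.0))  # 72% usage -> True
print(should_trigger_gc(660.93, 1000.0))  # 66% after GC -> False
```

In a real deployment the limit itself is configured per worker, e.g. via the `memory_limit` argument when creating a distributed `LocalCluster`.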
@ogrisel
ogrisel / mean_target_encoding.py
Last active July 7, 2018 04:31
Mean target value encoding for categorical variable using dask (take 2)
import os
import os.path as op
from time import time
import dask.dataframe as ddf
import dask.array as da
from distributed import Client
def make_categorical_data(n_samples=int(1e7), n_features=10, n_partitions=100):
    """Generate some random categorical data"""
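The gist preview is cut off right after the docstring above. As a plain-Python illustration of the technique the gist implements with dask, here is a minimal sketch of mean target encoding (the function name is mine, not the gist's): each category is replaced by the mean of the target variable over the rows belonging to it.

```python
from collections import defaultdict

def mean_target_encode(categories, targets):
    """Replace each category by the mean target value observed for it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, y in zip(categories, targets):
        sums[c] += y
        counts[c] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    return [means[c] for c in categories]

print(mean_target_encode(["r", "g", "r", "g", "b"], [1.0, 0.0, 0.0, 1.0, 1.0]))
# [0.5, 0.5, 0.5, 0.5, 1.0]
```

The dask version in the gist computes the same per-category means out of core, over partitioned dataframes, but the encoding logic is the same.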
@ogrisel
ogrisel / mean_target_encoding.py
Last active September 29, 2017 15:05
Mean target value encoding for categorical variable using dask
#
# XXX: do not use this code, it's broken!
# Use: https://gist.github.com/ogrisel/b6a97ed87939e3b559568ac2f6599cba
#
# See comments.
import os
import os.path as op
from time import time
import dask.dataframe as ddf
@ogrisel
ogrisel / .gitignore
Last active August 30, 2017 12:00
roofline analysis
__pycache__
*.json
*.png
_________________ TestsProcessPoolLokyExecutor.test_max_depth __________________
self = <tests.test_process_executor_loky.TestsProcessPoolLokyExecutor instance at 0x10392ed88>

    def test_max_depth(self):
        from loky.process_executor import MAX_DEPTH
        if self.context.get_start_method() == 'fork':
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_low_rank_matrix
from sklearn.linear_model import lasso_path
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import scale
from time import time
from rpy2 import robjects
import rpy2.robjects.packages as rpackages
from joblib import Parallel, delayed, parallel_backend
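These imports set up a benchmark of scikit-learn's lasso_path against R's glmnet through rpy2. Leaving rpy2 aside, a minimal self-contained call to lasso_path alone looks like the following sketch (data sizes are arbitrary, not the benchmark's):

```python
import numpy as np
from sklearn.datasets import make_low_rank_matrix
from sklearn.linear_model import lasso_path

rng = np.random.RandomState(0)
X = make_low_rank_matrix(n_samples=100, n_features=20, random_state=0)
y = X @ rng.randn(20)

# Compute the Lasso coefficients along a grid of 50 regularization strengths.
alphas, coefs, _ = lasso_path(X, y, n_alphas=50)
print(alphas.shape, coefs.shape)  # (50,) (20, 50)
```

lasso_path returns the full regularization path at once, which is what makes it a natural counterpart to glmnet in a path-fitting benchmark.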
from loky.reusable_executor import get_reusable_executor
from multiprocessing import Pool
from time import sleep, time
from itertools import repeat
import os
n_workers = 4
n_iter = int(1e2)
delay = 1e-6
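With these parameters the snippet presumably times how fast each backend can dispatch n_iter near-instant tasks to n_workers processes. loky's get_reusable_executor follows the concurrent.futures Executor API, so a rough version of such a loop, written here against the standard library's ProcessPoolExecutor so it runs without loky, is:

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import repeat
from time import sleep, time

n_workers = 4
n_iter = int(1e2)
delay = 1e-6

def work(d):
    # Near-instant task: the benchmark measures dispatch overhead, not work.
    sleep(d)
    return d

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=n_workers) as executor:
        tic = time()
        results = list(executor.map(work, repeat(delay, n_iter)))
        print("dispatched %d tasks in %.3fs" % (len(results), time() - tic))
```

The point of loky's reusable executor is that repeated calls to get_reusable_executor reuse the same worker processes, avoiding the startup cost this sketch pays on every `with` block.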
iter: 0 | TL: 2.694 | VL: 1.504 | Vacc: 0.492 | Ratio: 1.79 | Time: 158.2
iter: 1 | TL: 2.222 | VL: 1.117 | Vacc: 0.618 | Ratio: 1.99 | Time: 157.5
iter: 2 | TL: 1.999 | VL: 1.075 | Vacc: 0.65 | Ratio: 1.86 | Time: 157.1
iter: 3 | TL: 1.85 | VL: 0.882 | Vacc: 0.701 | Ratio: 2.1 | Time: 157.4
iter: 4 | TL: 1.725 | VL: 0.74 | Vacc: 0.755 | Ratio: 2.33 | Time: 157.2
iter: 5 | TL: 1.624 | VL: 0.677 | Vacc: 0.775 | Ratio: 2.4 | Time: 157.2
iter: 6 | TL: 1.549 | VL: 0.674 | Vacc: 0.773 | Ratio: 2.3 | Time: 157.2
iter: 7 | TL: 1.474 | VL: 0.679 | Vacc: 0.774 | Ratio: 2.17 | Time: 157.2
iter: 8 | TL: 1.422 | VL: 0.533 | Vacc: 0.822 | Ratio: 2.67 | Time: 157.3
iter: 9 | TL: 1.368 | VL: 0.513 | Vacc: 0.834 | Ratio: 2.67 | Time: 157.3
@ogrisel
ogrisel / spacy_openmp_and_multiprocessing.py
Created May 22, 2016 13:38
This script highlights that using multiprocessing in conjunction with OpenMP can cause the OpenMP runtime to crash, as documented in https://pythonhosted.org/joblib/parallel.html#bad-interaction-of-multiprocessing-and-third-party-libraries
from joblib import Parallel, delayed
import spacy
en_nlp = spacy.load('en')
texts = [u'Here is a sentence.'] * 100
def do_openmp_stuff(texts):
    list(en_nlp.pipe(texts, n_threads=4, batch_size=10))
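The preview cuts off here; the rest of the gist presumably invokes do_openmp_stuff from forked joblib worker processes to provoke the crash, since fork()ed children inherit a half-initialized OpenMP runtime from the parent. A spacy-free sketch of the usual mitigation, forcing the "spawn" start method so each child starts a fresh runtime (threaded_work is an illustrative stand-in, not from the gist):

```python
import multiprocessing as mp

def threaded_work(x):
    # Stand-in for a function that drives an internal thread pool,
    # like en_nlp.pipe(..., n_threads=4) in the gist above.
    return x * x

if __name__ == "__main__":
    # "spawn" starts children from a fresh interpreter instead of fork(),
    # so no partially initialized OpenMP state is inherited.
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(threaded_work, range(4)))  # [0, 1, 4, 9]
```

joblib's later loky backend takes the same approach, spawning fresh worker processes rather than forking, precisely to sidestep this class of crash.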