mathieu mdespriee

CTO @sencrop Startuper (x4), Octo Alumni. Love tech challenges, scalability, data. Favourite OSS: spark, mxnet, kafka

6 followers · 7 following

Sencrop
Lille
@mdespriee

mdespriee / check_user_privileges_on_redshift.sql

Last active December 11, 2015 09:55

	select tablename,
	HAS_TABLE_PRIVILEGE(tablename, 'select') as select,
	HAS_TABLE_PRIVILEGE(tablename, 'insert') as insert,
	HAS_TABLE_PRIVILEGE(tablename, 'update') as update,
	HAS_TABLE_PRIVILEGE(tablename, 'delete') as delete,
	HAS_TABLE_PRIVILEGE(tablename, 'references') as references
	from pg_tables where schemaname='public' order by tablename;

mdespriee / sbt opts

Created December 11, 2015 14:18

export SBT_OPTS="-Xmx4G -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=2G -Xss2M"

mdespriee / string_normalization.py

Created December 23, 2015 13:15

Unicode string normalization

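	import re
	import unicodedata

	# Assumed definitions: the snippet uses these two patterns without showing them.
	_regexAlpha = re.compile(r"[^0-9A-Za-z]")  # anything that is not a letter or digit
	_regexSpace = re.compile(r"\s+")           # runs of whitespace

	def normalize(s):
	    """Expects a unicode string, not an encoded byte string.
	    Returns a unicode string.
	    """
	    # Decompose characters, then drop the combining marks (accents)
	    out = ''.join(c for c in unicodedata.normalize("NFKD", s)
	                  if not unicodedata.combining(c))
	    out = _regexAlpha.sub(' ', out)
	    out = _regexSpace.sub(' ', out)
	    out = out.strip().upper()
	    return out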
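
To see why the NFKD step followed by the combining-character filter strips accents, here is a minimal, self-contained sketch. The helper names `strip_accents` and `describe` are illustrative and not part of the gist; only `unicodedata.normalize` and `unicodedata.combining` come from the snippet above.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters with NFKD, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def describe(text):
    """List the code points NFKD produces and flag the combining marks."""
    return [
        (f"U+{ord(c):04X}", bool(unicodedata.combining(c)))
        for c in unicodedata.normalize("NFKD", text)
    ]
```

For example, `describe("\u00e9")` shows that the precomposed "é" becomes a plain "e" followed by a combining acute accent, which is the character the filter removes.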

mdespriee / random_crypto_string.sh

Last active August 29, 2016 14:06

Create random cryptographic string

cat /dev/urandom | tr -dc '[:alnum:]' | fold -w 64 | head -n 1
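
Where a shell is not available, a comparable result can be sketched in Python with the standard `secrets` module, which draws from the operating system's secure random source. The function name `random_token` and the `ALPHABET` constant are illustrative, not part of the gist; the default length of 64 mirrors `fold -w 64`.

```python
import secrets
import string

# Same character set as tr's [:alnum:] in the C locale: letters and digits.
ALPHABET = string.ascii_letters + string.digits


def random_token(length=64):
    """Return a random alphanumeric string of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Calling `random_token()` returns a different 64-character string on every call.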

mdespriee / reboot wireless

Created August 31, 2016 13:34

Restart Ubuntu wireless without rebooting the machine

	# grab the kernel module name
	lshw -C network 2>&1 | grep wireless | grep driver

	# unload and reload the module (ath9k here; use the name found above)
	sudo modprobe -r ath9k && sudo modprobe ath9k

mdespriee / stream_to_hdfs.sh

Created March 16, 2017 11:29

Stream data in and out of HDFS through an edge node


	ssh edge_node "hdfs dfs -cat /some/path/part-*" | cat > file

	cat file | ssh edge_node "hdfs dfs -put - /target/path"


	# consider using a named pipe (mkfifo) to stream application output directly
	rm -f stream
	mkfifo stream
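
The named-pipe idea can be sketched in Python on a POSIX system: one side writes into the FIFO while another reads from it, so no intermediate file is stored. This is only an illustration under assumptions: the function name `stream_through_fifo` is hypothetical, and the reader here is a local loop standing in for whatever process would consume the pipe.

```python
import os
import tempfile
import threading


def stream_through_fifo(lines):
    """Send lines through a named pipe and return what the reader receives."""
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, "stream")
    os.mkfifo(fifo_path)  # same effect as `mkfifo stream`

    def producer():
        # Opening for write blocks until a reader opens the other end.
        with open(fifo_path, "w") as fifo:
            for line in lines:
                fifo.write(line + "\n")

    writer = threading.Thread(target=producer)
    writer.start()
    try:
        # The reader sees end-of-file once the producer closes the pipe.
        with open(fifo_path) as fifo:
            received = [line.rstrip("\n") for line in fifo]
    finally:
        writer.join()
        os.remove(fifo_path)
        os.rmdir(workdir)
    return received
```

The data never lands on disk as a regular file: the pipe only hands bytes from the writer to the reader, which is what makes it useful for feeding application output straight into an upload command.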

mdespriee / LDAIncrementalExample.scala

Created June 29, 2017 19:13

Example of how to build LDA incrementally in Spark, with comparison to one-shot learning.


	// This code is related to PR https://github.com/apache/spark/pull/17461
	// I show how to use the setInitialModel() param of LDA to build a model incrementally,
	// and I compare the performance (perplexity) with a model built in one-shot


	import scala.collection.mutable

	import org.apache.spark.ml.{Pipeline, PipelineModel}
	import org.apache.spark.ml.clustering.{LDA, LDAModel}

mdespriee / README-Template.md

Created January 12, 2018 13:55 — forked from PurpleBooth/README-Template.md

A template to make good README.md

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

mdespriee / SPARK_23220_SimpleTest.scala

Last active January 26, 2018 23:36

Aims to reproduce SPARK-23220: a broadcast join is transformed into a SortMergeJoin

	package ssp

	import java.nio.charset.Charset
	import java.nio.file.{Files, Paths}

	import org.apache.spark.sql.functions.broadcast
	import org.apache.spark.sql.streaming.{OutputMode, Trigger}
	import org.apache.spark.sql.types.{StringType, StructField, StructType}
	import org.apache.spark.sql.{Dataset, SparkSession}
	import org.apache.spark.storage.StorageLevel

mdespriee / SymbolRandomAPIBase.scala

Last active November 5, 2018 21:08

generated code - from PR https://github.com/apache/incubator-mxnet/pull/13039

	/*
	* Licensed to the Apache Software Foundation (ASF) under one or more
	* contributor license agreements. See the NOTICE file distributed with
	* this work for additional information regarding copyright ownership.
	* The ASF licenses this file to You under the Apache License, Version 2.0
	* (the "License"); you may not use this file except in compliance with
	* the License. You may obtain a copy of the License at
	*
	* http://www.apache.org/licenses/LICENSE-2.0
	*
