Mindaugas Zickus mindis

  • Marks and Spencer
  • London, UK
  • X @MindisZ
@mindis
mindis / avazu_ftrl_concurrent.go
Created October 12, 2018 14:19 — forked from ceshine/avazu_ftrl_concurrent.go
Kaggle Avazu Challenge: FTRL-Proximal with L1 & L2 implemented in Go (Concurrent/Multi-threaded)
// Based on tinrtgu's Python script here:
// https://www.kaggle.com/c/avazu-ctr-prediction/forums/t/10927/beat-the-benchmark-with-less-than-1mb-of-memory
package main
import (
"encoding/csv"
"os"
"strconv"
"hash/fnv"
"math"
mindis / GAME.avsc
Created October 5, 2018 13:38 — forked from wjohnson/GAME.avsc
{
"type" : "record",
"name" : "TrainingExample",
"namespace" : "com.linkedin.metronome.avro.generated",
"fields" : [ {
"name" : "uid",
"type" : [ "null", "string", "long", "int" ],
"doc" : "a unique id for the training event",
"default" : null
}, {
#!/usr/bin/env python
'''Crop an image to just the portions containing text.
Usage:
./crop_morphology.py path/to/image.jpg
This will place the cropped image in path/to/image.crop.png.
For details on the methodology, see
#install pylucene from http://lucene.apache.org/pylucene/
import sys
import lucene
import os
from java.io import File
from java.nio.file import Paths
from itertools import izip
from lucene import JavaError
mindis / gist:676e0713c5f5a9a8f181d79ce5f81d01
Created November 4, 2016 14:30 — forked from dnbaker/gist:760d2cd79ae0fc0a67e8
A collection of links for streaming algorithms and data structures
  1. General Background and Overview
- word2vec https://arxiv.org/abs/1310.4546
- sentence2vec, paragraph2vec, doc2vec http://arxiv.org/abs/1405.4053
- tweet2vec http://arxiv.org/abs/1605.03481
- tweet2vec https://arxiv.org/abs/1607.07514
- author2vec http://dl.acm.org/citation.cfm?id=2889382
- item2vec http://arxiv.org/abs/1603.04259
- lda2vec https://arxiv.org/abs/1605.02019
- illustration2vec http://dl.acm.org/citation.cfm?id=2820907
- tag2vec http://ktsaurabh.weebly.com/uploads/3/1/7/8/31783965/distributed_representations_for_content-based_and_personalized_tag_recommendation.pdf
- category2vec http://www.anlp.jp/proceedings/annual_meeting/2015/pdf_dir/C4-3.pdf
mindis / zstdtest2.cpp
Created September 21, 2016 16:51 — forked from Lazin/zstdtest2.cpp
Zstandard test (block compression)
#include "storage_engine/compression.h"
#include "perftest_tools.h"
#define ZSTD_STATIC_LINKING_ONLY
#include <zstd.h>
#include <iostream>
#include <cstdlib>
#include <algorithm>
#include <zlib.h>
mindis / BitSetBenchmark.java
Created June 20, 2016 16:15 — forked from cpatni/BitSetBenchmark.java
Caliper Benchmarks for Analytics using Bitmaps
package blog;
import com.google.caliper.SimpleBenchmark;
import java.util.BitSet;
public class BitSetBenchmark extends SimpleBenchmark{
private BitSet bitSet;
@Override
protected void setUp() {
mindis / useful_pandas_snippets.py
Created January 14, 2016 17:45 — forked from bsweger/useful_pandas_snippets.md
Useful Pandas Snippets
#List unique values in a DataFrame column
df['column_name'].unique()
#Convert Series datatype to numeric; non-numeric values become NaN
df['col'] = pd.to_numeric(df['col'], errors='coerce')
#Grab DataFrame rows where column has certain values
value_list = ['value1', 'value2', 'value3']
df = df[df.column.isin(value_list)]
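
The snippets above assume an existing DataFrame named df. A minimal runnable sketch of the same three operations follows; the sample data and its column names are made up for illustration and are not part of the original gist.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "column": ["value1", "other", "value3", "value1"],
    "col": ["1", "2.5", "abc", "4"],
})

# List unique values in a DataFrame column (order of first appearance).
unique_values = df["column"].unique()

# Convert a Series to numeric; entries that cannot be parsed become NaN.
df["col"] = pd.to_numeric(df["col"], errors="coerce")

# Keep only the rows whose column value appears in value_list.
value_list = ["value1", "value2", "value3"]
filtered = df[df["column"].isin(value_list)]
```

Here pd.to_numeric with errors="coerce" stands in for the older convert_objects call, which current pandas releases no longer provide.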