@robertsdionne
robertsdionne / deepdream-install.md
Last active February 15, 2021 16:07
Deepdream installation
#!/usr/bin/env bash

# Assuming OS X Yosemite 10.10.4

# Install XCode and command line tools
# See https://itunes.apple.com/us/app/xcode/id497799835?mt=12#
# See https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/xcode-select.1.html
xcode-select --install
@tbertelsen
tbertelsen / pearson.scala
Last active February 28, 2021 22:14 — forked from kaja47/pearson.scala
Calculating pearson for Breeze vectors
import breeze.linalg._
import breeze.stats._
import scala.math.sqrt
/**
* Efficient for sparse vectors. Scales in O(activeSize)
*/
// Must take SparseVector, for implicits to be linked correctly
def pearson(a: SparseVector[Double], b: SparseVector[Double]): Double = {
@kaja47
kaja47 / csfdsim.scala
Created December 30, 2014 04:58
How to compute similar movies from CSFD data in 10 minutes and find love of your life
import breeze.linalg._
import breeze.stats
import breeze.numerics._
val dataFile = new File(???)
val userItems: Array[SparseVector[Double]] = loaderUserItemsWithRatings(dataFile, """[ ,:]""".r)
val itemUsers: Array[SparseVector[Double]] = transpose(userItems) map { vec => normalize(vec, 2) }
// weights
val N = DenseVector.fill[Double](itemIndex.size)(userIndex.size) // vector in which the total number of users is repeated
@pjankiewicz
pjankiewicz / gist:8ab7094d263bf0d4cfb8
Last active September 9, 2016 03:04
kaggle vazu
'''
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004
Copyright (C) 2004 Sam Hocevar <[email protected]>
Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.
@azymnis
azymnis / KMeansJob.scala
Created October 23, 2014 23:07
K-Means in scalding
import com.twitter.algebird.{Aggregator, Semigroup}
import com.twitter.scalding._
import scala.util.Random
/**
* This job is a tutorial of sorts for scalding's Execution[T] abstraction.
* It is a simple implementation of Lloyd's algorithm for k-means on 2D data.
*
* http://en.wikipedia.org/wiki/K-means_clustering
@stucchio
stucchio / monte_carlo_compare_theory_to_practice.py
Created June 17, 2014 13:34
Code to make other graph in equal weights post
from pylab import *
from numpy.random import dirichlet, rand, binomial, uniform, normal
def _unit_weight(dim):
    return ones(dim) / float(dim)
ONE_FRAC = 0.5
SQRT_TWO_INV = 1.0 / sqrt(2.0)
def _feature_vec(dim, method="bernoulli"):
    if method == "bernoulli":
@kaja47
kaja47 / svd-img.scala
Created May 13, 2014 21:57
Visualization of truncated SVD
import breeze._
import breeze.linalg._
import breeze.numerics._
import java.awt.image.BufferedImage
import javax.imageio.ImageIO
val f = ???
val img = javax.imageio.ImageIO.read(new File(f))
val gray = new BufferedImage(img.getWidth, img.getHeight, BufferedImage.TYPE_BYTE_GRAY)
val g = gray.createGraphics()
@kaja47
kaja47 / gist:554f62c61f21b0420720
Created May 9, 2014 19:47
minhash vs. HyperLogLog
// min-hash
val fs: Vector[Int => Int] // hash functions
items map { it => fs map { f => f(it) } } fold (vectorPairwise(min), initialValue = Vector.fill(infinity))
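The min-hash fold above (apply every hash function to every item, then take the component-wise minimum) can be sketched in Python as follows. This is a minimal illustration and not the gist author's code: the helper names `make_hash_functions`, `minhash_signature`, and `estimate_jaccard`, the linear hash family `(a * x + b) mod prime`, and the choice of prime are all assumptions made for the example, and items are assumed to be non-negative integers smaller than the prime.

```python
import random


def make_hash_functions(k, seed=0, prime=2_147_483_647):
    """Build k hypothetical hash functions h(x) = (a * x + b) mod prime."""
    rng = random.Random(seed)
    params = [(rng.randrange(1, prime), rng.randrange(0, prime)) for _ in range(k)]
    # Bind a and b as default arguments so each lambda keeps its own pair.
    return [lambda x, a=a, b=b: (a * x + b) % prime for a, b in params]


def minhash_signature(items, hash_functions):
    """Fold the items into one vector: the minimum of each hash over all items."""
    signature = [float("inf")] * len(hash_functions)  # initial value of the fold
    for item in items:
        hashed = [h(item) for h in hash_functions]
        signature = [min(s, v) for s, v in zip(signature, hashed)]
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where two signatures agree."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    fs = make_hash_functions(k=128)
    a = minhash_signature(range(0, 1000), fs)
    b = minhash_signature(range(500, 1500), fs)
    # The true Jaccard similarity of these two sets is 500 / 1500.
    print(estimate_jaccard(a, b))
```

The fraction of matching signature positions estimates the Jaccard similarity of the two item sets, and the estimate tightens as the number of hash functions grows.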
// HyperLogLog
@jrudolph
jrudolph / JsonRejectionHandler.scala
Created March 6, 2014 11:29
Custom rejection handler that returns JSON
case class ErrorMessage(message: String, cause: String)
object ErrorMessage {
import spray.json.DefaultJsonProtocol._
implicit val errorFormat = jsonFormat2(ErrorMessage.apply)
}
import spray.httpx.SprayJsonSupport._
implicit val jsonRejectionHandler = RejectionHandler {
case MalformedRequestContentRejection(msg, cause) :: Nil =>
complete(StatusCodes.BadRequest, ErrorMessage("The request content was malformed", msg))
@debasishg
debasishg / gist:8172796
Last active June 8, 2025 08:30
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits some of the historical perspectives and how it all began with Flajolet.
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t