@azymnis
azymnis / ItemSimilarity.scala
Created December 13, 2013 05:17
Approximate item similarity using LSH in Scalding.
import com.twitter.scalding._
import com.twitter.algebird.{ MinHasher, MinHasher32, MinHashSignature }
/**
* Computes similar items (with a string itemId), based on approximate
* Jaccard similarity, using LSH.
*
* Assumes an input data TSV file of the following format:
*
* itemId userId
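The preview above stops before the job body. As a rough illustration of the idea the comment describes (approximating Jaccard similarity between the user sets of two items), here is a minimal MinHash sketch in plain Python. It is not the gist's Scalding/Algebird code: the function names, the use of `hashlib.blake2b` as the seeded hash, and the default of 64 hash functions are all assumptions made for this example.

```python
import hashlib


def minhash_signature(items, num_hashes=64):
    """Return a MinHash signature for a non-empty set of items.

    For each of num_hashes seeded hash functions, keep the smallest
    hash value seen over the set. (Hypothetical helper, not Algebird's API.)
    """
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a distinct salt gives a distinct hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(str(x).encode(), digest_size=8, salt=salt).digest(),
                "little",
            )
            for x in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; this estimates the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity |a & b| / |a | b|, for comparison."""
    return len(a & b) / len(a | b)
```

Each item would be represented by the set of userIds that interacted with it; two items whose signatures agree in many positions are likely to have a high Jaccard similarity. An LSH step would then group signatures into bands so that only items sharing a band are compared, which this sketch leaves out.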
@fawda123
fawda123 / nnet_plot_update.r
Last active May 11, 2022 00:20
nnet_plot_update
plot.nnet<-function(mod.in,nid=T,all.out=T,all.in=T,bias=T,wts.only=F,rel.rsc=5,
circle.cex=5,node.labs=T,var.labs=T,x.lab=NULL,y.lab=NULL,
line.stag=NULL,struct=NULL,cex.val=1,alpha.val=1,
circle.col='lightblue',pos.col='black',neg.col='grey',
bord.col='lightblue', max.sp = F,...){
require(scales)
#sanity checks
if('mlp' %in% class(mod.in)) warning('Bias layer not applicable for rsnns object')
@kaja47
kaja47 / combinations.scala
Last active December 27, 2015 15:49
Fast array combinations
// generate all k-element combinations of integers in the range from 0 to `len`-1
// fast as fuck
def combIdxs(len: Int, k: Int): Iterator[Array[Int]] = {
  val arr = Array.range(0, k)
  arr(k-1) -= 1
  val end = k-1
  Iterator.continually {
    arr(end) += 1
    if (arr(end) >= len) {
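The preview above is cut off inside the carry step. The general technique it appears to use (keep an array of k indices, bump the last one, and carry leftwards when it overflows) can be sketched in Python as follows. This is an independent illustration, not a port of the truncated Scala; the name `comb_idxs` and the choice to yield a fresh list each time are assumptions of this example.

```python
def comb_idxs(n, k):
    """Yield every k-combination of range(n) as a list of indices, in lexicographic order."""
    if k < 0 or k > n:
        return
    idx = list(range(k))  # first combination: 0, 1, ..., k-1
    while True:
        yield list(idx)  # copy, so callers can keep the result safely
        # Find the rightmost position that has not reached its maximum value.
        i = k - 1
        while i >= 0 and idx[i] == n - k + i:
            i -= 1
        if i < 0:
            return  # every position is at its maximum: done
        idx[i] += 1
        # Reset everything to the right to the smallest valid values.
        for j in range(i + 1, k):
            idx[j] = idx[j - 1] + 1
```

For example, `list(comb_idxs(4, 2))` walks through `[0, 1]`, `[0, 2]`, `[0, 3]`, `[1, 2]`, `[1, 3]`, `[2, 3]`. Yielding a copy costs an allocation per combination; reusing one mutable array is faster but requires callers not to hold on to it.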
@piotrbelina
piotrbelina / BoomerangLogJob.scala
Created August 3, 2013 10:12
Scalding apache log parser for boomerang.js
import cascading.tuple.{Fields, TupleEntry}
import com.twitter.scalding._
import java.net.URLDecoder
import scala.util.matching.Regex
class BoomerangLogJob(args: Args) extends Job(args) {
  val input = TextLine(args("input"))
  val output = TextLine(args("output"))
  val trap = Tsv(args("trap"))
@bmarcot
bmarcot / knapsack_problem.scala
Last active June 28, 2017 13:41
The Knapsack Problem, in Scala -- Keywords: dynamic programming, recursion, scala.
def knapsack_aux(x: (Int, Int), is: List[Int]): List[Int] = {
  for {
    w <- is.zip(is.take(x._1) ::: is.take(is.size - x._1).map(_ + x._2))
  } yield math.max(w._1, w._2)
}
def knapsack_rec(xs: List[(Int, Int)], is: List[Int]): List[List[Int]] = {
  xs match {
    case x :: xs => knapsack_aux(x, is) :: knapsack_rec(xs, knapsack_aux(x, is))
    case _ => Nil
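The Scala above builds one dynamic-programming row per item, where each row holds the best value for every capacity. A compact Python sketch of the same 0/1 knapsack recurrence is shown here for comparison. It assumes each item is a `(weight, value)` pair with a non-negative integer weight, which is how the tuple in `knapsack_aux` appears to be used; the function name and the single-row layout are choices made for this example rather than anything in the gist.

```python
def knapsack(items, capacity):
    """Best total value with total weight <= capacity, each item used at most once.

    items: list of (weight, value) pairs with non-negative integer weights.
    """
    best = [0] * (capacity + 1)  # best[c] = best value achievable with capacity c
    for weight, value in items:
        # Iterate capacities downwards so each item is counted at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]
```

With items `[(1, 1), (3, 4), (4, 5), (5, 7)]` and capacity 7, the best choice is the items of weight 3 and 4, for a total value of 9. Keeping only one row uses O(capacity) memory; keeping every row, as `knapsack_rec` does, also makes it possible to recover which items were chosen.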
@steipete
steipete / PSPDFUIKitMainThreadGuard.m
Last active May 27, 2024 12:11
This is a guard that tracks down UIKit access on threads other than main. This snippet is taken from the commercial iOS PDF framework http://pspdfkit.com, but relicensed under MIT. Works because a lot of calls internally call setNeedsDisplay or setNeedsLayout. Won't catch everything, but it's very lightweight and usually does the job. You might n…
// Taken from the commercial iOS PDF framework http://pspdfkit.com.
// Copyright (c) 2014 Peter Steinberger, PSPDFKit GmbH. All rights reserved.
// Licensed under MIT (http://opensource.org/licenses/MIT)
//
// You should only use this in debug builds. It doesn't use private API, but I wouldn't ship it.
// PLEASE DUPE rdar://27192338 (https://openradar.appspot.com/27192338) if you would like to see this in UIKit.
#import <objc/runtime.h>
#import <objc/message.h>
@Yangqing
Yangqing / mr_compute_gist.py
Created May 17, 2013 00:05
The mapreduce code to extract gist features from ImageNet images. To be used together with mincepie.
from mincepie import mapreducer, launcher
import gflags
import glob
import leargist
import numpy as np
import os
from PIL import Image
import uuid
# constant value
You can use cURL to upload packet captures to Packetloop. We created a simple script that shows how to log in, list capture points, create capture points, upload captures, and check processing status.
## variables
PL_ENDPOINT=https://www.packetloop.com
PL_USERNAME=... # your packetloop email address
PL_PASSWORD=... # your packetloop password
## logging in
PL_TOKEN=$(curl -3 -s -b cookies.jar -c cookies.jar -X GET "$PL_ENDPOINT/init")
curl -3 -s -H "X-CSRF-Token: $PL_TOKEN" -H "Content-Type: application/json" -H "Accept: application/json" -b cookies.jar -c cookies.jar -X POST "$PL_ENDPOINT/users/sign_in.json?pretty=true" -d "{ \"user\": { \"email\": \"$PL_USERNAME\", \"password\": \"$PL_PASSWORD\" } }"
@codeinthehole
codeinthehole / run.py
Created November 21, 2012 13:46
Sample Celery chain usage for processing pipeline
from celery import chain
from django.core.management.base import BaseCommand
from . import tasks
class Command(BaseCommand):
    def handle(self, *args, **kwargs):
@dsparks
dsparks / distance_matrix.R
Created September 18, 2012 23:15
Calculating distances (including between matrices)
# Cross-matrix distances and different measurement options, with "proxy"
doInstall <- TRUE # Change to FALSE if you don't want packages installed.
toInstall <- c("proxy", "MASS", "Zelig")
if(doInstall){install.packages(toInstall, repos = "http://cran.us.r-project.org")}
lapply(toInstall, library, character.only = TRUE)
# Invent two data frames
voterIdealPoints <- data.frame(matrix(rnorm(26*2), ncol = 2))
rownames(voterIdealPoints) <- letters