package com.mansur.scalding
import com.twitter.scalding._
import org.apache.lucene.search.spell._
import org.apache.mahout.common.distance.TanimotoDistanceMeasure
import org.apache.mahout.math.DenseVector
import org.apache.commons.math.util.MathUtils
/**
@MLnick
MLnick / MovieSimilarities.scala
Created April 1, 2013 17:49
Movie Similarities with Spark
import spark.SparkContext
import SparkContext._
/**
* A port of [[http://blog.echen.me/2012/02/09/movie-recommendations-and-more-via-mapreduce-and-scalding/]]
* to Spark.
* Uses movie ratings data from MovieLens 100k dataset found at [[http://www.grouplens.org/node/73]]
*/
object MovieSimilarities {
@mathias-brandewinder
mathias-brandewinder / gist:5558573
Last active August 12, 2024 14:10
Stub for F# Machine Learning Dojo
// This F# dojo is directly inspired by the
// Digit Recognizer competition from Kaggle.com:
// http://www.kaggle.com/c/digit-recognizer
// The datasets below are simply shorter versions of
// the training dataset from Kaggle.
// The goal of the dojo will be to
// create a classifier that uses training data
// to recognize hand-written digits, and
// evaluate the quality of our classifier
This is how we create a range of Ints that includes its start and excludes its end:
val ie = 0 until 500
//1. Fill in the missing item to create a range of Ints from 1 to 100 inclusive
val ints = 1 ??? 100
//2. Find the sum of the integers in this range
import java.util.Currency
object Fx {
type CcyPair = (Currency, Currency)
case class FxRate(from: Currency, to: Currency, rate: BigDecimal) {
def pair: CcyPair = from → to
def unary_~ = FxRate(to, from, 1 / rate)
def *(that: FxRate): FxRate = {
require(this.to == that.from)
FxRate(this.from, that.to, this.rate * that.rate)
@ashrithr
ashrithr / kafka.md
Last active March 14, 2024 21:16
kafka introduction

Introduction to Kafka

Kafka acts as a kind of write-ahead log (WAL): it records messages to a persistent store (disk) and lets subscribers read those changes and apply them to their own stores on whatever schedule suits each system.

Terminology:

  • Producers send messages to brokers
  • Consumers read messages from brokers
  • Messages are sent to a topic
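
The terminology above can be illustrated with a toy in-memory model. This is only a sketch and does not use the Kafka client API: `ToyBroker` and `ToyConsumer` are hypothetical names, and the model assumes one append-only log per topic with no partitions, persistence, or replication. It shows a producer appending to a topic and each consumer reading from its own offset at its own pace.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a broker: one append-only log per topic. Not the Kafka API. */
class ToyBroker {
    private final Map<String, List<String>> logs = new HashMap<>();

    /** Producer side: append a message to the topic's log and return its offset. */
    public synchronized long send(String topic, String message) {
        List<String> log = logs.computeIfAbsent(topic, t -> new ArrayList<>());
        log.add(message);
        return log.size() - 1;
    }

    /** Consumer side: read up to maxMessages starting at offset. Reading never removes messages. */
    public synchronized List<String> poll(String topic, long offset, int maxMessages) {
        List<String> log = logs.getOrDefault(topic, Collections.emptyList());
        int from = (int) Math.min(Math.max(offset, 0), log.size());
        int to = (int) Math.min((long) from + Math.max(maxMessages, 0), log.size());
        return new ArrayList<>(log.subList(from, to));
    }
}

/** Hypothetical consumer that tracks its own position in a single topic. */
class ToyConsumer {
    private final ToyBroker broker;
    private final String topic;
    private long offset = 0;

    ToyConsumer(ToyBroker broker, String topic) {
        this.broker = broker;
        this.topic = topic;
    }

    /** Fetch the next batch and advance this consumer's offset past it. */
    public List<String> poll(int maxMessages) {
        List<String> batch = broker.poll(topic, offset, maxMessages);
        offset += batch.size();
        return batch;
    }
}
```

Because the log is only appended to and each consumer keeps its own offset, two `ToyConsumer` instances on the same topic see the same messages independently, which is the property the write-ahead-log comparison relies on.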
@krishnanraman
krishnanraman / gist:5855410
Last active December 18, 2015 22:38
ServiceDependencies: Calculate dependencies between the various services that talk to zipkin. This job produces the data that powers http://bmd-linux:8080/radial.html (check it out, it's pretty cool)
package com.twitter.observability.analytics.jobs
import com.twitter.pluck.job.TwitterDateJob
import com.twitter.scalding._
import com.twitter.observability.analytics._
import com.twitter.zipkin.common.{Service, DependencyLink, Dependencies, Span}
import com.twitter.zipkin.conversions.thrift._
import com.twitter.algebird.Moments
import java.net.InetSocketAddress
@tfnico
tfnico / Something.java
Created July 4, 2013 09:49
Sending an HTTP request with Java + Google Guava
public void sendMessage(String url, String params){
final HttpURLConnection connection;
try {
URL requestUrl = new URL(url);
connection = (HttpURLConnection) requestUrl.openConnection();
connection.setDoOutput(true);
connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
connection.setRequestProperty("Content-Length", Integer.toString(params.getBytes().length));
connection.setRequestProperty("Content-Language", "en-US");
@krishnanraman
krishnanraman / zipcodes of wealthy elite
Last active December 20, 2015 06:09
Where do the WEALTHY WELL EDUCATED ELITE live ?
/*
Goal: Use Scalding to datamine the 2010 US Census data (kindly provided by @ElonAzoulay & @hmason), to find
Where do the WEALTHY WELL EDUCATED ELITE live ?
WEALTHY == house value quarter million, household income 150k
WELL EDUCATED == sort by edu, edu = (10 * Phd + 5 * MS + 1 * BS) score
*/
import com.twitter.scalding._
import cascading.tuple.Fields
import cascading.tap.SinkMode