P. Oscar Boykin johnynek

@johnynek
johnynek / abstract_join.scala
Last active August 29, 2015 14:10
Abstracting map/reduce joins.
/**
* @avibryant and I have been interested in extracting as much of scalding out into Algebird,
* so that it is portable across many execution systems, but how to model joins?
*
* In the FP world, Applicative[M] is a typeclass that gives you both Functor[M] (which provides map)
* and in addition join:
*/
trait Functor[M[_]] {
  // law: map(map(a)(f))(g) == map(a)(f.andThen(g))
  def map[T, U](init: M[T])(fn: T => U): M[U]
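The preview is cut off here. As a minimal sketch of how a typeclass of this shape can be instantiated and its law spot-checked, the following restates the trait with the element type named `T` and adds a `List` instance. The names `FunctorSketch`, `listFunctor`, and `lawHolds` are hypothetical and are not taken from the gist.

```scala
object FunctorSketch {
  trait Functor[M[_]] {
    // law: map(map(a)(f))(g) == map(a)(f.andThen(g))
    def map[T, U](init: M[T])(fn: T => U): M[U]
  }

  // Hypothetical instance for List, for illustration only.
  val listFunctor: Functor[List] = new Functor[List] {
    def map[T, U](init: List[T])(fn: T => U): List[U] = init.map(fn)
  }

  // Checks the composition law on one concrete input.
  // This is a spot check, not a proof that the law holds in general.
  def lawHolds[M[_], A, B, C](F: Functor[M])(a: M[A])(f: A => B, g: B => C): Boolean =
    F.map(F.map(a)(f))(g) == F.map(a)(f.andThen(g))
}
```

For example, `FunctorSketch.lawHolds(FunctorSketch.listFunctor)(List(1, 2, 3))((x: Int) => x + 1, (y: Int) => y.toString)` evaluates the law on a three-element list.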
johnynek / Future.rs
Created November 13, 2014 04:20
Future with map and monadic bind in Rust.
use std::comm::{Receiver, channel};
use std::io;
use std::mem::replace;
use std::task::spawn;
struct Future<'a, A> {
state: FutureState<'a, A>
}
package mapreduce
/**
* This is an attempt to find a minimal set of type classes that describe the map-reduce programming model
* (the underlying model of Google map/reduce, Hadoop, Spark and others)
* The idea is to have:
* 1) lawful types that fully constrain correctness
* 2) a minimal set of laws (i.e. we can't remove any laws)
* 3) the ability to fully express existing map/reduce in terms of these types
*
johnynek / pomonoids.md
Last active August 29, 2015 14:05
Some notes about partially ordered Monoids

This question on MathOverflow: http://mathoverflow.net/questions/179390/standard-name-for-a-monoid-semigroup-with-ab-a-b?noredirect=1#comment449777_179390

led me to the definition of a partially ordered monoid, or pomonoid: http://books.google.com/books?id=ZO5Z-zZijDgC&pg=PA248&lpg=PA248&dq=pomonoid+definition&source=bl&ots=lX2PietcDd&sig=k79BcAh0s5Y8OGC12Kclmn4Tkgw&hl=en&sa=X&ei=1Mr8U6HGKYu6ogTL3oGYAQ&ved=0CCYQ6AEwATgK#v=onepage&q=pomonoid%20definition&f=false

Pomonoid: a monoid X with a partial order such that if x <= y, then zx <= zy and xz <= yz for all x, y, z in X. It is an Integral Pomonoid if x <= 1 for all x in X.

Lemma: If you have an Integral Pomonoid, then xy <= x and yx <= x for all x, y in X.
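The lemma follows in one step from the definitions. The derivation below assumes the order is compatible with multiplication on both sides; the right-hand version ($xz \le yz$) is what the second inequality needs.

```latex
\begin{align*}
y \le 1 &\implies xy \le x \cdot 1 = x && \text{(multiply on the left by } x\text{)} \\
y \le 1 &\implies yx \le 1 \cdot x = x && \text{(multiply on the right by } x\text{)}
\end{align*}
```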

johnynek / TypedDataCube.md
Last active August 29, 2015 14:04
How to do data cubing in typed scalding?

Suppose you have a key like (page, geo, day) and you want to make rollups/datacube so you can query for all pages, or all geos or all days.

Here is how you do it:

def opts[T](t: T): Seq[Option[T]] = Seq(Some(t), None)

val p: TypedPipe[(String, String, Int)] = ...

p.sumByLocalKeys
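The preview is truncated at this point, so the rest of the Scalding pipeline is not shown. As a rough illustration of the rollup idea only, here is a sketch that uses plain Scala collections in place of `TypedPipe`. The names `DataCubeSketch`, `Key`, `expand`, and `cube`, and the choice of `Long` counts as values, are assumptions made for this example.

```scala
object DataCubeSketch {
  // Each key component is either kept (Some) or rolled up (None).
  def opts[T](t: T): Seq[Option[T]] = Seq(Some(t), None)

  type Key = (Option[String], Option[String], Option[Int])

  // Expand one (page, geo, day) key into all 2^3 = 8 rollup keys.
  def expand(page: String, geo: String, day: Int): Seq[Key] =
    for {
      p <- opts(page)
      g <- opts(geo)
      d <- opts(day)
    } yield (p, g, d)

  // In-memory stand-in for a keyed sum over the expanded keys.
  def cube(rows: Seq[((String, String, Int), Long)]): Map[Key, Long] =
    rows
      .flatMap { case ((page, geo, day), n) => expand(page, geo, day).map(_ -> n) }
      .groupBy(_._1)
      .map { case (k, vs) => k -> vs.map(_._2).sum }
}
```

A lookup with `None` in a position then means "all values" for that dimension, so `(None, Some("us"), Some(1))` holds the total across all pages for that geo and day.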
johnynek / scalding_alice.scala
Created July 18, 2014 17:15
Learn Scalding with Alice
/**
git clone https://github.com/twitter/scalding.git
cd scalding
./sbt scalding-repl/console
*/
import scala.io.Source
val alice = Source.fromURL("http://www.gutenberg.org/files/11/11.txt").getLines
// Add the line numbers, which we might want later
val aliceLineNum = alice.zipWithIndex.toList
johnynek / 0_reuse_code.js
Created April 2, 2014 22:58
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console

Keybase proof

I hereby claim:

  • I am johnynek on github.
  • I am posco (https://keybase.io/posco) on keybase.
  • I have a public key whose fingerprint is 11C7 5CF0 8E84 D7FA EC33 1F78 7BE2 A709 A274 EDB0

To claim this, I am signing this object:

johnynek / sized_list.scala
Last active July 3, 2021 17:54
A simple example of how to write a linked list in Scala that knows its length at compile time. This allows you to write a zip method that is always exact.
// I was reading through these examples: http://apocalisp.wordpress.com/2010/06/08/type-level-programming-in-scala/
// and I thought it would be nice to more quickly get to something useful, to show the power of the techniques.
object SizedListExample {
// Type of all Non-negative integers
sealed trait Nat
// This is zero.
sealed trait _0 extends Nat
// Successor to some non-negative number
sealed trait Succ[N <: Nat] extends Nat
// Find the newest asserted_at for each combination of subject and property
.groupBy('subject, 'property) {
  _.sortedReverseTake[(String, Double)](('asserted_at, 'value) -> 'items, 1)
}
// Extract that newest asserted_at value
.flatten[(String, Double)](('items) -> ('asserted_at, 'value))
.discard('items)
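The fragment above groups rows by subject and property and keeps the single largest `(asserted_at, value)` pair in each group. The same idea can be sketched with plain Scala collections. The `Assertion` case class and the `newer` and `newest` helpers are hypothetical, and the sketch assumes `asserted_at` is a string whose lexicographic order matches chronological order (for example, an ISO-8601 timestamp).

```scala
object NewestAssertionSketch {
  // Hypothetical record type mirroring the field names in the fragment.
  final case class Assertion(subject: String, property: String, assertedAt: String, value: Double)

  // True when a sorts after b on (assertedAt, value), compared as a tuple.
  private def newer(a: Assertion, b: Assertion): Boolean =
    a.assertedAt > b.assertedAt || (a.assertedAt == b.assertedAt && a.value > b.value)

  // For each (subject, property), keep the row with the largest (assertedAt, value).
  def newest(rows: Seq[Assertion]): Map[(String, String), Assertion] =
    rows
      .groupBy(r => (r.subject, r.property))
      .map { case (k, rs) => k -> rs.reduce((a, b) => if (newer(b, a)) b else a) }
}
```

Unlike the pipeline version, this holds every group in memory, so it is only meant to show the grouping logic.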