Ideally I'd be able to write this in a single pass over the data, but as far as I know that isn't possible.

def separate[A, B](r: RDD[A \/ B]): (RDD[A], RDD[B]) = ???

I'd settle for a version where the As are dumped to a file and the Bs stay in the RDD. It's kind of like observeW from scalaz-stream.
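
For comparison, here is a minimal sketch of the straightforward two-pass version. It is not the single-pass solution asked for above: it caches the input and filters it twice. It assumes Spark's `RDD.cache` and the partial-function overload of `RDD.collect`, plus scalaz's `-\/` and `\/-` extractors; the `SeparateSketch` object name and the `ClassTag` bounds are additions for illustration, not part of the original signature.

```scala
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag
import scalaz.{\/, -\/, \/-}

object SeparateSketch {
  // Splits an RDD of disjunctions into its left and right values.
  // This makes two filtering passes over the (cached) input, so it does
  // not achieve the single pass the note above is asking about.
  def separate[A: ClassTag, B: ClassTag](r: RDD[A \/ B]): (RDD[A], RDD[B]) = {
    // Cache so the second pass can reuse the materialized partitions
    // instead of recomputing the upstream lineage.
    val cached = r.cache()
    val lefts  = cached.collect { case -\/(a) => a }
    val rights = cached.collect { case \/-(b) => b }
    (lefts, rights)
  }
}
```

The "dump the As to a file" variant could then call `saveAsTextFile` on the first element of the pair and keep working with the second; whether caching is affordable depends on the size of the data.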

@coltfred
coltfred / Catsflatten.scala
Created October 23, 2015 20:01
Cats flatten fail
scala> type ErrorOr[A] = Xor[Exception, A]
defined type alias ErrorOr
scala> def f[A](a:A): ErrorOr[A] = a.right
f: [A](a: A)ErrorOr[A]
scala> f(f("string")).flatten
res10: ErrorOr[String] = Right(string)
scala> "hello".right[Exception].right[Exception].flatten
coltfred / contrib.sh
Last active August 29, 2015 14:22 — forked from non/contrib.sh
#!/bin/sh
git log --numstat | awk '/^Author: /{author=$0} /^[0-9]+\t[0-9]+/{n = $1 + $2; d[author] += n; t += n} END { for(a in d) { printf("%6d %6.3f%% %s\n", d[a], d[a] * 100 / t, a)}}' | sort -rn
# written less illegibly, it is:
#
# git log --numstat | \
# awk '
# /^Author: /{author=$0}
# /^[0-9]+\t[0-9]+/{n = $1 + $2; d[author] += n; t += n}
# END { for(a in d) { printf("%6d %6.3f%% %s\n", d[a], d[a] * 100 / t, a)}}

Git DMZ Flow

I've been asked a few times over the last few months to put together a full write-up of the Git workflow we use at RichRelevance (and at Precog before), since I have referenced it in passing quite a few times in tweets and in person. The workflow is appreciably different from GitFlow and its derivatives, and thus it brings with it a different set of tradeoffs and optimizations. To that end, it would probably be helpful to go over exactly which workflow properties I find beneficial or even necessary.

  • Two developers working on independent features must never be blocked by each other
    • No code freeze! Ever! For any reason!
  • A developer must be able to base derivative work on another developer's work, without waiting for any third party
  • Two developers working on inter-dependent features (or even the same feature) must be able to do so without interference from (or interfering with) any other parties
  • Developers must be able to work on multiple features simultaneously, or at lea
coltfred / step1.scala
Last active August 29, 2015 14:11 — forked from danclien/step1.scala
// Implementing functor manually
import scalaz._, Scalaz._, Free.liftF
sealed trait TestF[+A]
case class Foo[A](o: A) extends TestF[A]
case class Bar[A](h: (Int => A)) extends TestF[A]
case class Baz[A](h: (Int => A)) extends TestF[A]
implicit def testFFunctor[B]: Functor[TestF] = new Functor[TestF] {
coltfred / Window.hs
Last active August 29, 2015 14:06 — forked from pchiusano/Window.hs
module Window where
import Data.Monoid
data Window a = Window [a] a [a] deriving (Show,Read)
null :: Window a -> Bool
null (Window [] _ []) = True
null _ = False
import scalaz._, Scalaz._
implicit val b = Show.shows[Boolean]{b => if(b)"0" else " "}
true.shows //Value is still "true", why?
// Methods returning an Option of Boolean (or, more generally, any Monad of Boolean)
def one: Option[Boolean] = Some(true)
def two: Option[Boolean] = Some(true)
//I'd like to be able to write something like the following.
if (!one && two) {
}
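
One way to get close to the `if (!one && two)` shape above, sketched with only the standard library: combine the two options in a for-comprehension and decide explicitly what a missing value means. The `OptionBooleanSketch` object and the `combined` and `shouldRun` names are hypothetical, and treating `None` as `false` is an assumption, not something the snippet above specifies.

```scala
object OptionBooleanSketch {
  def one: Option[Boolean] = Some(true)
  def two: Option[Boolean] = Some(true)

  // Combine inside the Option: the result is None if either side is None.
  // Note that both options are evaluated, so there is no short-circuiting
  // of `two` when `!a` is already false.
  val combined: Option[Boolean] =
    for {
      a <- one
      b <- two
    } yield !a && b

  // Assumption: a missing value counts as false at the point of branching.
  val shouldRun: Boolean = combined.getOrElse(false)
}
```

The same for-comprehension works for any type with `map` and `flatMap`, which is roughly what "any Monad of Boolean" asks for.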
import scalaz._, Scalaz._
val a: List[Int] = List(1)
a.traverse{_.point[Option].toRightDisjunction("fail")}
coltfred / gist:5759793
Created June 11, 2013 19:19
ctags for scala
--langdef=scala
--langmap=scala:.scala
--regex-Scala=/^[ \t]*(final[ \t]*)*(abstract[ \t]*)*(sealed[ \t]*)*(case[ \t]*)*class[ \t]*([a-zA-Z0-9_]+)/\5/c,classes/
--regex-Scala=/^(final[ \t]*)*(case[ \t]*)*[ \t]*object[ \t]*([a-zA-Z0-9_]+)/\3/o,objects/
--regex-Scala=/^[ \t]*(protected[ \t]*)*(sealed[ \t]*)*trait[ \t]*([a-zA-Z0-9_]+)/\3/t,traits/
--regex-Scala=/[ \t]*def[ \t]*([a-zA-Z0-9_=]+)[ \t]*.*[:=]/\1/m,methods/
--regex-Scala=/[ \t]*(final[ \t]*)*val[ \t]*([a-zA-Z0-9_]+)[ \t]*[:=]/\2/V,values/
--regex-Scala=/[ \t]*var[ \t]*([a-zA-Z0-9_]+)[ \t]*[:=]/\1/v,variables/
--regex-Scala=/^[ \t]*type[ \t]*([a-zA-Z0-9_]+)[ \t]*[\[<>=]/\1/T,types/
--regex-Scala=/^[ \t]*import[ \t]*([a-zA-Z0-9_{}., \t=>]+$)/\1/i,includes/