Erik Erlandson (erikerlandson)
@erikerlandson
erikerlandson / instructions.md
Last active November 1, 2019 19:20
Instructions for standing up the "ML workflows on OpenShift" workshop in RHPDS

In RHPDS, Request: Service Catalogs > Workshops > Openshift 4 Workshop

It will take 40+ mins to provision the OCP cluster. When you receive the RHPDS email confirming the cluster is ready:

$ export RHPDS_GUID=<your-guid>  # from rhpds email
$ export RHPDS_USER=<user-name>  # your rhpds user name
$ export NUM_LAB_USERS=10  # how many lab/workshop users you will be hosting
# log into the host (use the credentials provided in the RHPDS email)
# (the preferred-auth flag may or may not be necessary)
@erikerlandson
erikerlandson / compile.py
Created August 26, 2019 02:40
pruning and compiling markovify models
import itertools
import numpy as np

class CompiledMarkovify(object):
    def __init__(self, model):
        def compile_next(next_dict):
            words = list(next_dict.keys())
            cff = np.array(list(itertools.accumulate(next_dict.values())))
            return (words, cff)
        chain_dict = model.chain.model
        self.sxf = { state: compile_next(next_dict) for (state, next_dict) in chain_dict.items() }
        self.state_size = model.state_size
        self.BEGIN = '___BEGIN__'
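The cumulative-frequency arrays built above turn next-word sampling into a binary search. A minimal sketch of how such a compiled table could be sampled; the toy dict below stands in for markovify's `model.chain.model`, and `sample_next` is an illustrative helper, not part of the gist:

```python
import itertools
import numpy as np

def compile_next(next_dict):
    # same shape as in the gist: word list plus cumulative counts
    words = list(next_dict.keys())
    cff = np.array(list(itertools.accumulate(next_dict.values())))
    return (words, cff)

def sample_next(compiled, rng):
    # draw uniformly over the total count, then binary-search the cumulative array
    words, cff = compiled
    r = rng.integers(0, cff[-1])
    return words[int(np.searchsorted(cff, r, side="right"))]

compiled = compile_next({"cat": 3, "dog": 1})
rng = np.random.default_rng(0)
samples = [sample_next(compiled, rng) for _ in range(1000)]
```

Sampling is O(log m) per draw in the number of successor words m, versus O(m) for walking the raw counts dict.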
@erikerlandson
erikerlandson / CountSerDe.scala
Last active July 6, 2019 19:12
Benchmarking Description for Spark UDIA pull request
package org.apache.spark.countSerDe
import org.apache.spark.internal.Logging
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.GenericInternalRow
import org.apache.spark.sql.catalyst.util._
import org.apache.spark.sql.expressions.MutableAggregationBuffer
import org.apache.spark.sql.expressions.UserDefinedAggregateFunction
import org.apache.spark.sql.expressions.UserDefinedImperativeAggregator
@erikerlandson
erikerlandson / ConfigIntegration.scala
Last active May 2, 2019 21:28
Integrate typesafe/lightbend Config with coulomb QuantityParser
object ConfigIntegration {
import scala.language.implicitConversions
import scala.util.Try
import scala.reflect.runtime.universe.TypeTag
import com.typesafe.config.ConfigFactory
import com.typesafe.config.Config
import coulomb.parser.unitops.UnitTypeString
@erikerlandson
erikerlandson / clean-git-repos.sh
Last active February 20, 2019 22:45
rsync backup, excludes, git repo cleaning
cd /home/eje/git
for f in */ ; do cd "/home/eje/git/$f"; git rev-parse --git-dir 2>/dev/null && git clean -fdx; done
@erikerlandson
erikerlandson / testJO.scala
Last active February 26, 2024 18:42
An example quadratic programming (QP) optimization using JOptimizer in Scala
object testJO {
// libraryDependencies += "com.joptimizer" % "joptimizer" % "4.0.0"
import com.joptimizer.functions.PDQuadraticMultivariateRealFunction
import com.joptimizer.functions.PSDQuadraticMultivariateRealFunction
import com.joptimizer.functions.ConvexMultivariateRealFunction
import com.joptimizer.functions.LinearMultivariateRealFunction
import com.joptimizer.optimizers.OptimizationRequest
import com.joptimizer.optimizers.JOptimizer
// solution space is dimension n; in this example n = 2
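The JOptimizer preview is cut off before the objective is constructed, but the underlying math is standard: the unconstrained stationary point of the quadratic objective 0.5 x'Px + q'x satisfies Px = -q. A quick NumPy check of that calculation in dimension n = 2; the P and q values below are illustrative, not taken from the gist:

```python
import numpy as np

# symmetric positive-definite quadratic term P and linear term q (illustrative)
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])
q = np.array([-1.0, -1.0])

# unconstrained minimizer of 0.5 x'Px + q'x solves the linear system P x = -q
x = np.linalg.solve(P, -q)

# the gradient P x + q should vanish at the solution
grad = P @ x + q
```

JOptimizer handles the harder constrained case (the `ConvexMultivariateRealFunction` inequality constraints imported above); this sketch only verifies the unconstrained optimality condition.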
@erikerlandson
erikerlandson / demo.scala
Last active March 26, 2018 16:48
sifting
scala> :load /home/eje/sift.scala
Loading /home/eje/sift.scala...
defined module sift
defined module demo
scala> import sift._
import sift._
scala> val r = Sifted.seq(demo.dirty).map(_.map(_.toDouble)).filter(_(0) > 1).map(_(1))
r: sift.SiftedSeq[Double,scala.collection.immutable.Vector,(scala.collection.immutable.Vector[scala.collection.immutable.Vector[Double]], (scala.collection.immutable.Vector[scala.collection.immutable.Vector[Double]], (scala.collection.immutable.Vector[scala.collection.immutable.Vector[String]], List[Nothing])))] = SiftedSeq(Vector(5.0),(Vector(Vector(3.0)),(Vector(Vector()),(Vector(Vector(2, 3, z)),List()))))
@erikerlandson
erikerlandson / topkmonoid.scala
Created January 26, 2018 21:35
pseudocode for a Scala topk-monoid built on count-min-sketch monoid
// type parameter V is the type of object values being counted
// class parameters are 'val'; this class is immutable
class TopK[V](
  val k: Int,
  val cms: CountMinSketch[V],
  val topk: immutable.Map[V, Int],
  val fmin: Int) {

  // update the TopK sketch w/ a new element 'v'
  def +(v: V): TopK[V] = {
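The preview ends inside the `+` method. A rough Python sketch of the update logic the description implies: count the element, then keep a k-entry map of the most frequent values, evicting the current minimum when a new element's count overtakes it. An exact `Counter` stands in for the count-min sketch here, and `topk_update` is an illustrative name, not from the gist:

```python
from collections import Counter

def topk_update(k, counts, topk, v):
    """Count v, then keep topk holding the k most frequent values seen."""
    counts[v] += 1                # stand-in for a count-min-sketch update
    f = counts[v]                 # (estimated) frequency of v
    topk = dict(topk)             # keep the update functionally immutable
    if v in topk or len(topk) < k:
        topk[v] = f
    else:
        # evict the current minimum only if v's count now exceeds it
        vmin = min(topk, key=topk.get)
        if f > topk[vmin]:
            del topk[vmin]
            topk[v] = f
    return counts, topk

counts, topk = Counter(), {}
for v in ["a", "b", "a", "c", "a", "b"]:
    counts, topk = topk_update(2, counts, topk, v)
```

With a real count-min sketch the frequencies are overestimates, which is why the gist also carries `fmin`, the threshold a candidate must beat to enter the top-k map.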
@erikerlandson
erikerlandson / types.scala
Created December 2, 2017 23:40
Save some shapeless-style type computations using dependent types
trait Length[L] {
  type Out
}
object Length {
  type Aux[L, O] = Length[L] { type Out = O }
  implicit def length0: Aux[HNil, Witness.`0`.T] = new Length[HNil] { type Out = Witness.`0`.T }
  implicit def length1[H, T <: HList, O](implicit tl: Aux[T, O], inc: +[O, Witness.`1`.T]): Aux[H :: T, inc.Out] = {
    new Length[H :: T] { type Out = inc.Out }
  }