
Erik Erlandson erikerlandson

@erikerlandson
erikerlandson / build_spark.sh
Created August 4, 2017 19:03
build incantation for spark on kube
# build and install just the kubernetes resource-manager module,
# plus the modules it depends on (-am), skipping tests
build/mvn install -Pkubernetes -pl resource-managers/kubernetes/core -am -DskipTests
# same module set, compile only, parallelized at 4 threads per core (-T 4C)
build/mvn compile -T 4C -Pkubernetes -pl resource-managers/kubernetes/core -am -DskipTests
source ~/.bash_profile
# assemble a full binary distribution (with pip packaging) as a .tgz
dev/make-distribution.sh --pip --tgz -Phadoop-2.7 -Pyarn -Pkubernetes -Dhadoop.version=2.7.3
mv spark-2.2.0-k8s-0.3.0-SNAPSHOT-bin-2.7.3.tgz ~/Bloomberg/
# unpack the distribution from the parent directory
cd ..
tar -xvf spark-2.2.0-k8s-0.3.0-SNAPSHOT-bin-2.7.3.tgz
@erikerlandson
erikerlandson / stackdump.scala
Last active July 30, 2017 20:41
stack-dump from trying to write TDigestUDT to parquet
scala> val data = sc.parallelize(Seq(1,2,3,4,5)).toDF("x")
data: org.apache.spark.sql.DataFrame = [x: int]
scala> val udaf = tdigestUDAF[Double].maxDiscrete(10)
udaf: org.isarnproject.sketches.udaf.TDigestUDAF[Double] = TDigestUDAF(0.5,10)
scala> val agg = data.agg(udaf($"x").alias("tdigest"))
agg: org.apache.spark.sql.DataFrame = [tdigest: tdigest]
scala> agg.show()
@erikerlandson
erikerlandson / update_passwd.sh
Created July 20, 2017 22:25
updating a passwd file for a randomized OpenShift UID, to make Apache Spark happy
# OpenShift runs containers under an arbitrary high UID; if we got one,
# rewrite /etc/passwd so the current uid/gid has a valid entry
if [ `id -u` -ge 10000 ]; then
    # rename the original $NB_USER entry out of the way
    sed -e "s/^$NB_USER:/builder:/" /etc/passwd > /tmp/passwd
    # append an entry mapping the current uid/gid to $NB_USER's home and shell
    echo "$NB_USER:x:`id -u`:`id -g`:,,,:/home/$NB_USER:/bin/bash" >> /tmp/passwd
    # overwrite the contents in place (the file node itself may not be replaceable)
    cat /tmp/passwd > /etc/passwd
    rm /tmp/passwd
fi
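As a sketch of what the rewrite above does, run against a scratch copy instead of the real /etc/passwd — the username jovyan, uid 12345, and the /tmp paths here are made up for illustration:

```shell
# demonstrate the passwd rewrite on a scratch file
NB_USER=jovyan
printf 'root:x:0:0:root:/root:/bin/bash\n%s:x:1000:1000::/home/%s:/bin/bash\n' \
    "$NB_USER" "$NB_USER" > /tmp/passwd.orig
# rename the original entry out of the way, as the gist does
sed -e "s/^$NB_USER:/builder:/" /tmp/passwd.orig > /tmp/passwd.new
# append an entry for the "random" uid the container actually got
echo "$NB_USER:x:12345:12345:,,,:/home/$NB_USER:/bin/bash" >> /tmp/passwd.new
cat /tmp/passwd.new
```

The result keeps the original entry (renamed to builder) while name lookups for the running uid now resolve to $NB_USER, which is what Spark's use of getpwuid needs.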
@erikerlandson
erikerlandson / custom_zip_artifact.sbt
Created July 20, 2017 01:53
Example of adding a custom zipfile artifact to an sbt build
// task to remove the generated python zipfile artifact
lazy val deleteZip = taskKey[Unit]("Delete python zipfile")

deleteZip := {
  val s: TaskStreams = streams.value
  s.log.info("delete python zip...")
  // shell out to remove the zip; sys.process `!` returns the exit status
  val cmd = "bash" :: "-c" :: "cd python && rm -f isarnproject.zip" :: Nil
  val stat = (cmd !)
  if (stat == 0) {
    s.log.info("delete zip succeeded")
  } else {
@erikerlandson
erikerlandson / package_pyc_in_jar_fragment.sbt
Created July 15, 2017 23:08
Add a custom task and dependency to compile python files to .pyc and install them in the jar artifact
// task to byte-compile the python sources before packaging
lazy val compilePython = taskKey[Unit]("Compile python files")

compilePython := {
  val s: TaskStreams = streams.value
  s.log.info("compiling python...")
  // shell out to compileall; sys.process `!` returns the exit status
  val stat = (Seq("python2", "-m", "compileall", "python/") !)
  if (stat == 0) {
    s.log.info("python compile succeeded")
  } else {
    throw new IllegalStateException("python compile failed")
[eje@linux spark]$ ./bin/pyspark \
--packages "org.isarnproject:isarn-sketches-spark_2.11:0.1.0.py2" \
--repositories "https://dl.bintray.com/isarn/maven/"
# a bunch of maven loading log output ...
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
@erikerlandson
erikerlandson / demo_pyspark_tdigest.txt
Created July 8, 2017 19:10
Demo using T-Digest UDAFs from pyspark
[eje@linux spark]$ ./bin/pyspark --jars /home/eje/git/isarn-sketches-spark/target/scala-2.11/isarn-sketches-spark-assembly-0.1.0.jar --driver-class-path /home/eje/git/isarn-sketches-spark/target/scala-2.11/isarn-sketches-spark-assembly-0.1.0.jar
Python 2.7.13 (default, May 10 2017, 20:04:28)
[GCC 6.3.1 20161221 (Red Hat 6.3.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/07/08 12:04:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
____ __
@erikerlandson
erikerlandson / old_gmail.txt
Last active June 27, 2017 20:19
filter old mail in gmail
older_than:1y { label:inbox label:announce-list label:spark-issues label:spark-user label:memo-list label:tech-list label:zeppelin label:spark-dev label:openstack label:openstack-dev}
impl= demo.this.demoEvidence
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$ToolBoxGlobal$$anonfun$compile$1.apply(ToolBoxFactory.scala:275)
at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl.eval(ToolBoxFactory.scala:444)
at scala.reflect.macros.contexts.Evals$class.eval(Evals.scala:20)
at scala.reflect.macros.contexts.Context.eval(Context.scala:6)
@erikerlandson
erikerlandson / annotation.md
Last active March 18, 2020 17:49
Monads for using break and continue in scala for comprehensions