Install Python3, Scala and Apache Spark via Brew (http://brew.sh/)
brew update
brew install python3
brew install scala
brew install apache-spark

Set environment variables
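A minimal sketch of the exports to add to ~/.bash_profile; the exact paths are assumptions and depend on the Homebrew version and install prefix:

export SPARK_HOME=/usr/local/opt/apache-spark/libexec  # assumed Homebrew location
export PYSPARK_PYTHON=python3                          # use the brewed Python 3 for PySpark
export PATH="$SPARK_HOME/bin:$PATH"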
// Control.using automatically closes any resource that has a close method
// note: from the book "Beginning Scala" (by David Pollak)
object Control {
  import scala.language.reflectiveCalls
  def using[A <: { def close(): Unit }, B](param: A)(f: A => B): B =
    try {
      f(param)
    } finally {
      param.close()
    }
}
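A usage sketch for Control.using: reading the first line of a file with scala.io.Source (the file path is only an illustration):

import scala.io.Source

val firstLine = Control.using(Source.fromFile("/etc/hosts")) { source =>
  source.getLines().next()  // the source is closed even if this throws
}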
// Measure.time measures the time it takes to complete a block of code (in nanoseconds)
// note: this version does not return the result of the block; a different version should be created for that
object Measure {
  def time(block: => Unit) = {
    val s = System.nanoTime
    block
    System.nanoTime - s
  }
}
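A quick usage sketch; the summed range is an arbitrary workload:

// returns the elapsed time in nanoseconds
val elapsed = Measure.time {
  (1 to 1000000).sum
}
println(s"took $elapsed ns")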
#!/bin/bash
# Configuration
#export DIGITALOCEAN_ACCESS_TOKEN= # Digital Ocean Token (mandatory to provide)
export DIGITALOCEAN_SIZE=512mb # default
export DIGITALOCEAN_REGION=nyc3 # default
export DIGITALOCEAN_PRIVATE_NETWORKING=true # default=false
#export DIGITALOCEAN_IMAGE="ubuntu-15-04-x64" # default
# For other settings see defaults in https://docs.docker.com/machine/drivers/digital-ocean/
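With the variables above exported (and DIGITALOCEAN_ACCESS_TOKEN set), a droplet can then be provisioned with docker-machine; the machine name here is hypothetical:

docker-machine create --driver digitalocean spark-jobserver-machine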
From https://github.com/spark-jobserver/spark-jobserver#getting-started-with-spark-job-server:
The easiest way to get started is to try the Docker container which prepackages a Spark distribution with the job server and lets you start and deploy it.
➜ spark-jobserver git:(master) docker-machine version
docker-machine version 0.7.0, build a650a40
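The container can then be started along these lines; the image name and tag are assumptions, so check the spark-jobserver README for the current ones:

docker run -d -p 8090:8090 velvia/spark-jobserver:0.6.2  # image/tag assumed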
// https://gist.github.com/radekg/ec5a1575c450a48e5cba
From http://stackoverflow.com/a/32393044/1305344:
object size extends App {
  // allocate a large collection to observe its memory footprint
  (1 to 1000000).map(i => ("foo" + i, ()))
  // block on input so the JVM stays alive for inspection
  val input = scala.io.StdIn.readLine("prompt> ")
}
Run it with sbt 'runMain size' and then use jps (to find the PID), jstat -gc <pid> (to query GC statistics) and jmap (to inspect the heap) to analyse resource allocation.
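A possible session, with 12345 standing in for the PID that jps reports:

sbt 'runMain size'
jps               # find the PID of the running app
jstat -gc 12345   # GC and heap statistics
jmap -histo 12345 # heap histogram by class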
How much of machine learning is statistics and vice versa?
Learning via https://www.coursera.org/learn/machine-learning/home/welcome