➜ sbt-snapshot curl -o sbt-launch-0.13.8-SNAPSHOT.jar https://repo.typesafe.com/typesafe/ivy-snapshots/org.scala-sbt/sbt-launch/0.13.8-SNAPSHOT/sbt-launch.jar
➜ sbt-snapshot tree
.
|-- project
| `-- build.properties
`-- sbt-launch-0.13.8-SNAPSHOT.jar
1 directory, 2 files
➜ sbt-snapshot cat project/build.properties
That works in 2.11.4:
scala> scala.util.Properties.versionString
res0: String = version 2.11.4
scala> val s = Seq(1 -> 2)
s: Seq[(Int, Int)] = List((1,2))
scala> s.groupBy(_._1)
import org.specs2._
class HelloSpec extends Specification with matcher.DataTables { def is =
"adding integers should just work in scala" ! e1
def e1 =
"a" | "b" | "c" | // the header of the table, with `|` separated strings
2 ! 2 ! 4 | // an example row
1 ! 1 ! 2 |> { // the > operator to "execute" the table
[sbt-learning-space]> console
[info] Starting scala interpreter...
[info]
Welcome to Scala version 2.11.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_31).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val m = Map(1->2, 0->3)
m: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 0 -> 3)
➜ sandbox scala
Welcome to Scala version 2.11.6 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_40).
Type in expressions to have them evaluated.
Type :help for more information.
scala> :javap
:javap [-lcsvp] [path1 path2 ...]
scala> case class AAA(s: String)
defined class AAA
- no upfront installation/agents on remote/slave machines - ssh should be enough
- application components should use third-party software, e.g. HDFS, Spark's cluster, deployed separately
- configuration templating
- environment requirements/assertions, e.g. a JVM in a given version must be present before doing a deployment
- deployment process run from Jenkins
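
The "environment requires/asserts" item could look roughly like the sketch below: a check that the JVM on the target machine meets a minimum version before the deployment goes on. `EnvCheck`, `javaMajor` and `requireJava` are made-up names for illustration only, not part of any existing deployment tool.

```scala
object EnvCheck {
  // Extracts the major Java version: "1.8.0_31" -> 8, "11.0.2" -> 11, "17" -> 17.
  def javaMajor(version: String): Int =
    version.split("[._-]").toList match {
      case "1" :: minor :: _ => minor.toInt // legacy scheme, e.g. 1.8.0_31
      case major :: _        => major.toInt // newer scheme, e.g. 11.0.2
      case Nil               => sys.error(s"Unparseable Java version: $version")
    }

  // Returns Right(()) when the requirement holds, Left(reason) otherwise.
  // By default it inspects the JVM the check itself runs on (an assumption:
  // a real deployment would read the version from the remote machine over ssh).
  def requireJava(minMajor: Int,
                  version: String = sys.props("java.version")): Either[String, Unit] =
    if (javaMajor(version) >= minMajor) Right(())
    else Left(s"Java $minMajor+ required, found $version")
}
```

A deployment step could then call `EnvCheck.requireJava(8)` and stop early on a `Left`.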
aka "Let's take some notes about using Docker on Mac OS X to turn deployment of Scala applications into a much better experience."
DISCLAIMER The doc is a compilation of different articles and videos found on the Internet. Almost nothing's mine - mostly layout. See CREDITS section below to know who to praise. All mistakes are mine and are not intended. Drop me an email at [email protected] if you spot any errors or just want to share what you think about the doc.
The document lives at https://gist.github.com/jaceklaskowski/ca55be80cb76e84ce478
I'm on Mac OS X, so you're going to see a lot of setup tweaks for the platform that are not necessarily needed in your environment, especially on Linux. When you see boot2docker and you're on Linux, just disregard the line or even the entire paragraph.
How much of machine learning is statistics and vice versa?
Learning using https://www.coursera.org/learn/machine-learning/home/welcome
- machine learning = teaching a computer to learn concepts using data — without being explicitly programmed.
- Supervised learning = "right answers" given
- Regression problem
- continuous valued output
- deduce the function for a given data set and predict other values
- "in regression problems, we are taking input variables and trying to map the output onto a continuous expected result function."
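
To make "deduce the function for a given data set and predict other values" concrete, here is a minimal sketch of a one-variable regression in Scala. It uses the closed-form least-squares fit, which is an assumption on my side and not necessarily the method the course uses. `LinearFit`, `fit` and `predict` are made-up names.

```scala
object LinearFit {
  // Ordinary least squares for the line y = intercept + slope * x.
  // Returns (intercept, slope).
  def fit(points: Seq[(Double, Double)]): (Double, Double) = {
    require(points.size >= 2, "need at least two points")
    val n = points.size.toDouble
    val meanX = points.map(_._1).sum / n
    val meanY = points.map(_._2).sum / n
    // Sum of products of deviations, and sum of squared deviations of x.
    val sxy = points.map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    val sxx = points.map { case (x, _) => (x - meanX) * (x - meanX) }.sum
    require(sxx != 0.0, "x values must not all be equal")
    val slope = sxy / sxx
    val intercept = meanY - slope * meanX
    (intercept, slope)
  }

  // Continuous-valued output for an input that was not in the data set.
  def predict(model: (Double, Double), x: Double): Double =
    model._1 + model._2 * x
}
```

With the points `(1, 3)`, `(2, 5)`, `(3, 7)` the fitted line is `y = 1 + 2x`, so `predict` gives 9 for `x = 4`.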
Steps:
- Build a Docker image and install sphinx inside
- Run the image to have a complete working environment to create docs.
See https://github.com/subuser-security/subuser/blob/master/docs/Makefile.
# Sphinx doc system containerized
- What use cases are a good fit for Apache Spark? How to work with Spark?
- create RDDs, transform them, and execute actions to get the result of a computation
- All computations in memory = "memory is cheap" (we do need enough memory to fit all the data in)
- the fewer disk operations, the faster (you do know it, don't you?)
- You develop such computation flows or pipelines using a programming language - Scala, Python or Java <-- that's where the ability to write code is paramount
- Data is usually on a distributed file system like Hadoop HDFS or NoSQL databases like Cassandra
- Data mining = analysis / insights / analytics
- log mining
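
The create / transform / act flow for log mining can be sketched with plain Scala collections. This is only a stand-in so that it runs without a cluster: the lazy `view` plays the role of an RDD, and the sample log lines are made up. In Spark the data would come from `sc.textFile(...)` or `sc.parallelize(...)`, and the RDD offers `filter`, `map`, `count` and `collect` in the same spirit.

```scala
object LogMining {
  // Hypothetical sample data standing in for a log file on HDFS.
  val lines: Seq[String] = Seq(
    "INFO starting up",
    "ERROR disk full",
    "WARN slow response",
    "ERROR connection lost"
  )

  // Transformations: lazy, nothing is computed at this point.
  val errors = lines.view
    .filter(_.startsWith("ERROR"))
    .map(_.stripPrefix("ERROR "))

  // Actions: force the computation and bring results back.
  val errorCount: Int = errors.size
  val messages: List[String] = errors.toList
}
```

The split between lazy transformations and result-producing actions is the point here. The in-memory and distributed aspects of Spark are not modelled by this sketch.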