/*
To execute https://github.com/twitter/scalding/blob/master/scalding-core/src/main/scala/com/twitter/scalding/examples/KMeans.scala
and wrap results in a file.
Execute locally like this:
scala -classpath target/project-0.0.1-jar-with-dependencies.jar com.mycompany.project.KMeans2Caller \
  --local \
  --clusters <num clusters> \
  --input ../work/kmeansData.tsv \
  --output kout.tsv \
import com.twitter.scalding.typed.TypedPipe
import com.twitter.scalding._
import scala.util.{Failure, Success}
// To understand what's going on, read the source of Execution and ExecutionApp (it's not long).
// Here are the highlights: the main method of ExecutionApp executes the job for you, as long as you return an Execution[Unit].
// Execution: consider flatMap, zip, unit
// unit: handy, since you have to return an Execution[Unit] at some point. Works nicely with .zip, for example.
// zip: combines Executions so they execute in parallel, for fun and profit