@solidpple
Created July 25, 2016 07:26
import org.apache.spark.storage.StorageLevel
val input = sc.parallelize(List(1, 2, 3, 4))
// input: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[35] at parallelize at <console>:31
val result = input.map(x => x*x)
// result: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[36] at map at <console>:33
// persist() only marks the RDD for storage; nothing is written until an action runs.
result.persist(StorageLevel.DISK_ONLY)
// res43: result.type = MapPartitionsRDD[36] at map at <console>:33
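println(result.count()) // 4 (first action: computes the RDD and stores its partitions on disk)
println(result.collect().mkString(",")) // 1,4,9,16 (can reuse the persisted partitions)
// Drop the persisted partitions once they are no longer needed.
result.unpersist()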
// res46: result.type = MapPartitionsRDD[36] at map at <console>:33