@solidpple
Last active July 25, 2016 07:22
// Build a small pair RDD; `sc` is the SparkContext provided by spark-shell.
val pairs = sc.parallelize(List((1, 1), (2, 2), (3, 3)))
// pairs: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[33] at parallelize at <console>:28

// A freshly parallelized RDD has no partitioner.
pairs.partitioner
// res40: Option[org.apache.spark.Partitioner] = None

import org.apache.spark.HashPartitioner
// import org.apache.spark.HashPartitioner

// Hash-partition by key into 2 partitions; persist so the shuffle is not redone on every reuse.
val partitioned = pairs.partitionBy(new HashPartitioner(2)).persist()
// partitioned: org.apache.spark.rdd.RDD[(Int, Int)] = ShuffledRDD[34] at partitionBy at <console>:31

// The partitioner is now recorded on the RDD.
partitioned.partitioner
// res41: Option[org.apache.spark.Partitioner] = Some(org.apache.spark.HashPartitioner@2)
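
To make the effect of `new HashPartitioner(2)` concrete, here is a minimal plain-Scala sketch of the key-to-partition rule it applies: the key's `hashCode` taken modulo the number of partitions, adjusted so the result is never negative, with `null` keys going to partition 0. The object `HashPartitionSketch` and its helpers are hypothetical names for illustration, not part of the Spark API, and the sketch runs without a SparkContext.

```scala
// Hypothetical stand-in illustrating the hash-partitioning rule; not Spark code.
object HashPartitionSketch {
  // Modulo that never returns a negative result, even for negative hash codes.
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val raw = x % mod
    if (raw < 0) raw + mod else raw
  }

  // Partition index for a key: null goes to 0, otherwise hashCode mod numPartitions.
  def partitionFor(key: Any, numPartitions: Int): Int =
    if (key == null) 0 else nonNegativeMod(key.hashCode, numPartitions)

  def main(args: Array[String]): Unit = {
    val pairs = List((1, 1), (2, 2), (3, 3))
    // Group the same pairs as in the session by their partition index.
    val byPartition = pairs.groupBy { case (k, _) => partitionFor(k, 2) }
    byPartition.toSeq.sortBy(_._1).foreach { case (p, kvs) =>
      println(s"partition $p: $kvs")
    }
    // prints:
    // partition 0: List((2,2))
    // partition 1: List((1,1), (3,3))
  }
}
```

In the actual session, `partitioned.glom().collect()` should show the same grouping, though the order of elements within a partition is not guaranteed.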