@bryanyang0528
Created November 18, 2014 14:31
GroupByKey vs ReduceByKey
// groupByKey: shuffle every (key, value) pair, then sum the grouped values per key
textPairsRDD.groupByKey().map(x => (x._1,x._2.sum)).collect()
INFO SparkContext: Job finished: collect at <console>:17, took 0.227842137 s
// reduceByKey: sum values per key inside each partition first, then shuffle and merge the partial sums
textPairsRDD.reduceByKey(_ + _).collect()
SparkContext: Job finished: collect at <console>:17, took 0.107143156 s
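
A likely reason for the gap between the two timings is that reduceByKey combines values on the map side before the shuffle, while groupByKey ships every record across it. The sketch below imitates that with plain Scala collections instead of Spark, so it runs without a cluster. The `ShuffleSketch` object, its sample `partitions`, and both helper functions are hypothetical names made up for illustration; they are not part of the gist or of Spark's API.

```scala
// Plain Scala collections standing in for Spark partitions (no Spark required).
object ShuffleSketch {
  type Pair = (String, Int)

  // Hypothetical sample data: two "partitions" of (word, 1) pairs.
  val partitions: Seq[Seq[Pair]] = Seq(
    Seq("a" -> 1, "b" -> 1, "a" -> 1, "a" -> 1),
    Seq("b" -> 1, "a" -> 1, "b" -> 1)
  )

  // Sum the values of each key in a collection of pairs.
  private def sumByKey(pairs: Seq[Pair]): Map[String, Int] =
    pairs.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

  // groupByKey-style: every record crosses the shuffle; summing happens afterwards.
  def groupThenSum(parts: Seq[Seq[Pair]]): (Map[String, Int], Int) = {
    val shuffled = parts.flatten
    (sumByKey(shuffled), shuffled.size)
  }

  // reduceByKey-style: each partition pre-aggregates, so at most one record
  // per key per partition crosses the shuffle.
  def combineThenReduce(parts: Seq[Seq[Pair]]): (Map[String, Int], Int) = {
    val shuffled = parts.flatMap(p => sumByKey(p).toSeq)
    (sumByKey(shuffled), shuffled.size)
  }

  def main(args: Array[String]): Unit = {
    val (grouped, groupedCount) = groupThenSum(partitions)
    val (reduced, reducedCount) = combineThenReduce(partitions)
    println(s"groupByKey-style:  $grouped, records shuffled = $groupedCount")
    println(s"reduceByKey-style: $reduced, records shuffled = $reducedCount")
  }
}
```

Both functions return the same per-key sums, but on this sample the grouped path moves 7 records through the simulated shuffle and the combined path moves 4. On a real RDD the saving grows with the number of repeated keys per partition.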