@MarioAriasC
Last active December 21, 2019 11:22
Word Count with Apache Spark and Kotlin
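package org.cakesolutions.spark

import org.apache.spark.SparkConf
import org.apache.spark.api.java.JavaSparkContext
import scala.Tuple2

fun main(args: Array<String>) {
    // Expects two arguments: the input path and the output directory.
    val inputFile = args[0]
    val outputFile = args[1]

    // No master URL is set here, so it must be supplied externally (e.g. spark-submit --master).
    val conf = SparkConf().setAppName("wordCount")
    val sc = JavaSparkContext(conf)

    val input = sc.textFile(inputFile)
    // split replaces the pre-1.0 splitBy; Spark 2.x and later expect flatMap to return an Iterator.
    val words = input.flatMap { x -> x.split(" ").iterator() }
    // Pair each word with 1, then sum the counts per word.
    val counts = words.mapToPair { x -> Tuple2(x, 1) }.reduceByKey { x, y -> x + y }
    counts.saveAsTextFile(outputFile)

    sc.stop()
}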