Skip to content

Instantly share code, notes, and snippets.

val streamingContext: StreamingContext = new StreamingContext(sparkContext, Seconds(20))
val lines: ReceiverInputDStream[String] = streamingContext.socketTextStream("localhost", 9999)
val split :RDD[String] = rdd.flatMap(_.split(" "))
val trim :RDD[String] = split.map(_.trim.toLowerCase)
val stopwordsRemoved = trim.filter( x => !Set("and", "the", "is", "to", "she", "he").contains(x))
val assignOne = stopwordsRemoved.map((_, 1))
val counts = assignOne.reduceByKey(_ + _)