@zorteran
Created October 31, 2020 19:22
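import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lower, sum}

// Local session that uses all available cores.
val spark = SparkSession
  .builder
  .appName("MyAwesomeApp")
  .master("local[*]")
  .getOrCreate()

import spark.implicits._ // enables the $"column" syntax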

// Read the CSV, treating the first row as a header and inferring column types.
val groceries = spark.read
  .option("inferSchema", "true")
  .option("header", "true")
  .csv("some-data.csv")

// Chained version: filter, normalize, and aggregate in a single expression.
val sumOfFruits = groceries
  .filter($"type" === "fruit")
  .withColumn("normalized_name", lower($"name"))
  .groupBy("normalized_name")
  .agg(
    sum($"quantity").as("sum")
  )
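
// Equivalent version with named intermediate DataFrames.
// It redefines sumOfFruits, so keep only one of the two versions in compiled code
// (the Spark shell allows redefinition).
val fruits = groceries.filter($"type" === "fruit")
val normalizedFruits = fruits.withColumn("normalized_name", lower($"name"))
val sumOfFruits = normalizedFruits
  .groupBy("normalized_name")
  .agg(
    sum($"quantity").as("sum")
  )
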
sumOfFruits.show()