@Yengas
Last active November 16, 2018 10:13
Spark At Getir - 01
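// Assumes a spark-shell session (`spark` and `spark.implicits._` already in scope)
// started with the MongoDB Spark Connector on the classpath.
import org.apache.spark.sql.types._
import org.apache.spark.sql.functions._

case class Order(id: String, userName: String)

// Sample data: two orders for "Ahmet" and one for "Mehmet".
val orders = Seq(
  Order("5b85bda7685ca053517a948b", "Ahmet"),
  Order("5b85bda764d8194a675a546d", "Mehmet"),
  Order("5b85bda812c1e568bc6596dc", "Ahmet")
).toDS()

// Group orders by user, wrapping each order id in a struct with a single `oid` field,
// then write one document per user to the `test` collection of the `local` database.
// Note: `first` is non-deterministic after a shuffle unless the input is ordered.
orders
  .groupBy("userName")
  .agg(
    first(struct($"id" as "oid")) as "firstOrder",
    collect_list(struct($"id" as "oid")) as "orders"
  )
  .write
  .format("com.mongodb.spark.sql.DefaultSource")
  .option("spark.mongodb.output.uri", "mongodb://localhost:27017/local.test")
  .save()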