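// Count orders per date. Each record is a comma-separated line; index 1 is the order date.
val orders = sc.textFile("/public/retail_db/orders")
// Pair each order with its date: (order_date, 1)
val ordersMap = orders.
  map(o => (o.split(",")(1), 1))
// Group the 1s by date: (order_date, Iterable(1, 1, ...))
val ordersGroupByDate = ordersMap.
  groupByKey
// The size of each group is the number of orders on that date
val orderCountByDate = ordersGroupByDate.
  map(o => (o._1, o._2.size))
// Print a sample of 10 (order_date, count) pairs
orderCountByDate.
  take(10).
  foreach(println)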
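// Alternative sketch, not part of the original snippet: the same per-date count
// using reduceByKey, which combines the 1s on each partition before the shuffle
// instead of shipping every value the way groupByKey does.
// Assumes spark-shell (sc predefined); orderCountByDateReduced is a name
// introduced here for illustration only.
val orderCountByDateReduced = sc.textFile("/public/retail_db/orders").
  map(o => (o.split(",")(1), 1)).
  reduceByKey(_ + _)
orderCountByDateReduced.
  take(10).
  foreach(println)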
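// Orders per customer, sorted by date descending. Index 2 is the customer id, index 1 the order date.
// (orders and ordersMap are redefined here; spark-shell allows this, a compiled file would not.)
val orders = sc.textFile("/public/retail_db/orders")
// Key each full order record by customer id: (customer_id, order)
val ordersMap = orders.
  map(o => (o.split(",")(2).toInt, o))
// Collect every order of a customer: (customer_id, Iterable(order, ...))
val ordersGroupByCustomerId = ordersMap.
  groupByKey
// Sort each customer's orders by the date string, descending, and flatten back to one record per order
val ordersByCustomerByDateDesc = ordersGroupByCustomerId.
  flatMap(o => {
    val ordersPerCustomer = o._2
    ordersPerCustomer.
      toList.
      sortBy(k => k.split(",")(1))(Ordering.String.reverse)
  })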
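// Sketch, not part of the original snippet: the transformations above are lazy,
// so nothing runs until an action is called. Assuming the same spark-shell
// session, this prints a sample of 10 of the sorted order records.
ordersByCustomerByDateDesc.
  take(10).
  foreach(println)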