@dgadiraju
Last active September 4, 2019 09:22
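// Load the orders data set as an RDD of comma-separated lines.
// Intended for spark-shell, where sc (the SparkContext) is predefined.
val orders = sc.textFile("/public/retail_db/orders")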

// Preview the second field (index 1, the order date) for the first 10 records.
orders.
  map(o => o.split(",")(1)).
  take(10).
  foreach(println)
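
// List every distinct value of the second field.
// collect brings all distinct values to the driver.
orders.
  map(o => o.split(",")(1)).
  distinct.
  collect.
  foreach(println)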

// Preview the fourth field (index 3, the order status in the retail_db layout).
orders.
  map(o => o.split(",")(3)).
  take(10).
  foreach(println)

// List every distinct value of the fourth field.
orders.
  map(o => o.split(",")(3)).
  distinct.
  collect.
  foreach(println)

// Build (order id as Int, order date as String) pairs.
orders.
  map(o => (o.split(",")(0).toInt, o.split(",")(1))).
  take(10).
  foreach(println)
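
// Build (order id, year-month as Int) pairs: keep the first 7 characters of the
// date ("yyyy-MM"), drop the hyphen, and convert the resulting "yyyyMM" to Int.
orders.
  map(o => (o.split(",")(0).toInt, o.split(",")(1).substring(0, 7).replace("-", "").toInt)).
  take(10).
  foreach(println)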
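
// A minimal sketch, not part of the original gist: the last transformation
// written as a named function that splits each record once and reuses the
// fields, rather than calling split(",") twice per record.
// Assumptions: toOrderMonth is a hypothetical helper name, and the sample
// record below is illustrative, shaped like the fields used above
// (id at index 0, date at index 1, status at index 3).
def toOrderMonth(o: String): (Int, Int) = {
  val fields = o.split(",")
  // "2013-07-25 00:00:00.0" -> "2013-07" -> "201307" -> 201307
  (fields(0).toInt, fields(1).substring(0, 7).replace("-", "").toInt)
}

// The function takes a plain String, so it can be checked without a cluster.
val sample = "1,2013-07-25 00:00:00.0,11599,CLOSED"
println(toOrderMonth(sample)) // prints (1,201307)

// With the RDD from above it would be used as:
//   orders.map(toOrderMonth).take(10).foreach(println)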