@dgadiraju
Last active September 10, 2019 12:26
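// Customer IDs (third field) of orders placed in August 2013.
// Each record is a comma-separated line; the second field is the order date.
val orderCustomers1 = sc.textFile("/public/retail_db/orders").
  filter(o => o.split(",")(1).contains("2013-08")).
  map(o => o.split(",")(2))

// Customer IDs of orders placed in September 2013.
val orderCustomers2 = sc.textFile("/public/retail_db/orders").
  filter(o => o.split(",")(1).contains("2013-09")).
  map(o => o.split(",")(2))

// One element per order, so these are order counts, not distinct customers.
orderCustomers1.count
orderCustomers2.count

// union concatenates the two RDDs and keeps duplicates.
orderCustomers1.
  union(orderCustomers2).
  take(10).
  foreach(println)

// The count of the union is the sum of the two counts above.
orderCustomers1.
  union(orderCustomers2).
  count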
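The union above keeps every element from both RDDs, duplicates included. The following is a minimal sketch of that behaviour on a small in-memory dataset rather than the retail_db files. The customer IDs and the names `ctx`, `aug`, `sep` and `all` are made up for illustration, and `local[*]` is an assumed master that is only used when no SparkContext is already running.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the running context (e.g. in spark-shell) or starts a local one.
// The app name and master are assumptions for this sketch.
val ctx = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("union-sketch"))

// Hypothetical customer IDs; "256" ordered twice in August and once in September.
val aug = ctx.parallelize(Seq("11599", "256", "256"))
val sep = ctx.parallelize(Seq("256", "8827"))

// union does not deduplicate, so the result has 3 + 2 elements.
val all = aug.union(sep)
println(all.count)
```

Because nothing is removed, the union's count always equals the sum of the two input counts.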
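// Customer IDs (third field) of orders placed in August 2013.
val orderCustomers1 = sc.textFile("/public/retail_db/orders").
  filter(o => o.split(",")(1).contains("2013-08")).
  map(o => o.split(",")(2))

// Customer IDs of orders placed in September 2013.
val orderCustomers2 = sc.textFile("/public/retail_db/orders").
  filter(o => o.split(",")(1).contains("2013-09")).
  map(o => o.split(",")(2))

// distinct after union removes repeated customer IDs, leaving each
// customer who ordered in either month exactly once.
orderCustomers1.
  union(orderCustomers2).
  distinct.
  take(10).
  foreach(println)

// Number of unique customers who ordered in August or September 2013.
orderCustomers1.
  union(orderCustomers2).
  distinct.
  count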
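Applying `distinct` after `union` collapses repeated customer IDs to a single occurrence. The following is a minimal sketch on a small in-memory dataset. The IDs and the names `ctx`, `aug`, `sep` and `unique` are made up for illustration, and `local[*]` is an assumed master that is only used when no SparkContext is already running.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the running context (e.g. in spark-shell) or starts a local one.
// The app name and master are assumptions for this sketch.
val ctx = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("union-distinct-sketch"))

// Hypothetical customer IDs; "256" appears in both months.
val aug = ctx.parallelize(Seq("11599", "256", "256"))
val sep = ctx.parallelize(Seq("256", "8827"))

// union keeps all five elements; distinct then leaves three unique IDs.
// Sorting makes the collected result independent of partition order.
val unique = aug.union(sep).distinct.collect.sorted
unique.foreach(println)
```

Note that `distinct` triggers a shuffle, so on the full dataset it costs more than the plain `union`.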