@ldacosta
Created December 10, 2014 11:26
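import org.apache.spark.SparkContext

// Nested case classes: A holds a B and a C, so the Parquet schema gets struct columns.
case class B(d: Int)
case class C(c: Int)
case class A(b: B, i: Int, c: C)

// The no-arg constructor reads master and app name from system properties (e.g. set by spark-submit).
val sc = new SparkContext()
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// Brings the implicit RDD-to-SchemaRDD conversion and sql(...) into scope (Spark 1.x API).
import sqlContext._

// Build ten rows and write them out as Parquet.
val rdd = sc.parallelize(1 to 10).map(v => A(b = B(v), i = v, c = C(v)))
val fName = "lala.parquet"
rdd.saveAsParquetFile(fName)

// Read the file back and expose it to SQL under a table name.
val rdd2 = sqlContext.parquetFile(fName)
val tName = "tableParquet"
rdd2.registerAsTable(tName) // deprecated since Spark 1.1 in favor of registerTempTable
val allRowsInTable = sql(s"SELECT * FROM $tName")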