@idris75
Forked from jamesrajendran/RDD-To-DF-To-DS
Created March 2, 2020 13:37
RDD --> DF --> Table --> SQL --> DS
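// Needed for toDF()/toDS() outside spark-shell (the shell imports it automatically)
import sqlContext.implicits._

val lrdd = sc.parallelize(List(1, 2, 3, 4, 5, 3, 5))
// without a case class: wrap each Int in a Tuple1, then name the column
val tupleDF = sqlContext.createDataFrame(lrdd.map(Tuple1.apply)).toDF("Id")
// with a case class: the column name comes from the field name
case class Dummy(Id: Int)
val namedDF = lrdd.map(x => Dummy(x)).toDF()
// one-liner DF from a local List; the single column is named "value"
val ldf = List(1, 2, 3, 4, 5, 3, 5).toDS().toDF()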
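// registerTempTable is the Spark 1.x call; on Spark 2.0+ it is deprecated in favor of createOrReplaceTempView
namedDF.registerTempTable("l_table")
sqlContext.sql("select * from l_table").show()
sqlContext.sql("select * from l_table where Id = 3").show()
sqlContext.sql("select * from l_table where Id in (3, 1)").show()
// Id is an Int, so it is implicitly cast to a string for LIKE
sqlContext.sql("select * from l_table where Id like '3%'").show()
// no wildcard: behaves like an equality match on the string "3"
sqlContext.sql("select * from l_table where Id like '3'").show()
sqlContext.sql("select Id, count(*) from l_table group by Id").show()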
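// The same filters and aggregation can also be written with the DataFrame API
// instead of SQL strings. A sketch, assuming the namedDF defined above; these
// lines are an illustration and are not part of the original gist.
namedDF.filter(namedDF("Id") === 3).show()
namedDF.filter(namedDF("Id").isin(3, 1)).show()
namedDF.groupBy("Id").count().show()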
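// DataFrame --> Dataset: the case class supplies the element type
val ds = namedDF.as[Dummy]
// the distinct Ids are 1 to 5; row order is not guaranteed
ds.distinct.show()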
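// A Dataset also accepts typed lambdas over Dummy, which a DataFrame does not.
// A sketch, assuming the ds defined above and that the SQL implicits are in
// scope (as in spark-shell); these lines are not part of the original gist.
ds.filter(d => d.Id > 3).show()
ds.map(d => d.Id * 2).show()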