@rberenguel
Created August 16, 2018 09:57
import org.apache.spark.sql.{DataFrame, Dataset}
import spark.implicits._ // assumes spark-shell, where `spark` is the SparkSession in scope

case class Foo(id: String, value: Int)
case class Bar(theId: String, value: Int)

val ds = List(Foo("Alice", 42), Foo("Bob", 43)).toDS

// Rename `id` to `theId`, once as an untyped DataFrame and once as a typed Dataset[Bar]
val renamedDF: DataFrame = ds.select($"id".as("theId"), $"value")
val renamedDS: Dataset[Bar] = renamedDF.toDF("theId", "value").as[Bar]

// Filter on the original column name `id`, which no longer appears in the renamed schema
renamedDF.where($"id" === "Alice").show
renamedDS.where($"id" === "Alice").show
@rberenguel (Author):

Output from some printlns added to PushDownPredicate to investigate this:

Pushing down the predicate on Filter/Project on fields ArrayBuffer(id#2 AS theId#5, value#3, id#2), true
a: theId#5, id#2
aliasmap: Map(theId#5 -> id#2), grandChild: LocalRelation [id#2, value#3]
Pushing down the predicate on Filter/Project on fields ArrayBuffer(theId#5 AS theId#8, value#3 AS value#9, id#2), true
a: theId#8, theId#5
a: value#9, value#3
aliasmap: Map(theId#8 -> theId#5, value#9 -> value#3), grandChild: Project [id#2 AS theId#5, value#3, id#2]
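
For comparison, a similar view of what the optimizer does can be obtained without patching Spark, by printing the plans directly. This is a minimal sketch, assuming the snippet above has already been run in the same spark-shell session:

// The analyzed plan should show the extra id#2 the analyzer adds to the Project
// so the filter can resolve, and the optimized plan should show the filter pushed
// below the renaming Projects, matching the PushDownPredicate output above.
val query = renamedDS.where($"id" === "Alice")
println(query.queryExecution.analyzed)
println(query.queryExecution.optimizedPlan)
query.explain(true) // prints the parsed, analyzed, optimized and physical plans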
