Created
October 26, 2017 11:03
Working on Spark DataFrame partitions. The following builds, for each partition, a Map of columns (key: column name, value: list of values) from the Rows that the DataFrame supplies to that partition.
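import org.apache.spark.sql.Row

// Capture the column names on the driver; this small list is shipped to each task.
val names = df.columns.toList
println(names)

// The explicit Iterator[Row] type avoids the overload ambiguity seen on Scala 2.12 and later.
df.foreachPartition { (rows: Iterator[Row]) =>
  // One entry per column, each starting with an empty list.
  val cols = scala.collection.mutable.Map[String, List[Any]]()
  names.foreach(col => cols(col) = List())
  // Append each row's values to the matching column (note: :+ on a List is O(n)).
  rows.foreach(row => names.zip(row.toSeq).foreach { case (name, value) =>
    cols(name) = cols(name) :+ value
  })
  // Runs on the executors, so the output lands in executor logs (or the console in local mode).
  println(cols)
}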
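
The per-partition logic above can also be pulled out into a plain function and tried without a cluster. The following is only a sketch under a few assumptions: `columnsOf` is a hypothetical helper name (not part of Spark or the original gist), rows are modelled as `Seq[Any]` (what `Row.toSeq` returns), and a `ListBuffer` per column is swapped in for the `:+` appends so that each append is constant time.

```scala
import scala.collection.mutable

// Hypothetical helper: transpose an iterator of rows into a
// column-name -> values map. Rows are plain Seq[Any] here.
def columnsOf(names: Seq[String], rows: Iterator[Seq[Any]]): Map[String, List[Any]] = {
  // One buffer per column, kept in column order.
  val buffers = names.toVector.map(_ => mutable.ListBuffer.empty[Any])
  rows.foreach { row =>
    // zip stops at the shorter side, so surplus values in a row are ignored.
    buffers.zip(row).foreach { case (buf, value) => buf += value }
  }
  // Duplicate column names would collapse into one key here.
  names.zip(buffers.map(_.toList)).toMap
}

// Small in-memory stand-in for one partition's rows.
val sample = Iterator(Seq[Any](1, "a"), Seq[Any](2, "b"))
val result = columnsOf(Seq("id", "label"), sample)
println(result)
```

Inside `foreachPartition`, the same helper could presumably be called as `columnsOf(names, rows.map(_.toSeq))`, which keeps the Spark-specific part down to a single line.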