Creates a streaming DataFrame with Structured Streaming and renames all of its columns
// Gets the schema from the CSV files
val crimesSchema = spark.read
  .option("inferSchema", true)
  .option("header", "true")
  .csv(dataPath)
  .schema

// Creates a Structured Stream over the CSV files
val crimes = spark.readStream
  .schema(crimesSchema)
  .csv(dataPath)

import spark.implicits._

// Map of (oldColumnName -> newColumnName)
val columnsToBottom = crimes.columns
  .map(col => (col, col.replace(" ", "_")))
  .toMap
// Rename the columns according to the columnsToBottom map
val renamedCrimes = columnsToBottom.foldLeft(crimes)((acc, pair) => acc.withColumnRenamed(pair._1, pair._2))
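
The renamed streaming DataFrame still needs a sink and a started query before anything actually runs. A minimal sketch, assuming the console sink is sufficient for inspecting the output (the sink, options, and variable name below are illustrative additions, not part of the original gist):

// Illustrative only: start a streaming query over the renamed DataFrame
// and print each micro-batch to the console.
val query = renamedCrimes.writeStream
  .format("console")
  .option("truncate", "false")
  .outputMode("append")
  .start()

// Block until the query is stopped or fails
query.awaitTermination()

As a design note, crimes.toDF(crimes.columns.map(_.replace(" ", "_")): _*) would rename every column in a single call; the foldLeft above keeps each rename explicit and works identically on batch and streaming DataFrames.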