Skip to content

Instantly share code, notes, and snippets.

@netanel246
Created November 20, 2017 20:42
Show Gist options
  • Save netanel246/08b1ad074f31ae87301b935b75d30c95 to your computer and use it in GitHub Desktop.
Save netanel246/08b1ad074f31ae87301b935b75d30c95 to your computer and use it in GitHub Desktop.
Creates Dataframe with Structured Streaming and renaming all the columns
// Gets the schema from the CSV files
val crimesSchema = spark.read
.option("inferSchema", true)
.option("header","true")
.csv(dataPath)
.schema
// Creates Structured Stream over the CSV files
val crimes = spark.readStream
.schema(crimesSchema)
.csv(dataPath)
import spark.implicits._
// Map of (oldColumnName -> newColumnName)
val columnsToBottom = crimes.columns
.map(col => (col, col.replace(" ", "_")))
.toMap
// Update columns names by the columnsToBottom
val renamedCrimes = columnsToBottom.foldLeft(crimes)((acc, pair) => acc.withColumnRenamed(pair._1, pair._2))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment