Rubén Villacreces rvilla87

rvilla87 / MongoDBsparkConnector.scala
Created August 6, 2017 16:28
Inserting documents in MongoDB with Spark Connector (Dataframe vs Spark Structured Streaming)
// DataFrame (supported) - read one file, no streaming
// Step 1: create the DataFrame source
val fileDF = spark
  .read                        // batch read, no streaming
  .csv("file/file1.csv")
  .selectExpr(
    "CAST(key as String)"      // more code with other casting...
  )
// Out [1]: fileDF: org.apache.spark.sql.package.DataFrame = [key: string, country: string ... 6 more fields]
// Step 2: insert the DataFrame into MongoDB
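// A minimal sketch of the batch write, assuming MongoDB Spark Connector 2.x and that
// spark.mongodb.output.uri (with database) is already set on the SparkSession.
// The collection name "myCollection" is a placeholder, not from the original gist.
import com.mongodb.spark.MongoSpark

MongoSpark.save(
  fileDF.write
    .option("collection", "myCollection") // placeholder collection; URI/database come from spark.mongodb.output.* config
    .mode("append")                       // append the CSV rows as new documents
)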