A "Best of the Best Practices" (BOBP) guide to developing in Python.
- "Build tools for others that you want to be built for you." - Kenneth Reitz
- "Simplicity is always better than functionality." - Pieter Hintjens
// DataFrame (supported) - read 1 file, no streaming
// Step 1, create the DataFrame source
val fileDF = spark
  .read // No streaming
  .csv("file/file1.csv")
  .selectExpr("CAST(key as String)", // more code with other casting...
  )
// Out [1]: fileDF: org.apache.spark.sql.package.DataFrame = [key: string, country: string ... 6 more fields]
// Step 2, insert DataFrame into MongoDB