@dennyglee
Created June 7, 2017 22:09
Spark Connector for Cosmos DB to Mongo container
//
// Spark Connector for Cosmos DB to Mongo container
// This gist provides an example of how to connect the Spark Connector for Cosmos DB to a Mongo container
//
// How to start spark-shell
// spark-shell --master yarn --jars /home/sshuser/jars/0.0.3c_1.12/azure-cosmosdb-spark-0.0.3-SNAPSHOT.jar,/home/sshuser/jars/0.0.3c_1.12/azure-documentdb-1.12.0-SNAPSHOT.jar
//
// Import Necessary Libraries
import org.joda.time._
import org.joda.time.format._
import com.microsoft.azure.cosmosdb.spark.schema._
import com.microsoft.azure.cosmosdb.spark._
import com.microsoft.azure.cosmosdb.spark.config.Config
// Configuration map (replace each $...$ placeholder with your own value)
val baseConfigMap = Map(
  "Endpoint" -> "https://$mongo-container$.documents.azure.com:443/",
  "Masterkey" -> "$secret key$",
  "Database" -> "$database name$",
  "Collection" -> "$collection name$",
  "SamplingRatio" -> "1.0",
  "schema_samplesize" -> "1000"
)
// Config
val baseConfig = Config(baseConfigMap)
// Create collection connection
val mongoCntr = spark.sqlContext.read.cosmosDB(baseConfig)
mongoCntr.createOrReplaceTempView("mongoContainer")
// Show the first rows of the collection
mongoCntr.show()
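// Query the temp view with Spark SQL
// A minimal sketch, assuming the "mongoContainer" temp view registered above and the
// `spark` session that spark-shell provides. The name `sample` and the LIMIT of 10
// are illustrative choices, not part of the original gist.
mongoCntr.printSchema() // inspect the schema inferred from the sampled documents
val sample = spark.sql("SELECT * FROM mongoContainer LIMIT 10")
sample.show()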