@mannharleen
Last active September 20, 2017 14:56
sparkSQL connect to JDBC (Extract data)
import org.apache.spark.{SparkContext,SparkConf}
import org.apache.spark.sql.hive.HiveContext
// Add the MySQL JDBC driver to build.sbt:
// libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.43"

// Initialization
val conf = new SparkConf().setAppName("xx").setMaster("local[2]")
val sc = new SparkContext(conf)
val hiveContext = new HiveContext(sc)
// A plain SQLContext works here as well; HiveContext is not required for JDBC reads.

// Connection properties passed to the JDBC driver
val prop = new java.util.Properties()
prop.setProperty("user", "root")
prop.setProperty("password", "XXX")
val df_mysql = hiveContext.read.jdbc("jdbc:mysql://localhost:3306/retail_db","products",prop)
// The PySpark equivalent:
// sqlContext.read.jdbc("jdbc:mysql://localhost:3306/retail_db", "products", properties={"user": "root", "password": "XXX"})

// Print the first 20 rows
df_mysql.show()
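
// A minimal sketch of the same read using the generic option-based reader,
// assuming the same URL, table, and credentials as above. The name
// df_mysql_opts is only illustrative. The "driver" option is optional; the
// value shown is the driver class shipped with mysql-connector-java 5.1.x.
val df_mysql_opts = hiveContext.read
  .format("jdbc")
  .option("url", "jdbc:mysql://localhost:3306/retail_db")
  .option("dbtable", "products")
  .option("user", "root")
  .option("password", "XXX")
  .option("driver", "com.mysql.jdbc.Driver")
  .load()
// Inspect the column names and types inferred from the MySQL table
df_mysql_opts.printSchema()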