This procedure is for Spark running in stand-alone deployment mode.
Please follow these instructions:
- Clone the Zeppelin project from the master branch on GitHub, for example:
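  A minimal sketch, assuming the main Apache Zeppelin repository on GitHub (adjust the URL if you build from a fork or mirror):

  ```sh
  # Clone the Zeppelin sources and switch into the project directory
  git clone https://github.com/apache/zeppelin.git
  cd zeppelin
  ```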
- If you use DSE 4.8 (thus Spark 1.4), edit the file `$ZEPPELIN_HOME/spark-dependencies/pom.xml`: duplicate the Maven profile `cassandra-spark-1.3` as `cassandra-spark-1.4` and update the spark-cassandra-connector version to 1.4.0, along the lines of the sketch below.
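  A rough sketch of what the duplicated profile might look like; the property names here are illustrative, so copy the real ones from the existing `cassandra-spark-1.3` profile in your checkout:

  ```xml
  <!-- Hypothetical sketch: mirror the existing cassandra-spark-1.3 profile
       and bump the versions for Spark 1.4 / connector 1.4.0 -->
  <profile>
    <id>cassandra-spark-1.4</id>
    <properties>
      <spark.version>1.4.1</spark.version>
      <cassandra.spark.version>1.4.0</cassandra.spark.version>
    </properties>
  </profile>
  ```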
- Build it with the following Maven command. Ensure you have Maven version 3.x or later, and use `-Pcassandra-spark-1.4` instead of `-Pcassandra-spark-1.3` if you are on DSE 4.8:

  ```sh
  mvn clean package -Pcassandra-spark-1.3 -Dhadoop.version=2.6.0 -Phadoop-2.6 -DskipTests
  ```
- Duplicate the file `$ZEPPELIN_HOME/conf/zeppelin-env.sh.template` to `$ZEPPELIN_HOME/conf/zeppelin-env.sh`
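  One way to do this from a shell:

  ```sh
  # Copy the template into place so Zeppelin picks it up at startup
  cp $ZEPPELIN_HOME/conf/zeppelin-env.sh.template $ZEPPELIN_HOME/conf/zeppelin-env.sh
  ```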
- Edit the file `$ZEPPELIN_HOME/conf/zeppelin-env.sh` and add `export MASTER=spark://<spark_DSE_master_IP>:7077`
- Start Zeppelin with `$ZEPPELIN_HOME/bin/zeppelin-daemon.sh start`
- Go to `http://localhost:8080` to open Zeppelin, then go to the Interpreter menu.
- Edit the Spark interpreter properties: set the `master` property to `spark://<spark_DSE_master_IP>:7077`, and add the new property `spark.cassandra.connection.host` pointing to a comma-separated list of IP addresses of your Cassandra cluster nodes, as in the example below. Save the change and confirm with Yes when the popup asks you to.
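  For example, the two properties might end up looking like this (the IP addresses are placeholders for your own nodes):

  ```
  master                           spark://10.0.0.1:7077
  spark.cassandra.connection.host  10.0.0.1,10.0.0.2,10.0.0.3
  ```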
Restart Zeppelin with
$ZEPPELIN_HOME/bin/zeppelin-daemon.sh restart
Now you can use Spark, Cassandra, and the Spark Cassandra connector. Do not forget to import the Scala implicits:

```scala
import org.apache.spark.SparkContext._
import com.datastax.spark.connector._
```
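To check that everything is wired up, you can run a short paragraph in a Zeppelin note. This is a minimal sketch; the keyspace `my_ks` and table `my_table` are placeholders for an existing table in your cluster:

```scala
// Read a Cassandra table as an RDD via the connector's implicit
// sc.cassandraTable enrichment, then count its rows.
val rdd = sc.cassandraTable("my_ks", "my_table") // placeholder keyspace/table
println(s"Rows in my_ks.my_table: ${rdd.count()}")
```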