Phoenix Zeppelin
1. Check out the source code from https://github.com/apache/incubator-zeppelin
2. Build it against Spark 1.3 and the Hadoop version you are running (2.6.0 in this example):
mvn clean package -Pspark-1.3 -Dhadoop.version=2.6.0 -Phadoop-2.6 -DskipTests
3. Put the following jars on the Spark classpath by placing them in $ZEPPELIN_HOME/interpreter/spark:
a. hbase-client.jar
b. hbase-protocol.jar
c. hbase-common.jar
d. phoenix-4.4.x-client-without-hbase.jar
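The jar placement in step 3 can be scripted. The following is a minimal sketch, not part of the original instructions: the function name copy_phoenix_jars and the source directory are hypothetical, and the glob patterns assume the jars carry version suffixes (for example hbase-client-1.1.1.jar) rather than the bare names listed above.

```shell
#!/bin/sh
# Copy the HBase/Phoenix jars from step 3 into Zeppelin's Spark interpreter directory.
# Usage (paths are assumptions, adjust to your install):
#   copy_phoenix_jars /path/to/jars "$ZEPPELIN_HOME/interpreter/spark"
copy_phoenix_jars() {
  src="$1"    # directory that currently holds the jars
  dest="$2"   # target directory, e.g. $ZEPPELIN_HOME/interpreter/spark
  missing=0
  mkdir -p "$dest" || return 1
  for pattern in 'hbase-client*.jar' 'hbase-protocol*.jar' \
                 'hbase-common*.jar' 'phoenix-4.4*-client-without-hbase.jar'; do
    found=0
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for jar in "$src"/$pattern; do
      [ -f "$jar" ] || continue   # skip the literal pattern when nothing matches
      cp "$jar" "$dest"/ && found=1
    done
    if [ "$found" -eq 0 ]; then
      echo "missing: $pattern" >&2
      missing=1
    fi
  done
  # Returns 0 when every pattern matched at least one jar, 1 otherwise.
  return "$missing"
}
```

The function returns non-zero and names the unmatched pattern on stderr if any of the four jars is absent, so a missing jar is caught before Zeppelin is started rather than as a class-loading error in the notebook.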
4. Start Zeppelin
sh $ZEPPELIN_HOME/bin/zeppelin-daemon.sh start
5. Create a Phoenix table and populate it with data. The relative paths below assume the command is run from the Phoenix bin directory.
psql.py localhost ../examples/web_stat.sql ../examples/web_stat.csv ../examples/web_stat_queries.sql
6. Open Zeppelin, create a new Note, and run the following Spark script in it.
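// Zeppelin's Spark interpreter already provides `sc` (the SparkContext).
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._
val sqlContext = new SQLContext(sc)
// Replace zookeeper_url with the ZooKeeper quorum of your HBase cluster (host:port).
val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "WEB_STAT", "zkUrl" -> "zookeeper_url"))
// Print the HOST column of the loaded table.
df.select(df("HOST")).show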
For further reading, refer to https://phoenix.apache.org/phoenix_spark.html