-
First, install Java, Scala and Spark on Ubuntu
-
Install Java
sudo apt-add-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
java -version
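# Optional: confirm where the JDK landed; this path is reused later for JAVA_HOME in spark-env.sh
# (typically something under /usr/lib/jvm/java-8-oracle for this installer)
readlink -f $(which java)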
-
Install Scala
wget http://downloads.lightbend.com/scala/2.12.0/scala-2.12.0.tgz
sudo mkdir /usr/local/src/scala
sudo tar -xvf scala-2.12.0.tgz -C /usr/local/src/scala/
nano .bashrc
# Add the following two lines to .bashrc:
export SCALA_HOME=/usr/local/src/scala/scala-2.12.0
export PATH=$SCALA_HOME/bin:$PATH
source .bashrc
scala -version
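# Optional sanity check beyond scala -version: evaluate a one-liner with the scala runner
scala -e 'println("Scala " + util.Properties.versionNumberString + " is working")'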
-
Install Git
sudo apt-get install git
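# Verify the install
git --version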
-
Install sbt
wget https://bintray.com/artifact/download/sbt/debian/sbt-0.13.6.deb
sudo dpkg -i sbt-0.13.6.deb
sudo apt-get update
sudo apt-get install sbt
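# Verify the sbt launcher (the first run downloads sbt's own dependencies, so it can take a while)
sbt sbtVersion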
-
Install Spark
# Reference: https://www.santoshsrinivas.com/installing-apache-spark-on-ubuntu-16-04/
wget http://d3kbcqa49mib13.cloudfront.net/spark-2.0.2-bin-hadoop2.7.tgz
tar -xvf spark-2.0.2-bin-hadoop2.7.tgz
mv spark-2.0.2-bin-hadoop2.7/ spark
cd spark/conf/
cp spark-env.sh.template spark-env.sh
nano spark-env.sh
-
Add the following lines to spark-env.sh
JAVA_HOME=/usr/lib/jvm/java-8-oracle
SPARK_WORKER_MEMORY=4g
PYSPARK_PYTHON=/home/<username>/anaconda3/bin/python
source spark-env.sh
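The next step launches pyspark from the home directory, so Spark's bin directory needs to be on the PATH.
A minimal sketch, assuming the extracted distribution was moved to ~/spark as in the step above; add these lines to .bashrc:
export SPARK_HOME=~/spark
export PATH=$SPARK_HOME/bin:$PATH
source ~/.bashrc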
- Run pyspark
username@machine:~$ pyspark
Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul 2 2016, 17:53:06)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/12/06 12:05:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/12/06 12:05:55 WARN Utils: Your hostname, username resolves to a loopback address: xxx.x.x.x; using xxx.x.x.xx instead (on interface enp0s31f6)
16/12/06 12:05:55 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.0.2
      /_/
Using Python version 3.5.2 (default, Jul 2 2016 17:53:06)
SparkSession available as 'spark'.
>>> exit()
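- Optional: smoke-test Spark outside the Python shell
# run-example ships with the Spark distribution; SparkPi should end with a line like "Pi is roughly 3.14..."
$SPARK_HOME/bin/run-example SparkPi 10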
- Install the Zeppelin notebook
sudo apt-get update
sudo apt-get install npm
sudo apt install nodejs-legacy
sudo apt-get install libfontconfig
- Install Maven
wget http://www-eu.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
sudo tar -zxf apache-maven-3.3.9-bin.tar.gz -C /usr/local/
sudo ln -s /usr/local/apache-maven-3.3.9/bin/mvn /usr/local/bin/mvn
- Check for the versions
node --version
mvn --version
- Clone zeppelin repo
git clone https://github.com/apache/zeppelin.git
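# Optional: the build below uses master; to pin to a release instead, list the repo's release branches first
git -C zeppelin branch -r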
- Add the lines below to .bashrc
export M2_HOME=/usr/local/apache-maven-3.3.9
export PATH=${M2_HOME}/bin:${PATH}
source .bashrc
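# Optional: confirm the shell now resolves mvn from the Maven home that was just put on the PATH
command -v mvn   # expected to point under /usr/local/apache-maven-3.3.9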
- Install 'bower'
sudo npm install -g bower
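# Verify the global install
bower --version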
- Maven build for running locally
# Run from inside the cloned repo
cd zeppelin
mvn clean install -DskipTests
# The build will likely fail at the "Zeppelin: web Application" module ([FAILURE])
# Do the steps below to rectify it
cd zeppelin-web
bower install
# A long list of packages scrolls by on the screen
# At the end, bower stops at a version-conflict prompt ("Answer:") and waits for input
# Type the number of the option marked "... needed for zeppelin-web", prefixed with "!" (e.g. !2)
# The "!" persists that angular resolution so the maven build can reuse it
# Source: https://stackoverflow.com/questions/35014273/failed-to-run-task-bower-allow-root-install-failed/37527480#37527480
# Go back to the repo root and run the build again
cd ..
mvn clean install -DskipTests
# A successful build ends with output like this:
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ zeppelin-distribution ---
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Zeppelin ........................................... SUCCESS [ 2.173 s]
[INFO] Zeppelin: Interpreter .............................. SUCCESS [ 5.891 s]
[INFO] Zeppelin: Zengine .................................. SUCCESS [ 3.633 s]
[INFO] Zeppelin: Display system apis ...................... SUCCESS [ 7.594 s]
[INFO] Zeppelin: Spark dependencies ....................... SUCCESS [ 20.124 s]
[INFO] Zeppelin: Spark .................................... SUCCESS [ 10.737 s]
[INFO] Zeppelin: Markdown interpreter ..................... SUCCESS [ 0.249 s]
[INFO] Zeppelin: Angular interpreter ...................... SUCCESS [ 0.170 s]
[INFO] Zeppelin: Shell interpreter ........................ SUCCESS [ 0.270 s]
[INFO] Zeppelin: Livy interpreter ......................... SUCCESS [ 0.429 s]
[INFO] Zeppelin: HBase interpreter ........................ SUCCESS [ 1.928 s]
[INFO] Zeppelin: PostgreSQL interpreter ................... SUCCESS [ 0.268 s]
[INFO] Zeppelin: JDBC interpreter ......................... SUCCESS [ 0.277 s]
[INFO] Zeppelin: File System Interpreters ................. SUCCESS [ 0.445 s]
[INFO] Zeppelin: Flink .................................... SUCCESS [ 3.841 s]
[INFO] Zeppelin: Apache Ignite interpreter ................ SUCCESS [ 0.496 s]
[INFO] Zeppelin: Kylin interpreter ........................ SUCCESS [ 0.229 s]
[INFO] Zeppelin: Python interpreter ....................... SUCCESS [ 0.249 s]
[INFO] Zeppelin: Lens interpreter ......................... SUCCESS [ 1.866 s]
[INFO] Zeppelin: Apache Cassandra interpreter ............. SUCCESS [ 20.993 s]
[INFO] Zeppelin: Elasticsearch interpreter ................ SUCCESS [ 1.514 s]
[INFO] Zeppelin: BigQuery interpreter ..................... SUCCESS [ 0.410 s]
[INFO] Zeppelin: Alluxio interpreter ...................... SUCCESS [ 1.497 s]
[INFO] Zeppelin: web Application .......................... SUCCESS [01:07 min]
[INFO] Zeppelin: Server ................................... SUCCESS [ 35.707 s]
[INFO] Zeppelin: Packaging distribution ................... SUCCESS [ 1.318 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:10 min
[INFO] Finished at: 2016-12-06T16:47:24+05:30
[INFO] Final Memory: 225M/3228M
[INFO] ------------------------------------------------------------------------
- Start the Zeppelin daemon (from the zeppelin directory)
$ bin/zeppelin-daemon.sh start
Zeppelin start [ OK ]
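# The notebook UI is served at http://localhost:8080 by default; the same script stops the daemon when needed
bin/zeppelin-daemon.sh stop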