@jacksonpradolima
Created April 4, 2017 19:15
install_spark_scala_zeppelin
#check java installation
java -version
#get path
echo $JAVA_HOME
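#install java (oracle-java8-installer is not in the default Ubuntu repositories; it came from the webupd8team/java PPA)
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer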
#install scala
sudo apt-get install scala
#install python 3 (on most linux distributions of this period the default python is version 2.7)
sudo apt-get install python3
#install spark
wget http://d3kbcqa49mib13.cloudfront.net/spark-2.1.0-bin-hadoop2.7.tgz
tar -xvf spark-2.1.0-bin-hadoop2.7.tgz
#move spark to a common folder
sudo mv spark-2.1.0-bin-hadoop2.7 /opt/spark-2.1.0-bin-hadoop2.7
#spark conf: add this line to conf/spark-env.sh
SPARK_MASTER_WEBUI_PORT=8888
#spark workers: conf/slaves lists one worker host per line
localhost
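# The two spark conf files above can also be written from the shell. A minimal sketch,
# assuming the conf directory is passed as the first argument; write_spark_conf is a
# hypothetical helper name, not part of spark.

```shell
# write_spark_conf (hypothetical helper): write the standalone settings into a spark conf directory
write_spark_conf() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1
    # spark-env.sh: move the master web UI off its default port (8080)
    printf 'SPARK_MASTER_WEBUI_PORT=8888\n' >> "$conf_dir/spark-env.sh"
    # slaves: one worker host per line; here a single local worker
    printf 'localhost\n' > "$conf_dir/slaves"
}

# Example call (the path assumes the /opt location used above; may need sudo there):
# write_spark_conf /opt/spark-2.1.0-bin-hadoop2.7/conf
```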
#zeppelin conf (environment exports, e.g. in conf/zeppelin-env.sh)
#common paths
export JAVA_HOME="/usr/lib/jvm/java-8-oracle/"
#use /home/<user>/spark-2.1.0-bin-hadoop2.7 instead if spark was left in the home folder
export SPARK_HOME="/opt/spark-2.1.0-bin-hadoop2.7"
export PYSPARK_PYTHON="/usr/bin/python2.7"
#the py4j version must match the zip shipped in $SPARK_HOME/python/lib (0.10.4 for spark 2.1.0)
export PYTHONPATH="$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH"
export PYTHONPATH="$SPARK_HOME/python:$SPARK_HOME/python/build:$PYTHONPATH"
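# The py4j zip name changes between spark releases, so a hard-coded version can go stale.
# A minimal sketch that looks the zip up instead of naming it; find_py4j_zip is a
# hypothetical helper name, and it assumes the zip sits in python/lib under the spark home.

```shell
# find_py4j_zip (hypothetical helper): print the first py4j source zip under a spark home
find_py4j_zip() {
    for zip in "$1"/python/lib/py4j-*-src.zip; do
        # the glob stays literal when nothing matches, so check that the file exists
        if [ -e "$zip" ]; then
            printf '%s\n' "$zip"
            return 0
        fi
    done
    return 1
}

# Example use, assuming SPARK_HOME is already exported:
# export PYTHONPATH="$(find_py4j_zip "$SPARK_HOME"):$SPARK_HOME/python:$PYTHONPATH"
```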
#in conf/zeppelin-site.xml, change the property zeppelin.server.port to another value: the default port 8080 is very common, so it can conflict with other services later
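# The port change can be scripted. A rough sketch, assuming GNU sed and the usual layout
# of zeppelin-site.xml, where the <value> line directly follows the <name> line;
# set_zeppelin_port is a hypothetical helper name and 8180 is only an example port.

```shell
# set_zeppelin_port (hypothetical helper): rewrite the value of zeppelin.server.port in place
set_zeppelin_port() {
    site_file="$1"
    port="$2"
    # find the <name> line, step to the next line, and replace the <value> found there
    sed -i "/<name>zeppelin\.server\.port<\/name>/{n;s|<value>[^<]*</value>|<value>${port}</value>|;}" "$site_file"
}

# Example call, run from the zeppelin folder:
# set_zeppelin_port conf/zeppelin-site.xml 8180
```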
# run zeppelin
bin/zeppelin-daemon.sh start
#install some python packages (matplotlib, geopy, folium, pandas, seaborn, scikit-learn)
#Python 3.x (repeat for each package):
sudo python3 -m pip install matplotlib
#Python 2.7 (repeat for each package):
sudo python -m pip install folium