@avolochenko
Last active September 22, 2015 05:26
ipython spark 1.4.1 bootstrap (OSX)
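# Put PySpark on sys.path, then run the PySpark shell to create the SparkContext.
import os
import sys

spark_home = os.environ.get('SPARK_HOME')
if not spark_home:
    raise ValueError('SPARK_HOME environment variable is not set')

# Make the pyspark package and the bundled py4j importable.
sys.path.insert(0, os.path.join(spark_home, 'python'))
sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.8.2.1-src.zip'))

# execfile() exists only on Python 2; exec(compile(...)) works on Python 2 and 3.
shell_path = os.path.join(spark_home, 'python/pyspark/shell.py')
exec(compile(open(shell_path).read(), shell_path, 'exec'))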
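The py4j path above is hardcoded to version 0.8.2.1, so the script breaks as soon as the Spark install bundles a different py4j. A minimal sketch of one way around that is shown here, assuming the zip always sits in `python/lib` and is named `py4j-<version>-src.zip`; `find_py4j_zip` is a hypothetical helper, not part of Spark or of the original gist.

```python
import glob
import os


def find_py4j_zip(spark_home):
    """Return the path of the py4j source zip found under spark_home.

    Hypothetical helper: assumes the zip lives in python/lib and is named
    py4j-<version>-src.zip, as in the Spark 1.4.1 layout used above.
    """
    pattern = os.path.join(spark_home, 'python', 'lib', 'py4j-*-src.zip')
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise IOError('no py4j zip found matching ' + pattern)
    # If several zips are present, take the last one in sorted order.
    return matches[-1]
```

With this in place, the second `sys.path.insert` call could take `find_py4j_zip(spark_home)` instead of the fixed file name.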
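# Shell environment for Spark (set these before starting IPython)
export SPARK_HOME='/Applications/spark/spark-1.4.1-bin-hadoop2.4'
export PATH="$SPARK_HOME/bin:$PATH"
# Run locally with 2 worker threads; the trailing pyspark-shell is required
export PYSPARK_SUBMIT_ARGS='--master local[2] pyspark-shell'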
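A misconfigured environment usually shows up only as an opaque failure when the notebook kernel starts. The sketch below checks the two variables exported above before Spark is launched. It assumes that `PYSPARK_SUBMIT_ARGS`, when set, must end with `pyspark-shell` (which matches the behaviour of Spark 1.4 as far as this setup relies on it); `check_spark_env` is a hypothetical helper name.

```python
import os
import shlex


def check_spark_env(env=None):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    env = os.environ if env is None else env
    problems = []

    spark_home = env.get('SPARK_HOME')
    if not spark_home:
        problems.append('SPARK_HOME is not set')
    elif not os.path.isdir(spark_home):
        problems.append('SPARK_HOME is not a directory: ' + spark_home)

    # Assumption: an unset variable is fine, but a set one must end with pyspark-shell.
    args = shlex.split(env.get('PYSPARK_SUBMIT_ARGS', ''))
    if args and args[-1] != 'pyspark-shell':
        problems.append('PYSPARK_SUBMIT_ARGS should end with pyspark-shell')
    return problems
```

Calling `check_spark_env()` with no arguments inspects the current process environment; passing a plain dict makes it easy to try out other settings.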
# Install IPython with notebook support
pip install "ipython[notebook]"
# Download Spark 1.4.1 (prebuilt for Hadoop 2.4)
SPARK_FILE=spark-1.4.1-bin-hadoop2.4.tgz
wget -O "/tmp/$SPARK_FILE" "http://d3kbcqa49mib13.cloudfront.net/$SPARK_FILE"
# Create the install directory (-p: no error if it already exists) and unpack
mkdir -p /Applications/spark
tar zxvf "/tmp/$SPARK_FILE" -C /Applications/spark
# Create an IPython profile named pyspark
ipython profile create pyspark
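# Notebook server settings for the pyspark profile
c = get_config()
c.NotebookApp.ip = '*'              # listen on all interfaces, not just localhost
c.NotebookApp.open_browser = False  # do not open a browser on startup
c.NotebookApp.port = 8880           # serve the notebook on port 8880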