- Start the Ubuntu cluster
- Run the following commands to install Java:
sudo apt-get update
sudo apt-get install openjdk-7-jdk
- Download Anaconda
- Download PhantomJS
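For example, on the instance (installer URLs and versions below are illustrative for this era; check the Anaconda and PhantomJS download pages for current ones):

```shell
# Download and run the Anaconda installer (version/URL illustrative)
wget http://repo.continuum.io/archive/Anaconda-2.1.0-Linux-x86_64.sh
bash Anaconda-2.1.0-Linux-x86_64.sh

# Download and unpack a PhantomJS binary build (version/URL illustrative)
wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.8-linux-x86_64.tar.bz2
tar xjf phantomjs-1.9.8-linux-x86_64.tar.bz2
```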
- Move spark-1.1.1 to the cluster (we'll use 1.2.0 later if it's ready)
scp -i <key_pair> -r spark-1.1.1-bin-hadoop1 ubuntu@<amazon_ip>:~/.
- Copy SpookyStuff
scp -i <key_pair> -r spookystuff ubuntu@<amazon_ip>:~/.
- Then adjust
.bash_profile
(add JAVA_HOME, the PhantomJS path, Anaconda, and SPARK_HOME) and test SpookyStuff
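The .bash_profile additions might look like the following sketch; the installation paths are assumptions and should be adjusted to wherever Java, PhantomJS, Anaconda, and Spark actually live on the instance:

```shell
# ~/.bash_profile additions (paths are illustrative -- adjust to your install locations)
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export SPARK_HOME=$HOME/spark-1.1.1-bin-hadoop1

# Put the PhantomJS binary and Anaconda's python first on the PATH
export PATH=$HOME/phantomjs/bin:$HOME/anaconda/bin:$JAVA_HOME/bin:$PATH
```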
###After testing SpookyStuff, we will then try ISpark and ISpooky
- Move ISpark, ISpooky to cluster
scp -i <key_pair> -r ISpark ubuntu@<amazon_ip>:~/.
scp -i <key_pair> -r ISpooky ubuntu@<amazon_ip>:~/.
- Create an IPython profile named spooky:
ipython profile create spooky
- In ipython_config.py (under ~/.ipython/profile_spooky/), customize as follows:
# Configuration file for ipython.
import os
c = get_config()

SPARK_HOME = os.environ['SPARK_HOME']
# the above line can be replaced with: SPARK_HOME = '${INSERT_INSTALLATION_DIR_OF_SPARK}'
MASTER = 'local[4]'
c.KernelManager.kernel_cmd = [SPARK_HOME + "/bin/spark-submit",
    "--master", MASTER,
    "--class", "org.tribbloid.ispooky.SpookyMain",
    "--executor-memory", "2G",
    "--jars", "/home/ubuntu/spookystuff/shell/target/scala-2.10/spookystuff-shell-assembly-0.3.0-SNAPSHOT.jar",
    "/home/ubuntu/ISpooky/target/ispooky-assembly-0.1.0-SNAPSHOT.jar",  # application jar; later args are passed to it
    "--profile", "{connection_file}",
    "--interp", "Spooky",
    "--parent"]
c.NotebookApp.ip = '*'  # only add this line if you want the IPython notebook to be open to the public
c.NotebookApp.open_browser = False
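With the profile configured, the notebook server can then be started with the standard IPython invocation (it listens on the default port 8888 unless configured otherwise):

```shell
# Start the notebook server using the spooky profile
ipython notebook --profile=spooky
```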
###NOTE for clusters
- Check
.bashrc
and .bash_profile on all nodes
- Copy
phantomjs
to all nodes - or save the binary on HDFS so that every node can access it
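A minimal sketch of the HDFS approach, assuming HDFS is already running and phantomjs was unpacked under ~/phantomjs (both paths are illustrative):

```shell
# Put the phantomjs binary on HDFS so every node can fetch it
hadoop fs -mkdir /tools
hadoop fs -put ~/phantomjs/bin/phantomjs /tools/phantomjs

# On each worker node, pull it down and make it executable
hadoop fs -get /tools/phantomjs /usr/local/bin/phantomjs
chmod +x /usr/local/bin/phantomjs
```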