Create an environment with conda (from Anaconda or Miniconda), which manages Python environments
conda create -n spark python=3
Activate the environment (on conda 4.4 and later, conda activate spark is the preferred form)
source activate spark
Install ipython
conda install ipython
Now install pyspark
pip install pyspark
Fire up an ipython terminal
ipython
Check that the package imports by running import pyspark.
To confirm it loaded, type pyspark. and press Tab; IPython's completion will list the package's modules and attributes
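For a check that does not depend on tab completion, the snippet below is a minimal sketch. The helper names pyspark_version and local_smoke_test are hypothetical, made up for illustration. The first only reports whether pyspark is importable in the active environment. The second starts a throwaway local session and counts a five-row range, and it assumes a Java runtime is available, which the steps above do not install.

```python
import importlib.util


def pyspark_version():
    """Return the installed pyspark version string, or None if it is not importable."""
    if importlib.util.find_spec("pyspark") is None:
        return None
    import pyspark
    return pyspark.__version__


def local_smoke_test():
    """Start a local Spark session, run one tiny job, and return the row count.

    Assumption: a Java runtime is installed; without it the session will not start.
    """
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")          # single local worker thread
        .appName("install-check")    # hypothetical application name
        .getOrCreate()
    )
    try:
        return spark.range(5).count()  # counts the rows 0..4
    finally:
        spark.stop()


# Prints a version string if pyspark is installed, otherwise None.
print(pyspark_version())
```

If the version prints and Java is present, calling local_smoke_test() from the same ipython session should return the number of rows in the range.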
In R, install the sparklyr package from CRAN and load it
install.packages("sparklyr")
library(sparklyr)
The sparklyr package has utilities to manage the Spark installation for you
spark_install(version = "2.1.0")
Then open a connection to a local Spark instance
sc <- spark_connect(master = "local")
Running class(sc) should then return:
[1] "spark_connection" "spark_shell_connection" "DBIConnection"