Here is a suggestion on how to run PySpark with an IPython notebook from an Amazon EC2 instance:
IPYTHON_OPTS="notebook --ip=* --no-browser" ~/spark-1.2.0-bin-hadoop1/bin/pyspark --master local[4] --driver-memory 4g --executor-memory 4g
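Since `--no-browser` is passed and the notebook runs on a remote EC2 host, you still need a way to reach it from your local machine. One common approach is an SSH tunnel; a minimal sketch, assuming the notebook listens on its default port 8888 and that `my-key.pem` and `ec2-user@ec2-host` are placeholders for your own key file and instance address:

```shell
# Forward local port 8888 to port 8888 on the EC2 instance.
# -N: no remote command, just forward the port.
ssh -i my-key.pem -N -L 8888:localhost:8888 ec2-user@ec2-host
```

With the tunnel open, the notebook is available at http://localhost:8888 in your local browser. (Alternatively, since `--ip=*` makes the notebook listen on all interfaces, you can open port 8888 in the instance's security group and connect directly, but the tunnel avoids exposing the notebook publicly.)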
For help on the available options, we can run:
spark-1.2.0-bin-hadoop1/bin/pyspark --help
Arbitrary configuration properties can be passed with --conf PROP=VALUE. For example, to allow larger results to be collected back to the driver, we can raise spark.driver.maxResultSize:
IPYTHON_OPTS="notebook --ip=* --no-browser" ~/spark-1.2.0-bin-hadoop1/bin/pyspark --master local[4] --driver-memory 4g --executor-memory 4g --conf spark.driver.maxResultSize=4g
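One detail worth knowing: Spark interprets size values without a unit suffix as bytes, so a bare `4096` for spark.driver.maxResultSize would mean 4096 bytes, not 4 GB; a suffix such as `4g` is usually what's intended. A simplified Python sketch of that parsing logic (a re-implementation for illustration, not Spark's actual code):

```python
def memory_string_to_mb(s):
    """Simplified sketch of how Spark parses memory size strings.

    A bare number is treated as bytes, so "4096" means 4096 bytes
    (which rounds down to 0 MB), while "4g" means 4096 MB.
    """
    s = s.lower()
    units = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}
    if s[-1] in units:
        n_bytes = int(s[:-1]) * units[s[-1]]
    else:
        n_bytes = int(s)  # no suffix: interpreted as plain bytes
    return n_bytes // (1024**2)

print(memory_string_to_mb("4g"))    # 4096
print(memory_string_to_mb("4096"))  # 0
```

This is why the `--conf` value above uses an explicit unit rather than a bare number.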