### Create an EMR cluster on the latest Spark version --> this should be fine as long as you don't need Hive or Parquet support
aws emr create-cluster --name SparkCluster --ami-version 3.2.1 \
  --instance-type m3.xlarge --instance-count 3 \
  --ec2-attributes KeyName=<your-key-name> \
  --applications Name=Hive \
  --bootstrap-actions Path=s3://support.elasticmapreduce/spark/install-spark,Args=\["-v1.2.0.a"\]
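### The create-cluster call returns a cluster id (j-XXXXXXXXXXXXX). The two lines below are a sketch, not part of the original notes: wait for the cluster to come up, then get a shell on the master. aws emr wait and aws emr ssh are standard AWS CLI subcommands; swap in your own cluster id and .pem file.
aws emr wait cluster-running --cluster-id j-XXXXXXXXXXXXX
aws emr ssh --cluster-id j-XXXXXXXXXXXXX --key-pair-file ~/your-key.pem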
### OK, now you have a cluster. Do some slicing and dicing through pyspark; client mode is fine, but be sure to start it inside a screen session. You will also need to play with the executor parameters to match your cluster size.
./spark/bin/pyspark --master yarn --deploy-mode client --num-executors 12 --executor-memory 2g --executor-cores 4
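### Once the shell is up, something like the snippet below is a quick sanity check. This is a sketch, not from the original notes: the s3://your-bucket/some-logs/ path is a placeholder, and sc is the SparkContext the pyspark shell creates for you.
lines = sc.textFile("s3://your-bucket/some-logs/*")
errors = lines.filter(lambda line: "ERROR" in line)   # grep-style slice (hypothetical filter)
print(errors.count())
counts = (lines.flatMap(lambda line: line.split())    # classic word count to confirm the executors are doing work
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))
print(counts.take(10))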