### Tested with:
- Spark 2.0.0 pre-built for Hadoop 2.7
- Mac OS X 10.11
- Python 3.5.2

Use S3 within PySpark with minimal hassle.

    # Install R + RStudio on Ubuntu 14.04
    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
    # Ubuntu 12.04: precise
    # Ubuntu 14.04: trusty
    # Ubuntu 16.04: xenial
    # Basic format of next line: deb https://<my.favorite.cran.mirror>/bin/linux/ubuntu <enter your ubuntu version>/
    sudo add-apt-repository 'deb https://ftp.ussg.iu.edu/CRAN/bin/linux/ubuntu trusty/'
    sudo apt-get update
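
The repository line follows a fixed pattern (mirror URL, then `/bin/linux/ubuntu`, then the release codename and a trailing slash), so it can be generated instead of typed by hand. The following is a minimal sketch; the helper name `cran_repo_line` is hypothetical and not part of the original script, and it assumes the mirror serves packages under `bin/linux/ubuntu` as the mirror above does.

```shell
# Build the apt source line for a CRAN mirror and an Ubuntu codename.
# $1: mirror base URL (assumed to host bin/linux/ubuntu)
# $2: Ubuntu codename, e.g. precise, trusty, xenial
cran_repo_line() {
    # ${1%/} drops one trailing slash from the mirror URL, if present
    printf 'deb %s/bin/linux/ubuntu %s/\n' "${1%/}" "$2"
}

# Example with the mirror used above and Ubuntu 14.04 (trusty):
cran_repo_line "https://ftp.ussg.iu.edu/CRAN" trusty
```

On a live Ubuntu system the codename can usually be read with `lsb_release -cs`, so the output of `cran_repo_line "https://ftp.ussg.iu.edu/CRAN" "$(lsb_release -cs)"` could be passed to `sudo add-apt-repository` in place of the hard-coded string.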

    $ java -version
    java version "1.7.0_171"
    OpenJDK Runtime Environment (rhel-2.6.13.0.el7_4-x86_64 u171-b01)
    OpenJDK 64-Bit Server VM (build 24.171-b01, mixed mode)

Setting up Scala requires JDK 8 or later.
If the installed JDK/OpenJDK version is lower than 1.8 (the output above reports 1.7.0_171), follow the steps below.
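
The version check described here can also be scripted. The following is a minimal sketch that parses the first line of `java -version` output; the function names `java_major` and `needs_jdk_upgrade` are hypothetical, and the sketch assumes the version appears in double quotes after the word `version`, as in the output shown above.

```shell
# Extract the major Java version from the first line of `java -version`.
# Handles the legacy scheme (1.7.0_171 -> 7) and the newer one (11.0.2 -> 11).
java_major() {
    # Pull out the quoted version string, e.g. 1.7.0_171
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;  # legacy: second field
        *)   printf '%s\n' "$ver" | cut -d. -f1 ;;  # newer: first field
    esac
}

# Succeeds (exit status 0) when the reported JDK is older than 8.
# Returns 2 when no numeric version could be parsed.
needs_jdk_upgrade() {
    major=$(java_major "$1")
    case "$major" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$major" -lt 8 ]
}

# Check the locally installed Java, if any.
# `java -version` writes to stderr, hence the 2>&1.
if command -v java >/dev/null 2>&1; then
    first_line=$(java -version 2>&1 | head -n 1)
    if needs_jdk_upgrade "$first_line"; then
        echo "JDK older than 8 detected: upgrade before setting up Scala"
    fi
fi
```

For the output shown above, `java_major 'java version "1.7.0_171"'` prints `7`, so `needs_jdk_upgrade` succeeds and the upgrade steps apply.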