Vikas Chitturi - Open Source Contributor absognety

absognety / setup-notes.md
Last active June 3, 2020 04:46 — forked from eddies/setup-notes.md
Spark 2.0.0 and Hadoop 2.7 with s3a setup

Standalone Spark 2.0.0 with s3

Tested with:

  • Spark 2.0.0 pre-built for Hadoop 2.7
  • Mac OS X 10.11
  • Python 3.5.2

Goal

Use S3 (through the s3a connector) from within PySpark with minimal hassle.

# Install R + RStudio on Ubuntu 14.04
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
# Ubuntu 12.04: precise
# Ubuntu 14.04: trusty
# Ubuntu 16.04: xenial
# Basic format of the next line: deb https://<my.favorite.cran.mirror>/bin/linux/ubuntu <your ubuntu codename>/
sudo add-apt-repository 'deb https://ftp.ussg.iu.edu/CRAN/bin/linux/ubuntu trusty/'
sudo apt-get update
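
The repository line above hard-codes the codename `trusty`. As a minimal sketch, assuming `lsb_release` is installed, the line could instead be built from the codename of the running system. The helper name `cran_deb_line` is hypothetical and only used for illustration; the mirror is the one from the commands above.

```shell
#!/bin/sh
# cran_deb_line is a hypothetical helper (not part of any package).
# It prints the apt source line for a given CRAN mirror and Ubuntu codename.
cran_deb_line() {
    mirror="$1"    # e.g. ftp.ussg.iu.edu/CRAN
    codename="$2"  # e.g. precise, trusty, xenial
    printf 'deb https://%s/bin/linux/ubuntu %s/\n' "$mirror" "$codename"
}

# Use the codename reported by lsb_release when it is available;
# otherwise fall back to trusty (Ubuntu 14.04), as in the notes above.
if command -v lsb_release >/dev/null 2>&1; then
    codename="$(lsb_release -cs)"
else
    codename="trusty"
fi

# Print the line that would be passed to add-apt-repository.
cran_deb_line "ftp.ussg.iu.edu/CRAN" "$codename"
```

The printed line can then be passed on, for example `sudo add-apt-repository "$(cran_deb_line ftp.ussg.iu.edu/CRAN trusty)"`. Whether the chosen mirror actually serves packages for a given codename is not checked here.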