@camallen
Forked from cosmincatalin/ readme.md
Created June 21, 2017 09:49
AWS EMR bootstrap to install R packages from CRAN

This bootstrap is useful if you want to deploy SparkR applications that run arbitrary code on the EMR cluster's workers. The R code will need to have its dependencies already installed on each of the workers, and will fail otherwise. This is the case if you use functions such as gapply or dapply.

How to use the bootstrap

  1. First, download the gist to a file and upload it to an S3 bucket of your choice.
  2. In the AWS EMR Console, create a cluster and choose the advanced options.
  3. In Step 3 you can configure your bootstrap actions. Select Custom action and choose Configure and add.
    • For the Name, enter something like Install CRAN dependencies
    • For the Script location, point to where you uploaded the gist (e.g. s3://my-bucket/emr/bootstrap/install-cran-packages.sh)
    • As Optional arguments you can add the following:
      • --packages - the CRAN packages that you depend on, separated by semicolons. E.g. --packages magrittr;dplyr;tidyr
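The same setup can probably be scripted with the AWS CLI instead of the console. The following is only a sketch under assumptions: the bucket name, object key, cluster name, release label, instance type and instance count are placeholders that do not come from this gist, and `bootstrap_action_json` is a hypothetical helper. It passes the bootstrap action as JSON so that the semicolon-separated package list stays a single argument. By default it only prints the JSON; set `RUN_AWS=1` to call the AWS CLI.

```shell
#!/bin/bash
# Placeholder values (assumptions): adjust them to your own account and cluster.
BUCKET="my-bucket"
SCRIPT_KEY="emr/bootstrap/install-cran-packages.sh"
PACKAGES="magrittr;dplyr;tidyr"

# Hypothetical helper: prints the bootstrap action in the JSON form accepted
# by --bootstrap-actions. Arguments: bucket, object key, package list.
bootstrap_action_json() {
  printf '[{"Name":"Install CRAN dependencies","Path":"s3://%s/%s","Args":["--packages","%s"]}]' \
    "$1" "$2" "$3"
}

ACTION_JSON="$(bootstrap_action_json "$BUCKET" "$SCRIPT_KEY" "$PACKAGES")"
echo "$ACTION_JSON"

# Nothing is sent to AWS unless RUN_AWS=1 is set in the environment.
if [ "${RUN_AWS:-0}" = "1" ]; then
  # Upload the downloaded gist, then create a cluster that runs it on every node.
  aws s3 cp install-cran-packages.sh "s3://${BUCKET}/${SCRIPT_KEY}"
  aws emr create-cluster \
    --name "sparkr-cluster" \
    --release-label emr-5.6.0 \
    --applications Name=Spark \
    --instance-type m4.large \
    --instance-count 3 \
    --use-default-roles \
    --bootstrap-actions "$ACTION_JSON"
fi
```

Quoting matters here: an unquoted semicolon on a shell command line ends the command, so the package list should always be wrapped in quotes or kept in a variable as above.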

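#!/bin/bash
# Installs the CRAN packages passed via --packages on the node running this script.
PACKAGES=""

# Each option takes a value, so keep parsing while at least two arguments remain.
while [[ $# -gt 1 ]]; do
  key="$1"
  case "$key" in
    # The packages to install, separated by semicolons
    # e.g. --packages "magrittr;dplyr"
    --packages)
      PACKAGES="$2"
      shift
      ;;
    *)
      echo "Unknown option: ${key}"
      exit 1
      ;;
  esac
  shift
done

# Split the semicolon-separated list into an array without glob expansion.
IFS=';' read -r -a PACKAGES_ARR <<< "$PACKAGES"

echo "*****************************************"
for i in "${PACKAGES_ARR[@]}"; do
  # Skip empty entries caused by stray semicolons.
  [[ -z "$i" ]] && continue
  echo " Installing ${i}"
  echo "*****************************************"
  # Send R's output to stderr.
  sudo R -e "install.packages('${i}', repos='https://cran.rstudio.com/')" 1>&2
done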
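
The handling of the semicolon-separated package list can be tried on its own, without R or a cluster. This is a minimal sketch of one way to do the split in bash, using `IFS=';' read -r -a`; `split_packages` is a hypothetical helper introduced for illustration and is not part of the bootstrap script.

```shell
#!/bin/bash
# split_packages is a hypothetical helper, not part of the bootstrap script.
# It prints one package name per line from a semicolon-separated list.
split_packages() {
  local list="$1"
  local -a names
  local name
  # read -a splits on IFS only, so the names are never glob-expanded.
  IFS=';' read -r -a names <<< "$list"
  for name in "${names[@]}"; do
    # Skip empty entries produced by stray semicolons such as "a;;b".
    if [[ -n "$name" ]]; then
      printf '%s\n' "$name"
    fi
  done
  return 0
}

split_packages "magrittr;dplyr;tidyr"
```

Running the list through a helper like this before creating a cluster is a cheap way to check that the argument is quoted correctly and that each name arrives as a separate entry.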