Skip to content

Instantly share code, notes, and snippets.

@ScotterC
Last active December 13, 2015 18:29
Show Gist options
  • Select an option

  • Save ScotterC/4956080 to your computer and use it in GitHub Desktop.

Select an option

Save ScotterC/4956080 to your computer and use it in GitHub Desktop.
Setting up Wukong-Hadoop on EMR, bootstrap script. (Wukong 3.0)
#!/bin/bash
# Update, upgrade and install development tools:
sudo apt-get update
sudo apt-get -y upgrade
sudo apt-get -y install build-essential git-core curl libssl-dev \
libreadline5 libreadline5-dev \
zlib1g zlib1g-dev \
libmysqlclient-dev \
libcurl4-openssl-dev \
libxslt-dev libxml2-dev
sudo mkdir /usr/local/rbenv
# Install rbenv
sudo git clone git://github.com/sstephenson/rbenv.git /usr/local/rbenv
# Add rbenv to the path:
sudo touch /etc/profile.d/rbenv.sh
sudo chmod 777 /etc/profile.d/rbenv.sh
echo '# rbenv setup' > /etc/profile.d/rbenv.sh
echo 'export RBENV_ROOT=/usr/local/rbenv' >> /etc/profile.d/rbenv.sh
echo 'export PATH="$RBENV_ROOT/bin:$PATH"' >> /etc/profile.d/rbenv.sh
echo 'eval "$(rbenv init -)"' >> /etc/profile.d/rbenv.sh
sudo mkdir /usr/local/rbenv/shims
sudo mkdir /usr/local/rbenv/versions
source /etc/profile.d/rbenv.sh
# Install ruby-build:
pushd /tmp
sudo git clone git://github.com/sstephenson/ruby-build.git
cd ruby-build
sudo ./install.sh
popd
# Install Ruby 1.9.3-p194:
sudo ruby-build 1.9.3-p194 /usr/local/
# Production installing gems skipping ri and rdoc
sudo apt-get update
sudo gem install bundler --no-rdoc --no-ri
sudo gem install gorillib --no-rdoc --no-ri
sudo gem install wukong-hadoop --no-rdoc --no-ri
# EMR INPUTS
# input: {s3_bucket}/input/data_input_folder
# output: {s3_bucket}/output/data_output_folder
# mapper: wu-local s3://{s3_bucket}/scripts/mapper.rb --run=mapper # wukong 3.0 script
# reducer: wu-local s3://{s3_bucket}/scripts/reducer.rb --run=reducer # wukong 3.0 script
# bootstrap: s3://{s3_bucket}/scripts/wukong_setup.sh # This gist
@ScotterC
Copy link
Copy Markdown
Author

Some of this is not great. modding rbenv.sh 777 is definitely unnecessary and installing ruby 1.9.3 on every emr job is really slow. More of a code spike then a finished script - but it does work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment