@adragomir
Created March 24, 2012 11:50

Prerequisites

IntelliJ: You can get a free edition of IntelliJ IDEA here

Eclipse: You need Eclipse and m2eclipse

Mac OS X

For Mac OS X, make sure that you have the Java package that is available on the Apple developer connection.

  1. Go to [https://developer.apple.com/]
  2. Create an account / login
  3. Go to Member Center
  4. Sign in if necessary.
  5. Go to the Mac Dev Center
  6. Go to View all downloads
  7. Look in the page for something like "Java for Mac OS X Developer Preview ..."
  8. Click it; more than one download may be offered. Select the one for your operating system (10.6, 10.7)
  9. Open the DMG, and install the package.
  10. You should now have a new Java installation inside /Library/Java/JavaVirtualMachines/VERSION (the version changes frequently, so substitute the one you installed)

This package installs the Java VM in a consistent location. Make sure that the JAVA_HOME variable is exported:

$ export JAVA_HOME=/Library/Java/JavaVirtualMachines/VERSION/Contents/Home/
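As a quick sanity check on the export above, the following sketch verifies that a candidate JAVA_HOME actually contains an executable bin/java. The function name check_java_home is hypothetical and not part of any tool; VERSION stays a placeholder for whatever is installed on your machine.

```shell
# check_java_home DIR: succeed only if DIR looks like a usable Java home.
# (Hypothetical helper; not part of the JDK or of Mac OS X.)
check_java_home() {
    dir="${1:-}"
    dir="${dir%/}"                    # drop one trailing slash, if present
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is empty" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable found at $dir/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME looks usable: $dir"
}

# Usage: check_java_home "$JAVA_HOME"
```

On Mac OS X, /usr/libexec/java_home prints the home directory of the default JDK, which may save typing the path by hand.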

Pick a folder where you will install all the packages. We will call this folder ROOT from now on.

Prerequisites for building or running the entire stack

If you want to build or run the entire stack, not only develop jobs, you also need:

Passwordless SSH connection to localhost

  • On Mac OS X, the SSH daemon can be started in System Preferences > Sharing > Remote Login
  • Generate an SSH key
$ ssh-keygen -t rsa -P ""
  • Authorize the key so that you can connect without a password
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  • You can test the connection by running ssh localhost. It should connect without a password.
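If ssh localhost still asks for a password, one common cause is file permissions: with its default StrictModes setting, sshd ignores authorized_keys when ~/.ssh or the file itself is writable by other users. The sketch below tightens those permissions; tighten_ssh_perms is a hypothetical helper name, and the directory defaults to ~/.ssh.

```shell
# tighten_ssh_perms [DIR]: restrict DIR and DIR/authorized_keys to the owner.
# (Hypothetical helper; DIR defaults to ~/.ssh.)
tighten_ssh_perms() {
    ssh_dir="${1:-$HOME/.ssh}"
    chmod 700 "$ssh_dir" || return 1
    if [ -f "$ssh_dir/authorized_keys" ]; then
        chmod 600 "$ssh_dir/authorized_keys" || return 1
    fi
}

# Usage:
#   tighten_ssh_perms ~/.ssh
#   # BatchMode=yes makes ssh fail instead of prompting for a password.
#   ssh -o BatchMode=yes localhost true && echo "passwordless SSH works"
```

Running the test with BatchMode=yes is useful in scripts, because a missing key then shows up as a non-zero exit status rather than an interactive prompt.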

  • Create the data folders, and make sure that you can write to them

sudo mkdir -p /var/{,log/hbase,log/hadoop,log/zookeeper,hadoop_datastore,zookeeper_datastore}
sudo chown `id -un`:`id -gn` /var/{log/hbase,log/hadoop,log/zookeeper,hadoop_datastore,zookeeper_datastore}
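To confirm that the folders created above exist and are writable by your user, a small check along these lines may help. The helper name first_unwritable is an assumption made for illustration.

```shell
# first_unwritable DIR...: print the first directory that is missing or not
# writable by the current user and return 1; return 0 if all are fine.
# (Hypothetical helper name.)
first_unwritable() {
    for d in "$@"; do
        if [ ! -d "$d" ] || [ ! -w "$d" ]; then
            echo "$d"
            return 1
        fi
    done
    return 0
}

# Usage, with the directories created above:
#   first_unwritable /var/log/hbase /var/log/hadoop /var/log/zookeeper \
#       /var/hadoop_datastore /var/zookeeper_datastore \
#       && echo "all data folders are writable"
```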

Hadoop (branch: cloudera-cdh3u3)

git clone https://github.com/apache/hadoop-common.git
cd hadoop-common
ant 
  • Format the namenode (one time only)

DO NOT EXECUTE THIS STEP MORE THAN ONCE. If you do, you will delete all the data in your Hadoop installation!

bin/hadoop namenode -format
  • Now, you can start the service:
ROOT/hadoop/bin/start-all.sh
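Because the one-time format step above is destructive, it can be wrapped in a guard. A formatted NameNode storage directory contains a current/VERSION file, so the sketch below refuses to format when that file is present. The function names are hypothetical, and NAME_DIR must be whatever dfs.name.dir resolves to in your configuration, which this guide does not state.

```shell
# namenode_is_formatted NAME_DIR: succeed if NAME_DIR already holds a
# formatted NameNode image, detected by the presence of current/VERSION.
# NAME_DIR is an assumption: use the value of dfs.name.dir from your config.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# format_once NAME_DIR: run the format step only when NAME_DIR is unformatted.
# Must be called from the Hadoop directory, like the command in this guide.
format_once() {
    if namenode_is_formatted "$1"; then
        echo "refusing to format: $1 already contains data" >&2
        return 1
    fi
    bin/hadoop namenode -format
}
```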

You can check whether it works by opening the Hadoop Map/Reduce and Hadoop DFS status pages in the browser.

Troubleshooting and more details

If it does not work, the first place to look is the log files in ROOT/hadoop/logs. You can find more details about Hadoop installation at Apache here.
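When scanning the log directory, it can help to pull out only the error lines first. The following is a minimal sketch, assuming the services keep the default log4j level names (ERROR, FATAL) and write files ending in .log; recent_problems is a hypothetical helper name.

```shell
# recent_problems LOG_DIR [COUNT]: print the last COUNT (default 20) lines
# containing ERROR or FATAL from the *.log files in LOG_DIR.
# (Hypothetical helper; assumes default log level names and *.log files.)
recent_problems() {
    log_dir="$1"
    count="${2:-20}"
    # -h drops the file name prefix; errors about missing files are hidden.
    grep -h -E 'ERROR|FATAL' "$log_dir"/*.log 2>/dev/null | tail -n "$count"
}

# Usage: recent_problems ROOT/hadoop/logs 50
```

The same function can be pointed at any other log directory that follows the same conventions.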

Zookeeper (branch: 3.4.3)

  • Get the source, change the branch and build
git clone https://github.com/apache/zookeeper.git
cd zookeeper
git checkout -b 3.4.3 remotes/origin/3.4.3
ant
  • Start the service:
ROOT/zookeeper/bin/zkServer.sh start

HBase (branch: 0.92)

  • Get the source, change the branch and build
git clone https://github.com/apache/hbase.git
cd hbase
git checkout -b 0.92 remotes/origin/0.92
./build.sh
  • Start the service (this step needs to run after installing hadoop-lzo-compression)
bin/start-hbase.sh
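The start commands from the three sections can be collected into one script. This is only a sketch: the order (Hadoop, then ZooKeeper, then HBase) simply follows this guide, the ROOT/hadoop, ROOT/zookeeper and ROOT/hbase layout is an assumption, and start_stack is a hypothetical name. It does not cover the hadoop-lzo-compression installation mentioned above.

```shell
# start_stack ROOT: start Hadoop, ZooKeeper, then HBase, stopping at the
# first start script that fails.
# (Hypothetical helper; the directory layout under ROOT is an assumption.)
start_stack() {
    root="$1"
    "$root/hadoop/bin/start-all.sh" &&
    "$root/zookeeper/bin/zkServer.sh" start &&
    "$root/hbase/bin/start-hbase.sh"
}

# Usage: start_stack /path/to/ROOT || echo "a service failed to start" >&2
```

Chaining with && means a failure in an earlier service prevents the later ones from starting on top of it.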

Troubleshooting and more details

If it does not work, the first place to look is the log files in ROOT/hbase/logs. More details about HBase can be found here, at the Apache HBase Book.
