brew install hadoop
- Check if Hadoop has been successfully installed:
hadoop version
- Add the following three lines to your `~/.bash_profile`:
export HADOOP_PREFIX=/usr/local/Cellar/hadoop/1.1.2/
export JAVA_HOME=$(/usr/libexec/java_home)
export PATH=$PATH:$HADOOP_PREFIX/bin
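- Add the following to `$HADOOP_PREFIX/libexec/conf/hadoop-env.sh` (it works around the "Unable to load realm info from SCDynamicStore" error that Java can raise on OS X):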
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
- Make sure that you have SSH private (`~/.ssh/id_rsa`) and public (`~/.ssh/id_rsa.pub`) keys already set up. If not:
`ssh-keygen -t rsa`
- Make sure that "Remote login" is enabled in your system preferences. For this, go to "System Preferences" -> "Sharing". "Remote login" should be checked.
- Add your public key to the authorised keys list:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
- Check if SSH setup was successful by logging in to localhost:
$ ssh localhost
Last login: Thu May 23 21:36:20 2013
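The key-authorisation step above appends with `cat >>`, so running it twice leaves a duplicate entry. A minimal sketch of a repeatable version follows; the function name `ensure_authorized_key` and its two optional arguments are illustrative assumptions, not part of OpenSSH.

```shell
#!/bin/sh
# Sketch: add a public key to authorized_keys only if it is not already listed.
# ensure_authorized_key is a hypothetical helper name, not an OpenSSH command.
ensure_authorized_key() {
    pub_key=${1:-"$HOME/.ssh/id_rsa.pub"}
    auth_file=${2:-"$HOME/.ssh/authorized_keys"}

    if [ ! -f "$pub_key" ]; then
        echo "missing public key: $pub_key (run ssh-keygen -t rsa first)" >&2
        return 1
    fi

    ssh_dir=$(dirname "$auth_file")
    mkdir -p "$ssh_dir"
    touch "$auth_file"

    # Append only when the exact key line is absent, so repeated runs
    # do not pile up duplicates the way a bare `cat >>` would.
    key_line=$(cat "$pub_key")
    if ! grep -qxF -- "$key_line" "$auth_file"; then
        printf '%s\n' "$key_line" >> "$auth_file"
    fi

    # sshd's default StrictModes setting rejects key files that other
    # users can write to, so keep the permissions tight.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}
```

Called without arguments, `ensure_authorized_key` works on `~/.ssh/id_rsa.pub` and `~/.ssh/authorized_keys`; `ssh localhost` should then log in without asking for a password.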
- Make sure you have JDK installed:
$ java -version
java version "1.6.0_27"
OpenJDK Runtime Environment (IcedTea6 1.12.5) (6b27-1.12.5-0ubuntu0.12.04.1)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
Otherwise:
sudo apt-get install openjdk-6-jdk
- Alternatively, if you only need the Java runtime (not the compiler), install the headless JRE:
sudo apt-get install openjdk-6-jre-headless
- Make sure that you have SSH private (`~/.ssh/id_rsa`) and public (`~/.ssh/id_rsa.pub`) keys already set up. If not:
`ssh-keygen -t rsa`
- Add your public key to the authorised keys list:
`cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys`
- Install SSH server:
sudo apt-get install openssh-server
- Check if you can SSH to `localhost`:
$ ssh localhost
You may see this warning:
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is 99:3f:f0:6a:8f:3d:7f:a7:2f:c0:75:07:47:98:3c:bd.
Are you sure you want to continue connecting (yes/no)?
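This is expected the first time you connect to a host. Because the host is your own machine, you can check the fingerprint against the output of `ssh-keygen -lf /etc/ssh/ssh_host_ecdsa_key.pub`, then type "yes".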
If you try to connect again, you should be greeted with the following message:
$ ssh localhost
Welcome to Ubuntu 12.04.2 LTS (GNU/Linux 3.5.0-23-generic x86_64)
* Documentation: https://help.ubuntu.com/
Last login: Thu May 23 21:37:46 2013 from localhost
- Download the Hadoop 1.1.2 archive and extract its contents:
wget http://mirrors.enquira.co.uk/apache/hadoop/core/hadoop-1.1.2/hadoop-1.1.2.tar.gz
tar xzf hadoop-1.1.2.tar.gz
sudo mv hadoop-1.1.2 /usr/local/hadoop
- Add the following three lines to the end of `~/.bashrc`:
export HADOOP_PREFIX=/usr/local/hadoop
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-amd64
export PATH=$PATH:$HADOOP_PREFIX/bin
- Restart Bash or run `source ~/.bashrc`.
- Check if `hadoop` responds:
$ hadoop version
Hadoop 1.1.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782
Compiled by hortonfo on Thu Jan 31 02:03:24 UTC 2013
From source with checksum c720ddcf4b926991de7467d253a79b8b
- Point Hadoop at your JDK by editing `hadoop-env.sh`:
vim /usr/local/hadoop/conf/hadoop-env.sh
Change:
# The java implementation to use. Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
To:
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-amd64
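Most problems at this stage come from a variable that was exported with the wrong path or not picked up by the current shell. The following is a rough sketch of a sanity check for the three settings made above; `check_hadoop_env` is a hypothetical helper name, and the check only assumes that `bin/java` and `bin/hadoop` sit directly under `JAVA_HOME` and `HADOOP_PREFIX`.

```shell
#!/bin/sh
# Sketch: sanity-check the variables set in the steps above.
# check_hadoop_env is a hypothetical helper name, not a Hadoop command.
check_hadoop_env() {
    problems=0

    if [ -z "${JAVA_HOME:-}" ] || [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "JAVA_HOME does not point to a Java installation: ${JAVA_HOME:-<unset>}" >&2
        problems=$((problems + 1))
    fi

    if [ -z "${HADOOP_PREFIX:-}" ] || [ ! -x "$HADOOP_PREFIX/bin/hadoop" ]; then
        echo "HADOOP_PREFIX does not contain bin/hadoop: ${HADOOP_PREFIX:-<unset>}" >&2
        problems=$((problems + 1))
    fi

    if ! command -v hadoop >/dev/null 2>&1; then
        echo "hadoop is not on PATH" >&2
        problems=$((problems + 1))
    fi

    # Succeed only when every check passed.
    [ "$problems" -eq 0 ]
}
```

Running `check_hadoop_env && hadoop version` in a fresh terminal confirms that the settings survive a new shell session.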
So far the most comprehensive guide on setting up Hadoop on Windows can be found here: http://ebiquity.umbc.edu/Tutorials/Hadoop/00%20-%20Intro.html
Linux and Mac OSX:
wget http://www.gutenberg.org/files/42778/42778-0.txt
hadoop jar $HADOOP_PREFIX/libexec/hadoop-examples-*.jar wordcount 42778-0.txt output
Check the output:
less output/part-r-00000
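Two small helpers can make the example above easier to repeat. This is only a sketch: both function names are made up for illustration, and `find_examples_jar` rests on the assumption that the examples jar sits either at the top level of the Hadoop directory (tarball install) or one level down, for instance in `libexec/` (Homebrew install). `top_words` relies on the wordcount result format of one `word<TAB>count` pair per line.

```shell
#!/bin/sh
# Sketch: helpers around the word-count example above.
# Both function names are illustrative and not part of Hadoop.

# Print the first hadoop-examples jar found under the Hadoop install.
# Assumption: the jar is either at the top level or one directory down
# (for example in libexec/), depending on how Hadoop was installed.
find_examples_jar() {
    find "${1:-$HADOOP_PREFIX}" -maxdepth 2 -name 'hadoop-examples-*.jar' | head -n 1
}

# Print the N most frequent words from a wordcount result file, whose
# lines have the form "word<TAB>count".
top_words() {
    result_file=${1:-output/part-r-00000}
    limit=${2:-10}
    tab=$(printf '\t')
    # Sort numerically on the count column, largest first.
    LC_ALL=C sort -t "$tab" -k2,2nr "$result_file" | head -n "$limit"
}
```

For example, `hadoop jar "$(find_examples_jar)" wordcount 42778-0.txt output` followed by `top_words output/part-r-00000 20` lists the twenty most common words. Hadoop refuses to start a job whose output directory already exists, so remove `output` before running the example again.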