@mmiliaus
Last active December 17, 2015 00:28
Hadoop Setup
  1. Install SSH client and server:

    sudo apt-get install openssh-client
    sudo apt-get install openssh-server

  2. Add a dedicated Hadoop system user:

    sudo addgroup hadoop
    sudo adduser --ingroup hadoop hduser

  3. Configure SSH:

    su - hduser
    ssh-keygen -t rsa -P ""
    cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
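
Re-running the `cat ... >>` line appends the same key again, and with its default `StrictModes` setting sshd may reject keys when `~/.ssh` or `authorized_keys` is writable by other users. The following is a minimal sketch of an idempotent variant; `authorize_key` is a hypothetical helper name, not part of OpenSSH.

```shell
#!/bin/sh
# authorize_key PUBKEY_FILE AUTH_FILE
# Hypothetical helper (not part of OpenSSH): appends the public key to the
# authorized_keys file only if that exact line is not already present,
# then restricts permissions on the file and its directory.
authorize_key() {
    pubkey_file=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")

    mkdir -p "$auth_dir"
    touch "$auth_file"

    # -F fixed strings, -x whole-line match, -q quiet,
    # -f read the pattern(s) from the public key file.
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi

    chmod 700 "$auth_dir"
    chmod 600 "$auth_file"
}
```

As hduser, it would be called as `authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"`.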

  4. SSH to localhost:

    ssh localhost

    If the SSH connection fails, these general tips might help:

    • Make sure you have SSH Server installed;
    • Enable debugging with ssh -vvv localhost and investigate the error in detail;
    • Check the SSH server configuration in /etc/ssh/sshd_config, in particular the options PubkeyAuthentication (which should be set to yes) and AllowUsers (if this option is active, add the hduser user to it). If you made any changes to the SSH server configuration file, you can force a configuration reload with sudo /etc/init.d/ssh reload.
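
The configuration check in the last tip can be sketched as a small script. This is only an illustration: `sshd_option` and `check_ssh_config` are hypothetical helper names, the parsing assumes whitespace-separated `Keyword value` lines, and it ignores `Match` blocks and user patterns such as wildcards or `user@host`. Where root access is available, `sudo sshd -T` prints the effective configuration and is the more reliable source.

```shell
#!/bin/sh
# sshd_option CONFIG_FILE OPTION
# Hypothetical helper: prints the value of every uncommented line that sets
# OPTION. Keyword matching is case-insensitive, as in sshd_config.
sshd_option() {
    awk -v opt="$2" '
        tolower($1) == tolower(opt) { $1 = ""; sub(/^ +/, ""); print }
    ' "$1"
}

# check_ssh_config [CONFIG_FILE] [USER]
# Reports the two problems named in the tips above; returns 1 if any is found.
check_ssh_config() {
    config=${1:-/etc/ssh/sshd_config}
    user=${2:-hduser}
    rc=0

    # If the option appears more than once, look at the first value only.
    pubkey=$(sshd_option "$config" PubkeyAuthentication | head -n 1)
    if [ "$pubkey" = "no" ]; then
        echo "PubkeyAuthentication is set to no"
        rc=1
    fi

    # Join all AllowUsers lines into one space-separated list.
    allowed=$(sshd_option "$config" AllowUsers | tr '\n' ' ')
    if [ -n "$allowed" ]; then
        case " $allowed " in
            *" $user "*) ;;
            *) echo "AllowUsers is active but does not list $user"; rc=1 ;;
        esac
    fi
    return $rc
}
```

Running `check_ssh_config` with no arguments checks `/etc/ssh/sshd_config` for the hduser account; no output means neither problem was detected by this simple check.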
  5. Download Hadoop: http://mirrors.enquira.co.uk/apache/hadoop/core/hadoop-1.1.2/

    tar xzf hadoop-1.1.2.tar.gz
    sudo mv hadoop-1.1.2 hadoop

  6. Run wordcount example:

    hadoop jar ~/hadoop/hadoop*examples*.jar wordcount <input-dir> <output-dir>
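
The wordcount job takes an input path and an output path, and the examples jar in the 1.1.2 tarball should carry the version in its file name, so a glob is a convenient way to refer to it. Below is a hedged sketch of a wrapper, assuming Hadoop was unpacked to `~/hadoop` as in step 5. `find_examples_jar`, `run_wordcount`, and the `HADOOP_DIR` variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# find_examples_jar HADOOP_DIR
# Hypothetical helper: prints the path of the examples jar under HADOOP_DIR,
# and fails unless the glob matches exactly one existing file.
find_examples_jar() {
    # Replace the function's positional parameters with the glob matches.
    set -- "$1"/hadoop*examples*.jar
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one examples jar" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# run_wordcount INPUT_PATH OUTPUT_PATH
# Runs the wordcount example. The output path is assumed not to exist yet,
# since the job refuses to overwrite an existing output directory.
run_wordcount() {
    hadoop_dir=${HADOOP_DIR:-$HOME/hadoop}
    jar=$(find_examples_jar "$hadoop_dir") || return 1
    "$hadoop_dir/bin/hadoop" jar "$jar" wordcount "$1" "$2"
}
```

Calling `bin/hadoop` by its full path avoids relying on `hadoop` being on the `PATH`, which the steps above do not set up.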
