linux hadoop installation commands
$ sudo -s
$ dpkg -l (lists all installed packages)
$ /etc/init.d/networking restart
$ service ssh restart
$ vi /etc/apt/sources.list.d/cdh3.list
(make sure you have "deb http://archive.cloudera.com/debian lucid-cdh3u5 contrib")
$pwd (print working directory)
$clear (clears the CLI - command line interface)
$ls -a (lists all files, including hidden ones)
$ls -al (combines the a and l flags after the - symbol; l prints the detailed long listing)
$cd (with no arguments takes you to the home directory)
$touch abc.txt (creates an empty file)
$cp (copies files and/or directories from source to destination; use the -r flag to copy directories)
$cp -r Documents More_docs (copies entire directory)
$cp Docs/abc.txt MyDocs (copies abc.txt to MyDocs folder)
$rm (remove)
$rm -r MyDocs (removes an entire directory; be very careful with this, there is no way to undo it)
$mv (moves from source to destination; this command can also be used to rename files or directories)
$mv abc.txt def.txt
$mv current_folder abc_folder
$echo hello friend (will print whatever arguments you provide)
==== Directory structures in Unix ====
/ - is the root
/users
/applications
/users/user01
/users/~ (this represents your home directory)
/users/~/photos
/users/~/music
$chmod (change mode)
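For example (the file and directory names below are only hypothetical placeholders):
$chmod 755 run.sh (owner can read/write/execute; group and others can read/execute)
$chmod -R 700 MyDocs (recursively restricts a directory to its owner)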
===============
(NOTE: if you are running as root you do not need sudo at the beginning)
$ sudo apt-get update
$ sudo apt-get install hadoop-0.20-namenode hadoop-0.20-jobtracker
$ sudo apt-get install hadoop-0.20-datanode hadoop-0.20-tasktracker
$ sudo apt-get install openjdk-6-jdk ("jps" lists running Java processes)
$ sudo apt-get purge hadoop-0.20-datanode
$ dpkg -l | grep -i hadoop
$ dpkg -l | grep -i jdk
----------
Formatting the NameNode
----------
Set the JAVA_HOME environment variable
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk/jre/
Add an entry in "hadoop-env.sh"
(Note: all Hadoop configuration files are in /usr/lib/hadoop/conf/)
$ vi /usr/lib/hadoop/conf/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk/jre/
Configure "dfs.name.dir" in "/usr/lib/hadoop/conf/hdfs-site.xml"
(this location stores the metadata of the NameNode)
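For example, a minimal sketch of the property (the value paths are only an assumption based on the /data/d1 and /data/d2 directories created below; dfs.name.dir is the standard property name in Hadoop 0.20):
<property>
  <name>dfs.name.dir</name>
  <value>/data/d1/hadoop-data/hadoop-hdfs,/data/d2/hadoop-data/hadoop-hdfs</value>
</property>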
Create disk paths
sh /home/ultim/create_disk.sh
sh /home/user1/scripts/make_dir.sh
(these scripts create the following directories and set their ownership; see the sketch after the chown commands below)
/data/d1/hadoop-data/hadoop-hdfs (Note: owner set to the "hdfs" user using chown)
/data/d1/mapred/local (Note: owner set to the "mapred" user using chown)
/data/d2/hadoop-data/hadoop-hdfs (Note: owner set to the "hdfs" user using chown)
/data/d2/mapred/local (Note: owner set to the "mapred" user using chown)
sudo chown -R mapred:mapred /data/d1/mapred/
sudo chown -R mapred:mapred /data/d2/mapred/
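A minimal sketch of what such a script could look like, assuming only the directories and owners listed above (the actual contents of create_disk.sh/make_dir.sh are not shown in these notes):
sudo mkdir -p /data/d1/hadoop-data/hadoop-hdfs /data/d2/hadoop-data/hadoop-hdfs
sudo mkdir -p /data/d1/mapred/local /data/d2/mapred/local
sudo chown -R hdfs:hdfs /data/d1/hadoop-data/hadoop-hdfs /data/d2/hadoop-data/hadoop-hdfs
sudo chown -R mapred:mapred /data/d1/mapred/local /data/d2/mapred/local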
Format the NameNode (Note: "hdfs" is the superuser for the HDFS filesystem)
sudo -u hdfs hadoop namenode -format
Start the daemons
sudo /etc/init.d/hadoop-0.20-namenode start (this runs on the MasterNode)
sudo /etc/init.d/hadoop-0.20-jobtracker start (this also runs on the MasterNode)
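To verify the daemons started, you can for example list the running Java processes with jps (mentioned above) or grep the process list; the exact output depends on your setup:
$ sudo jps
$ ps -ef | grep -i hadoop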
========================