@edwardw
Created December 30, 2011 17:49
Hush, not Harsh

HBase: The Definitive Guide has a sample application called hush, but it can't run as is. As of this writing, the latest release of HBase is 0.90.5, which lacks certain features the book needs, such as coprocessors, so hush cannot even compile against the 0.90.5 HBase jar. It requires HBase 0.91.0-SNAPSHOT; revision 1130916, to be precise, according to the book's website. Lars George, the author, kindly provides such a distribution on his Apache site. Even so, hush still needs some minor tweaks to function as advertised.

That's a lot of rough edges. The book should have done a better job of explaining how to run its own sample application. Or it might not be the book's fault. I read somewhere that a team picked another NoSQL database over HBase because they felt HBase had 'too many moving parts'. This is probably still true. Now that Hadoop has reached 1.0, as of several days ago, HBase should be able to improve its packaging/dependency story.

In the meantime, I am trying to install the latest HBase in a truly distributed fashion. The goal is to run hush successfully against that HBase cluster. I hope this post saves time for anyone attempting the same.

$ git clone git://git.apache.org/hbase.git hbase
$ cd hbase
$ git checkout 0.92.0rc2
$ mvn package -DskipTests=true

On the master node (see http://hadoop.apache.org/common/docs/r1.0.0/):

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ wget -c http://mirror.atlanticmetro.net/apache//hadoop/core/hadoop-1.0.0/hadoop-1.0.0-bin.tar.gz
$ tar xzf hadoop-1.0.0-bin.tar.gz
$ cd hadoop-1.0.0
$ mkdir conf
$ cp etc/hadoop/hadoop-env.sh conf
$ vim conf/hadoop-env.sh
export JAVA_HOME=/usr/java
$ vim conf/core-site.xml
<configuration>
     <property>
         <name>fs.default.name</name>
         <value>hdfs://192.168.1.6:9000</value>
     </property>
</configuration>
$ vim conf/hdfs-site.xml
<configuration>
     <property>
         <name>dfs.replication</name>
         <value>1</value>
     </property>
     <property>
        <name>dfs.name.dir</name>
        <value>/home/edward/ws/sandbox/hadoop-1.0.0/name_dir</value>
     </property>
     <property>
        <name>dfs.data.dir</name>
        <value>/home/edward/ws/sandbox/hadoop-1.0.0/data_dir</value>
     </property>
</configuration>
$ vim conf/masters
192.168.1.6
$ vim conf/slaves
192.168.1.6
$ bin/hadoop namenode -format
$ chmod +x sbin/*.sh
$ sbin/start-dfs.sh
$ bin/hadoop dfsadmin -safemode leave   # name node starts in safe mode, because dfs.replication => 1?
$ bin/hadoop fs -put ...
$ bin/hadoop fs -get ...
$ sbin/stop-dfs.sh
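
The address 192.168.1.6 is typed by hand into core-site.xml, masters, and slaves in the steps above, so a typo in any one of them is easy to miss. A minimal sketch of generating those three files from a single value follows. The function name write_hadoop_conf and its two arguments are made up for illustration and are not part of Hadoop; hdfs-site.xml is left out because it does not contain the address.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): write the address-dependent
# config files from one value, mirroring the manual steps above.
#   $1 = master address, $2 = target conf directory
write_hadoop_conf() {
    master_ip=$1
    conf_dir=$2
    mkdir -p "$conf_dir" || return 1

    # core-site.xml: same property and port as in the manual steps.
    cat > "$conf_dir/core-site.xml" <<EOF
<configuration>
     <property>
         <name>fs.default.name</name>
         <value>hdfs://${master_ip}:9000</value>
     </property>
</configuration>
EOF

    # The same address goes into both masters and slaves, as above.
    echo "$master_ip" > "$conf_dir/masters"
    echo "$master_ip" > "$conf_dir/slaves"
}
```

Run from the hadoop-1.0.0 directory, `write_hadoop_conf 192.168.1.6 conf` would reproduce the three files as entered by hand above.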

The following doesn't work! The APIs changed quite a lot between HBase 0.91.0 and 0.92.0.

$ tar xvf hbase-0.92.0.tar.gz
$ cd hbase-0.92.0
$ vim conf/hbase-env.sh
export JAVA_HOME=/usr/java
export HBASE_MANAGES_ZK=false
$ vim conf/hbase-site.xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://192.168.1.6:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.168.1.6</value>
  </property>
</configuration>
$ vim conf/regionservers
192.168.1.6
$ bin/start-hbase.sh
$ vim hush/src/main/resources/hbase-site.xml
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.168.1.6</value>
  </property>
</configuration>
$ mvn package
$ hush/bin/start-hush.sh
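
Both hbase-site.xml files above set hbase.zookeeper.quorum to the same address, one for the HBase cluster and one for hush. A quick way to check that the two really agree can be sketched as below. The helpers get_hbase_prop and same_quorum are hypothetical names, and the parsing assumes the one-tag-per-line layout shown above rather than doing real XML parsing.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that follows a given <name> in an
# hbase-site.xml laid out one tag per line, as in the files above.
# This is a line-oriented sketch, not a real XML parser.
get_hbase_prop() {
    prop=$1
    file=$2
    awk -v prop="$prop" '
        # Note when the wanted <name> line has been seen.
        index($0, "<name>" prop "</name>") { found = 1; next }
        # The first <value> line after it holds the setting.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$file"
}

# Hypothetical helper: succeed only if both files set the same,
# non-empty hbase.zookeeper.quorum.
same_quorum() {
    a=$(get_hbase_prop hbase.zookeeper.quorum "$1")
    b=$(get_hbase_prop hbase.zookeeper.quorum "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, `same_quorum conf/hbase-site.xml hush/src/main/resources/hbase-site.xml && echo "quorum matches"`, with the two paths adjusted to wherever the HBase and hush trees actually live.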