Apache Hive - Installation and Configuration
UBUNTU 14.04 LTS
JAVA - Oracle JDK 8
HADOOP 2.7.3
HIVE 2.1.1
MySQL 5.5 server
1. Download Apache Hive:
https://hive.apache.org/downloads.html
http://www.apache.org/dyn/closer.cgi/hive/
Download and extract using tar : tar -xzvf hive-x.y.z.tar.gz
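A minimal sketch of step 1 (the archive.apache.org URL is one possible mirror; any mirror from the links above works, and ~/bigdata matches the paths used below):
mkdir -p ~/bigdata
wget https://archive.apache.org/dist/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz
tar -xzvf apache-hive-2.1.1-bin.tar.gz -C ~/bigdata/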
2. Update ~/.bashrc with the following exports:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_HOME=/home/thanooj/bigdata/hadoop-2.7.3
export HADOOP_MAPRED_HOME=/home/thanooj/bigdata/hadoop-2.7.3
export HADOOP_COMMON_HOME=/home/thanooj/bigdata/hadoop-2.7.3
export HADOOP_HDFS_HOME=/home/thanooj/bigdata/hadoop-2.7.3
export YARN_HOME=/home/thanooj/bigdata/hadoop-2.7.3
export HADOOP_CONF_DIR=/home/thanooj/bigdata/hadoop-2.7.3/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=/home/thanooj/bigdata/hadoop-2.7.3/lib/native
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Djava.library.path=$HADOOP_HOME/lib"
export HIVE_HOME=/home/thanooj/bigdata/apache-hive-2.1.1-bin
export HIVE_CONF_DIR=$HIVE_HOME/conf
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin
source ~/.bashrc
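A quick sanity check that the new variables are visible in the shell (not part of the original steps):
echo $HIVE_HOME
hive --version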
3. Set HADOOP_HOME in $HIVE_CONF_DIR/hive-env.sh:
export HADOOP_HOME=/home/thanooj/bigdata/hadoop-2.7.3
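hive-env.sh is not present by default; it can be created from the template shipped in the conf directory, then the export above appended:
cd $HIVE_CONF_DIR
cp hive-env.sh.template hive-env.sh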
4. Create the HDFS directories Hive needs for scratch space and the warehouse:
$ $HADOOP_HOME/bin/hadoop fs -mkdir /tmp
$ $HADOOP_HOME/bin/hadoop fs -mkdir /warehouse
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /warehouse
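To verify that both directories exist and are group-writable:
$ $HADOOP_HOME/bin/hadoop fs -ls /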
5. Copy $HIVE_CONF_DIR/hive-default.xml.template to hive-site.xml and set the following properties:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/metastore_db?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
NOTE: replace every occurrence of '${system:java.io.tmpdir}' in hive-site.xml with '/tmp'
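One way to apply that note in bulk (a sketch; keep a backup of hive-site.xml first):
cd $HIVE_CONF_DIR
cp hive-site.xml hive-site.xml.bak
sed -i 's|\${system:java.io.tmpdir}|/tmp|g' hive-site.xml
Also note that com.mysql.jdbc.Driver can only be loaded if the MySQL JDBC connector JAR (mysql-connector-java) is placed in $HIVE_HOME/lib.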
6. Initialize the metastore schema in MySQL:
cd $HIVE_HOME/scripts/metastore/upgrade/mysql/
Log in to MySQL and run:
mysql> drop database IF EXISTS metastore_db;
mysql> create database metastore_db;
mysql> use metastore_db;
mysql> source hive-schema-2.1.0.mysql.sql;
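Alternatively, Hive 2.x ships schematool, which runs the same schema scripts against the database configured in hive-site.xml:
$HIVE_HOME/bin/schematool -dbType mysql -initSchema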
7. Make sure Hadoop is running, then start the Hive CLI:
thanooj@thanooj-Inspiron-3521:~$ jps
3105 SecondaryNameNode
3413 NodeManager
3733 Jps
2871 DataNode
3274 ResourceManager
2733 NameNode
thanooj@thanooj-Inspiron-3521:~$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/thanooj/bigdata/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/thanooj/bigdata/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/home/thanooj/bigdata/apache-hive-2.1.1-bin/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show databases;
OK
default
mydb
Time taken: 1.254 seconds, Fetched: 2 row(s)
hive> use mydb;
OK
Time taken: 0.024 seconds
hive> show tables;
OK
Time taken: 0.039 seconds
hive>
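A quick smoke test against the freshly initialized metastore (the table name here is illustrative, not from the original session):
hive> create table test_tbl (id int, name string);
hive> show tables;
hive> drop table test_tbl;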