Hive Basics
% tar xzf apache-hive-x.y.z-bin.tar.gz
% export HIVE_HOME=/home/ubuntu/softwares/hive-1.2.1
% export PATH=$PATH:$HIVE_HOME/bin
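Note: to make these exports persist across sessions, they can be appended to ~/.bashrc (a minimal sketch, assuming the install path /home/ubuntu/softwares/hive-1.2.1 that appears later in this gist):
ubuntu@ubuntu:~$ echo 'export HIVE_HOME=/home/ubuntu/softwares/hive-1.2.1' >> ~/.bashrc
ubuntu@ubuntu:~$ echo 'export PATH=$PATH:$HIVE_HOME/bin' >> ~/.bashrc
ubuntu@ubuntu:~$ source ~/.bashrc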
ubuntu@ubuntu:~$ whereis hadoop
hadoop: /home/ubuntu/softwares/hadoop-2.7.2/bin/hadoop /home/ubuntu/softwares/hadoop-2.7.2/bin/hadoop.cmd
ubuntu@ubuntu:~$ whereis hive
hive: /home/ubuntu/softwares/apache-hive-2.1.0-SNAPSHOT-bin/bin/hive /home/ubuntu/softwares/apache-hive-2.1.0-SNAPSHOT-bin/bin/hive.cmd
Note: Ctrl+L clears the Hive CLI screen.
ubuntu@ubuntu:~$ start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/ubuntu/softwares/hadoop-2.7.2/logs/hadoop-ubuntu-namenode-ubuntu.out
localhost: starting datanode, logging to /home/ubuntu/softwares/hadoop-2.7.2/logs/hadoop-ubuntu-datanode-ubuntu.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/ubuntu/softwares/hadoop-2.7.2/logs/hadoop-ubuntu-secondarynamenode-ubuntu.out
ubuntu@ubuntu:~$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/ubuntu/softwares/hadoop-2.7.2/logs/yarn-ubuntu-resourcemanager-ubuntu.out
localhost: starting nodemanager, logging to /home/ubuntu/softwares/hadoop-2.7.2/logs/yarn-ubuntu-nodemanager-ubuntu.out
ubuntu@ubuntu:~$ jps
6452 DataNode
6962 NodeManager
6824 ResourceManager
7063 Jps
6310 NameNode
6667 SecondaryNameNode
Note:
5 Hadoop Daemons:
6452 DataNode
6962 NodeManager
6824 ResourceManager
6310 NameNode
6667 SecondaryNameNode
The following 2 daemons are optional in pseudo-distributed mode:
WebAppProxy
Map Reduce Job History Server
All 5 Hadoop daemons should be up and running for successful execution on the Hadoop/YARN environment.
If the NameNode or DataNode fails to start on a fresh install, format the NameNode (note: this wipes existing HDFS metadata and data):
ubuntu@ubuntu:~$ hdfs namenode -format
(The DataNode has no '-format' option; if it refuses to start after a NameNode format because of a cluster ID mismatch, clear the configured dfs.data.dir directory instead.)
hdfs-site.xml
-------------
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>file:///home/ubuntu/softwares/hadoop-2.7.2/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:///home/ubuntu/softwares/hadoop-2.7.2/hdfs/datanode</value>
  </property>
</configuration>
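Note: the commands below address HDFS as hdfs://localhost:9000; that URI comes from fs.defaultFS in core-site.xml (a minimal sketch, not shown in the original gist):
core-site.xml
-------------
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>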
Note: If you get an 'org.apache.hadoop.hive.ql.exec.mr.MapRedTask. PermGen space' error, change the value of the 'mapred.child.java.opts' property:
mapred-site.xml
---------------
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2048m</value>
  </property>
</configuration>
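Note: for MapReduce jobs to run on YARN in pseudo-distributed mode, yarn-site.xml also needs the shuffle auxiliary service (a minimal sketch, assumed here since the original gist does not show this file):
yarn-site.xml
-------------
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>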
Note:
Before shutting down the VM, it is recommended to stop all daemons:
ubuntu@ubuntu:~$ stop-dfs.sh
ubuntu@ubuntu:~$ stop-yarn.sh
ubuntu@ubuntu:~$ hdfs dfs -ls /
ubuntu@ubuntu:~$ hdfs dfs -copyFromLocal /home/ubuntu/input/input /
ubuntu@ubuntu:~$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x - ubuntu supergroup 0 2016-02-11 23:42 /input
ubuntu@ubuntu:~$ hdfs dfs -ls /input
Found 5 items
-rw-r--r-- 1 ubuntu supergroup 421 2016-02-11 23:42 /input/cust_credit.txt
-rw-r--r-- 1 ubuntu supergroup 421 2016-02-11 23:42 /input/cust_credit~
-rw-r--r-- 1 ubuntu supergroup 209 2016-02-11 23:42 /input/cust_debit.txt
-rw-r--r-- 1 ubuntu supergroup 209 2016-02-11 23:42 /input/cust_debit~
-rw-r--r-- 1 ubuntu supergroup 61 2016-02-11 23:42 /input/raw.txt
ubuntu@ubuntu:~$ hdfs dfs -ls /input/raw.txt
-rw-r--r-- 1 ubuntu supergroup 61 2016-02-11 23:42 /input/raw.txt
ubuntu@ubuntu:~$ hdfs dfs -cat /input/raw.txt
1,sriram,2015-10-12
2,seeta,2015-09-12
3,lakshman,2015-11-12
Note:
'-put' can be used instead of '-copyFromLocal':
ubuntu@ubuntu:~$ hdfs dfs -put /home/ubuntu/input/input/raw_log.txt /input
To remove an HDFS folder:
ubuntu@ubuntu:~$ hdfs dfs -rm -r hdfs://localhost:9000/input
ubuntu@ubuntu:~$ hive
(verbose log4j configuration trace omitted; it shows the DRFA appender writing to /home/ubuntu/softwares/hive-1.2.1/log/hive.log)
Logging initialized using configuration in file:/home/ubuntu/softwares/hive-1.2.1/conf/hive-log4j.properties
hive>
hive> create schema thanooj;
OK
Time taken: 0.16 seconds
hive> show schemas;
OK
default
test
thanooj
Time taken: 0.048 seconds, Fetched: 3 row(s)
hive> use thanooj;
OK
Time taken: 0.017 seconds
hive> create table thanooj.cust_credit (ID INT, NAME STRING, doa STRING, location STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
OK
Time taken: 1.267 seconds
hive> drop table cust_credit;
OK
Time taken: 1.6 seconds
hive> create EXTERNAL table thanooj.cust_credit (ID INT, NAME STRING, doa STRING, location STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
OK
Time taken: 0.114 seconds
hive> LOAD DATA LOCAL INPATH '/home/ubuntu/input/input/cust_credit.txt' OVERWRITE INTO TABLE thanooj.cust_credit;
Loading data to table thanooj.cust_credit
OK
Time taken: 1.047 seconds
hive> select * from thanooj.cust_credit;
OK
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya
8 seeta 2015-09-12 midhila
9 lakshman 2015-11-12 ayodhya
10 bharata 2015-12-12 ayodhya
11 sathrugna 2015-12-12 ayodhya
12 hanuma 2015-11-12 ayodhya
Time taken: 1.967 seconds, Fetched: 12 row(s)
hive> LOAD DATA INPATH '/input/cust_credit.txt' OVERWRITE INTO TABLE thanooj.cust_credit;
Loading data to table thanooj.cust_credit
OK
Time taken: 0.276 seconds
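Note: LOAD DATA LOCAL INPATH copies the file from the local filesystem into the table directory, whereas LOAD DATA INPATH (without LOCAL) moves the file within HDFS, so the source file disappears from /input after the load. A quick check (illustrative, not part of the original session):
ubuntu@ubuntu:~$ hdfs dfs -ls /input/cust_credit.txt
(the file is no longer listed once the non-LOCAL load has moved it into the warehouse directory)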
hive> select * from thanooj.cust_credit;
OK
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya
8 seeta 2015-09-12 midhila
9 lakshman 2015-11-12 ayodhya
10 bharata 2015-12-12 ayodhya
11 sathrugna 2015-12-12 ayodhya
12 hanuma 2015-11-12 ayodhya
Time taken: 0.196 seconds, Fetched: 12 row(s)
hive> LOAD DATA INPATH '/input/cust_debit.txt' OVERWRITE INTO TABLE thanooj.cust_debit;
Loading data to table thanooj.cust_debit
OK
Time taken: 0.269 seconds
hive> select * from thanooj.cust_debit;
OK
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
Time taken: 0.106 seconds, Fetched: 6 row(s)
hive> ! hadoop fs -ls /input;
Found 4 items
-rw-r--r-- 1 ubuntu supergroup 421 2016-02-11 23:42 /input/cust_credit~
-rw-r--r-- 1 ubuntu supergroup 209 2016-02-11 23:42 /input/cust_debit~
-rw-r--r-- 1 ubuntu supergroup 61 2016-02-11 23:42 /input/raw.txt
-rw-r--r-- 1 ubuntu supergroup 61 2016-02-11 23:47 /input/raw_log.txt
hive> ! hadoop fs -ls /home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit;
Found 1 items
-rwx-wx-wx 1 ubuntu supergroup 421 2016-03-03 23:12 /home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit/cust_credit.txt
hive> ! hadoop fs -cat /home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit/cust_credit.txt;
1,sriram,2015-10-12,ayodhya,55000
2,seeta,2015-09-12,midhila,50000
3,lakshman,2015-11-12,ayodhya,35000
4,bharata,2015-12-12,ayodhya,25000
5,sathrugna,2015-12-12,ayodhya,20000
6,hanuma,2015-11-12,ayodhya,25000
7,sriram,2015-10-12,ayodhya,55000
8,seeta,2015-09-12,midhila,50000
9,lakshman,2015-11-12,ayodhya,35000
10,bharata,2015-12-12,ayodhya,25000
11,sathrugna,2015-12-12,ayodhya,20000
12,hanuma,2015-11-12,ayodhya,25000
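Note: cust_credit.txt carries a fifth comma-separated field (an amount), but thanooj.cust_credit declares only four columns, so Hive silently ignores the trailing field on read. To expose all five fields over the same files, an external table could be declared with the extra column (a sketch; the table and column names are illustrative, not part of the original session):
hive> create EXTERNAL table thanooj.cust_credit_full (id INT, name STRING, doa STRING, location STRING, amount INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '/home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit/';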
hive>
hive> select * from thanooj.cust_debit;
OK
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
Time taken: 0.145 seconds, Fetched: 6 row(s)
hive> select * from thanooj.cust_credit;
OK
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya
8 seeta 2015-09-12 midhila
9 lakshman 2015-11-12 ayodhya
10 bharata 2015-12-12 ayodhya
11 sathrugna 2015-12-12 ayodhya
12 hanuma 2015-11-12 ayodhya
Time taken: 0.117 seconds, Fetched: 12 row(s)
hive> select cc.* from cust_credit cc left join cust_debit cd on cc.id=cd.id;
OK
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya
8 seeta 2015-09-12 midhila
9 lakshman 2015-11-12 ayodhya
10 bharata 2015-12-12 ayodhya
11 sathrugna 2015-12-12 ayodhya
12 hanuma 2015-11-12 ayodhya
Time taken: 38.756 seconds, Fetched: 12 row(s)
hive> select * from cust_credit cc left join cust_debit cd on cc.id=cd.id;
OK
1 sriram 2015-10-12 ayodhya 1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila 2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya 3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya 4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya 5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya 6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya NULL NULL NULL NULL
8 seeta 2015-09-12 midhila NULL NULL NULL NULL
9 lakshman 2015-11-12 ayodhya NULL NULL NULL NULL
10 bharata 2015-12-12 ayodhya NULL NULL NULL NULL
11 sathrugna 2015-12-12 ayodhya NULL NULL NULL NULL
12 hanuma 2015-11-12 ayodhya NULL NULL NULL NULL
Time taken: 32.4 seconds, Fetched: 12 row(s)
hive> select * from cust_credit cc left join cust_debit cd on cc.id=cd.id where cd.id is NULL;
OK
7 sriram 2015-10-12 ayodhya NULL NULL NULL NULL
8 seeta 2015-09-12 midhila NULL NULL NULL NULL
9 lakshman 2015-11-12 ayodhya NULL NULL NULL NULL
10 bharata 2015-12-12 ayodhya NULL NULL NULL NULL
11 sathrugna 2015-12-12 ayodhya NULL NULL NULL NULL
12 hanuma 2015-11-12 ayodhya NULL NULL NULL NULL
Time taken: 34.767 seconds, Fetched: 6 row(s)
hive> select * from cust_credit cc left outer join cust_debit cd on cc.id=cd.id;
OK
1 sriram 2015-10-12 ayodhya 1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila 2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya 3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya 4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya 5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya 6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya NULL NULL NULL NULL
8 seeta 2015-09-12 midhila NULL NULL NULL NULL
9 lakshman 2015-11-12 ayodhya NULL NULL NULL NULL
10 bharata 2015-12-12 ayodhya NULL NULL NULL NULL
11 sathrugna 2015-12-12 ayodhya NULL NULL NULL NULL
12 hanuma 2015-11-12 ayodhya NULL NULL NULL NULL
Time taken: 32.588 seconds, Fetched: 12 row(s)
hive> select * from cust_credit cc left outer join cust_debit cd on cc.id=cd.id where cd.id is NULL;
OK
7 sriram 2015-10-12 ayodhya NULL NULL NULL NULL
8 seeta 2015-09-12 midhila NULL NULL NULL NULL
9 lakshman 2015-11-12 ayodhya NULL NULL NULL NULL
10 bharata 2015-12-12 ayodhya NULL NULL NULL NULL
11 sathrugna 2015-12-12 ayodhya NULL NULL NULL NULL
12 hanuma 2015-11-12 ayodhya NULL NULL NULL NULL
Time taken: 32.235 seconds, Fetched: 6 row(s)
hive> select cc.* from cust_credit cc left outer join cust_debit cd on cc.id=cd.id where cd.id is NULL;
OK
7 sriram 2015-10-12 ayodhya
8 seeta 2015-09-12 midhila
9 lakshman 2015-11-12 ayodhya
10 bharata 2015-12-12 ayodhya
11 sathrugna 2015-12-12 ayodhya
12 hanuma 2015-11-12 ayodhya
Time taken: 32.314 seconds, Fetched: 6 row(s)
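Note: LEFT JOIN and LEFT OUTER JOIN are the same operation in Hive; adding WHERE cd.id IS NULL turns it into an anti-join (credit rows with no matching debit). The complement, credit rows that do have a matching debit, can be written with LEFT SEMI JOIN (a sketch, not run in the original session):
hive> select cc.* from cust_credit cc left semi join cust_debit cd on cc.id = cd.id;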
ubuntu@ubuntu:~$ hdfs dfs -put /home/ubuntu/input/input/cust_current.txt /input
ubuntu@ubuntu:~$ hdfs dfs -ls /input
Found 3 items
-rw-r--r-- 1 ubuntu supergroup 72 2016-02-12 02:05 /input/cust_current.txt
-rw-r--r-- 1 ubuntu supergroup 61 2016-02-11 23:42 /input/raw.txt
-rw-r--r-- 1 ubuntu supergroup 61 2016-02-11 23:47 /input/raw_log.txt
hive> create table thanooj.cust_current (ID INT, NAME STRING, doa STRING, location STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
OK
Time taken: 1.812 seconds
hive> LOAD DATA INPATH '/input/cust_current.txt' OVERWRITE INTO TABLE thanooj.cust_current;
Loading data to table thanooj.cust_current
OK
Time taken: 0.974 seconds
hive> select * from thanooj.cust_current;
OK
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
Time taken: 1.697 seconds, Fetched: 2 row(s)
hive> select * from cust_credit cc join cust_debit cd on cc.id=cd.id join cust_current ccu on cc.id=ccu.id;
Total MapReduce CPU Time Spent: 2 seconds 50 msec
OK
4 bharata 2015-12-12 ayodhya 4 bharata 2015-12-12 ayodhya 4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya 5 sathrugna 2015-12-12 ayodhya 5 sathrugna 2015-12-12 ayodhya
Time taken: 39.927 seconds, Fetched: 2 row(s)
hive> set hive.cli.print.current.db=true;
hive (thanooj)> set hive.cli.print.header=true;
hive (thanooj)> select a.id cust_id, case when a.name=b.id then a.name else b.location end as cust_name from cust_credit a join cust_debit b on ( a.id = b.id );
Total MapReduce CPU Time Spent: 2 seconds 130 msec
OK
cust_id cust_name
1 ayodhya
2 midhila
3 ayodhya
4 ayodhya
5 ayodhya
6 ayodhya
Time taken: 25.493 seconds, Fetched: 6 row(s)
hive (thanooj)> select a.id cust_id, case when a.name=b.name then a.name else b.location end as case_col from cust_credit a join cust_debit b on ( a.id = b.id );
Total MapReduce CPU Time Spent: 1 seconds 840 msec
OK
cust_id case_col
1 sriram
2 seeta
3 lakshman
4 bharata
5 sathrugna
6 hanuma
Time taken: 31.488 seconds, Fetched: 6 row(s)
hive> drop table thanooj.cust_credit_ext;
OK
Time taken: 0.896 seconds
hive> CREATE EXTERNAL TABLE thanooj.cust_credit_ext (id int, name string, doa string, location string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '/home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit/';
OK
Time taken: 0.064 seconds
hive> desc cust_credit_ext;
OK
id int
name string
doa string
location string
Time taken: 0.09 seconds, Fetched: 4 row(s)
hive> select * from thanooj.cust_credit_ext;
OK
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya
8 seeta 2015-09-12 midhila
9 lakshman 2015-11-12 ayodhya
10 bharata 2015-12-12 ayodhya
11 sathrugna 2015-12-12 ayodhya
12 hanuma 2015-11-12 ayodhya
Time taken: 0.294 seconds, Fetched: 12 row(s)
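Note: dropping an EXTERNAL table such as cust_credit_ext removes only the metastore entry; the files under its LOCATION are left in place, whereas dropping a managed table also deletes its warehouse directory.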
hive> select * from cust_credit cc join cust_debit cd on cc.id<>cd.id;
FAILED: SemanticException [Error 10017]: Line 1:51 Both left and right aliases encountered in JOIN 'id'
hive> select * from cust_credit cc join cust_debit cd on cc.id != cd.id;
FAILED: SemanticException [Error 10017]: Line 1:51 Both left and right aliases encountered in JOIN 'id'
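Note: Hive 1.x only accepts equality conditions in a JOIN ... ON clause, which is why both forms above fail. A non-equi comparison can instead be written as a cross join filtered in the WHERE clause (a sketch, not run in the original session; it materializes a full Cartesian product first, so it is expensive):
hive> select * from cust_credit cc cross join cust_debit cd where cc.id <> cd.id;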
hive> select id, collect_set(location)[0] as c_location from thanooj.cust_credit group by id;
Query ID = ubuntu_20160303231512_dd77dba6-8284-4b6f-9f71-68c9e68844ff
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1457070465534_0004, Tracking URL = http://ubuntu:8088/proxy/application_1457070465534_0004/
Kill Command = /home/ubuntu/softwares/hadoop-2.7.2/bin/hadoop job -kill job_1457070465534_0004
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-03-03 23:15:24,829 Stage-1 map = 0%, reduce = 0%
2016-03-03 23:16:05,390 Stage-1 map = 100%, reduce = 0%
2016-03-03 23:16:16,297 Stage-1 map = 0%, reduce = 0%
2016-03-03 23:17:16,694 Stage-1 map = 0%, reduce = 0%
2016-03-03 23:17:24,082 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.15 sec
2016-03-03 23:17:32,035 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.3 sec
MapReduce Total cumulative CPU time: 3 seconds 300 msec
Ended Job = job_1457070465534_0004
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 3.3 sec HDFS Read: 8635 HDFS Write: 123 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 300 msec
OK
id c_location
1 ayodhya
2 midhila
3 ayodhya
4 ayodhya
5 ayodhya
6 ayodhya
7 ayodhya
8 midhila
9 ayodhya
10 ayodhya
11 ayodhya
12 ayodhya
Time taken: 140.36 seconds, Fetched: 12 row(s)
hive> select id, location as c_location from thanooj.cust_credit group by id;
FAILED: SemanticException Line 0:-1 Expression not in GROUP BY key 'location'
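Note: every column in the SELECT list must either appear in the GROUP BY clause or be wrapped in an aggregate function, which is why the collect_set query succeeds and the plain one fails. An equivalent using max (a sketch):
hive> select id, max(location) as c_location from thanooj.cust_credit group by id;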
hive> INSERT OVERWRITE LOCAL DIRECTORY '/home/ubuntu/test/' SELECT * from thanooj.cust_credit;
Query ID = ubuntu_20160318053609_42c6d022-d4f2-4736-9e3d-3c55b0acfe0f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there is no reduce operator
Starting Job = job_1458300898384_0002, Tracking URL = http://ubuntu:8088/proxy/application_1458300898384_0002/
Kill Command = /home/ubuntu/softwares/hadoop-2.7.2/bin/hadoop job -kill job_1458300898384_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2016-03-18 05:36:19,267 Stage-1 map = 0%, reduce = 0%
2016-03-18 05:36:37,026 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.51 sec
MapReduce Total cumulative CPU time: 1 seconds 510 msec
Ended Job = job_1458300898384_0002
Copying data to local directory /home/ubuntu/test
Copying data to local directory /home/ubuntu/test
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 1.51 sec HDFS Read: 3928 HDFS Write: 349 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 510 msec
OK
Time taken: 28.393 seconds
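Note: by default the exported files use Hive's internal field delimiter (Ctrl-A, '\001') rather than commas. From Hive 0.11 onwards the row format can be given in the statement itself (a sketch, not run in the original session):
hive> INSERT OVERWRITE LOCAL DIRECTORY '/home/ubuntu/test/' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' SELECT * from thanooj.cust_credit;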
--sequence
hive> create table cust_credit_sec (id int, name string, doa string, location string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' stored as SEQUENCEFILE;
OK
Time taken: 0.412 seconds
hive> insert overwrite table cust_credit_sec select * from cust_credit;
Query ID = ubuntu_20160318065702_c4eb35c4-79aa-4d10-9773-be8c9a4058b6
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there is no reduce operator
Starting Job = job_1458300898384_0007, Tracking URL = http://ubuntu:8088/proxy/application_1458300898384_0007/
Kill Command = /home/ubuntu/softwares/hadoop-2.7.2/bin/hadoop job -kill job_1458300898384_0007
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2016-03-18 06:57:11,419 Stage-1 map = 0%, reduce = 0%
2016-03-18 06:57:27,226 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.12 sec
MapReduce Total cumulative CPU time: 2 seconds 120 msec
Ended Job = job_1458300898384_0007
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://localhost:9000/home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_sec/.hive-staging_hive_2016-03-18_06-57-02_702_1026185545452775762-1/-ext-10000
Loading data to table thanooj.cust_credit_sec
Table thanooj.cust_credit_sec stats: [numFiles=1, numRows=12, totalSize=580, rawDataSize=337]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 2.12 sec HDFS Read: 4336 HDFS Write: 660 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 120 msec
OK
Time taken: 25.934 seconds
hive> ! hadoop fs -cat /home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_sec/000000_1;
SEQ"org.apache.hadoop.io.BytesWritableorg.apache.hadoop.io.Texte%6�
a�?�d� 1,sriram,2015-10-12,ayodhya2,seeta,2015-09-12,midhila"3,lakshman,2015-11-12,ayodhya!4,bharata,2015-12-12,ayodhya#5,sathrugna,2015-12-12,ayodhya 6,hanuma,2015-11-12,ayodhya ,sriram,2015-10-12,ayodhya8,seeta,2015-09-12,midhila"9,lakshman,2015-11-12,ayodhya"10,bharata,2015-12-12,ayodhya$11,sathrugna,2015-12-12,ayodhya!12,hanuma,2015-11-12,ayodhya
hive> ! hadoop fs -text /home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_sec/000000_1;
1,sriram,2015-10-12,ayodhya
2,seeta,2015-09-12,midhila
3,lakshman,2015-11-12,ayodhya
4,bharata,2015-12-12,ayodhya
5,sathrugna,2015-12-12,ayodhya
6,hanuma,2015-11-12,ayodhya
7,sriram,2015-10-12,ayodhya
8,seeta,2015-09-12,midhila
9,lakshman,2015-11-12,ayodhya
10,bharata,2015-12-12,ayodhya
11,sathrugna,2015-12-12,ayodhya
12,hanuma,2015-11-12,ayodhya
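Note: 'hadoop fs -cat' dumps the raw bytes of the SequenceFile (hence the SEQ header and binary noise above), while 'hadoop fs -text' recognizes the SequenceFile container and prints the decoded records.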
hive> show create table thanooj.cust_credit_sec;
OK
CREATE TABLE `thanooj.cust_credit_sec`(
`id` int,
`name` string,
`doa` string,
`location` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.SequenceFileInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION
'hdfs://localhost:9000/home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_sec'
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='true',
'numFiles'='1',
'numRows'='12',
'rawDataSize'='337',
'totalSize'='376',
'transient_lastDdlTime'='1458308265')
Time taken: 0.055 seconds, Fetched: 20 row(s)
hive> describe extended thanooj.cust_credit_sec;
OK
id int
name string
doa string
location string
Detailed Table Information Table(tableName:cust_credit_sec, dbName:thanooj, owner:ubuntu, createTime:1458307933, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:null), FieldSchema(name:name, type:string, comment:null), FieldSchema(name:doa, type:string, comment:null), FieldSchema(name:location, type:string, comment:null)], location:hdfs://localhost:9000/home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_sec, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{numFiles=1, COLUMN_STATS_ACCURATE=true, transient_lastDdlTime=1458308265, numRows=12, totalSize=376, rawDataSize=337}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
Time taken: 0.075 seconds, Fetched: 6 row(s)
hive> describe formatted thanooj.cust_credit_sec;
OK
# col_name data_type comment
id int
name string
doa string
location string
# Detailed Table Information
Database: thanooj
Owner: ubuntu
CreateTime: Fri Mar 18 06:32:13 PDT 2016
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://localhost:9000/home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_sec
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles 1
numRows 12
rawDataSize 337
totalSize 376
transient_lastDdlTime 1458308265
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.SequenceFileInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
serialization.format 1
Time taken: 0.062 seconds, Fetched: 34 row(s)
hive> SET hive.exec.compress.output=true;
hive> SET mapred.max.split.size=256000000;
hive> SET mapred.output.compression.type=BLOCK;
hive> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec; -- or org.apache.hadoop.io.compress.GzipCodec
hive> SET hive.exec.dynamic.partition.mode=nonstrict;
hive> SET hive.exec.dynamic.partition=true;
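Note: the last two settings enable dynamic partitioning, which is not otherwise exercised in this gist. A minimal sketch of how it would be used (the partitioned table and the insert are illustrative, not part of the original session):
hive> create table thanooj.cust_credit_part (id int, name string, doa string) PARTITIONED BY (location string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
hive> insert overwrite table thanooj.cust_credit_part PARTITION (location) select id, name, doa, location from thanooj.cust_credit;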
--PARQUET
hive> create table cust_credit_parquet (id int, name string, doa string, location string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS PARQUET;
OK
Time taken: 0.325 seconds
hive> insert overwrite table cust_credit_parquet select * from cust_credit;
Query ID = ubuntu_20160328061221_6d885d6a-cd20-40f5-842d-09025df628c4
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there is no reduce operator
Starting Job = job_1459170227047_0001, Tracking URL = http://ubuntu:8088/proxy/application_1459170227047_0001/
Kill Command = /home/ubuntu/softwares/hadoop-2.7.2/bin/hadoop job -kill job_1459170227047_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2016-03-28 06:12:32,197 Stage-1 map = 0%, reduce = 0%
2016-03-28 06:12:53,069 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.44 sec
MapReduce Total cumulative CPU time: 2 seconds 440 msec
Ended Job = job_1459170227047_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://localhost:9000/home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_parquet/.hive-staging_hive_2016-03-28_06-12-21_397_2155480560380901326-1/-ext-10000
Loading data to table thanooj.cust_credit_parquet
Table thanooj.cust_credit_parquet stats: [numFiles=1, numRows=12, totalSize=769, rawDataSize=48]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 2.44 sec HDFS Read: 4451 HDFS Write: 852 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 440 msec
OK
Time taken: 34.265 seconds
hive> select * from thanooj.cust_credit_parquet;
OK
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya
8 seeta 2015-09-12 midhila
9 lakshman 2015-11-12 ayodhya
10 bharata 2015-12-12 ayodhya
11 sathrugna 2015-12-12 ayodhya
12 hanuma 2015-11-12 ayodhya
Time taken: 0.073 seconds, Fetched: 12 row(s)
hive> ! hadoop fs -cat /home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_parquet/000000_2;
PAR1ll,

��L
sriramseetlakshmanbharata sathrugnahanuma,srirambharata��"
ppL
2015-10-12
2015-09-12
2015-11-12
2015-12-12,
2015-12-12
2015-09-12�K�,,Layodhyamidhila,midhilaayodhya�\H
hive_schema%id
%name%
%doa%
%location%Lid��<
&�
name��&�<srirambharata&�
doa��&�<
2015-12-12
2015-09-12&�
location��&�<midhilaayodhya�(parquet-mr version 1.6.0HPAR1
hive> ! hadoop fs -text /home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_parquet/000000_2;
PAR1ll,

��L
sriramseetlakshmanbharata sathrugnahanuma,srirambharata��"
ppL
2015-10-12
2015-09-12
2015-11-12
2015-12-12,
2015-12-12
2015-09-12�K�,,Layodhyamidhila,midhilaayodhya�\H
hive_schema%id
%name%
%doa%
%location%Lid��<
&�
name��&�<srirambharata&�
doa��&�<
2015-12-12
2015-09-12&�
location��&�<midhilaayodhya�(parquet-mr version 1.6.0HPAR1
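Note: unlike SequenceFiles, 'hadoop fs -text' has no decoder for Parquet, so both -cat and -text print raw binary here; to inspect the contents, query the table through Hive or use an external utility such as parquet-tools.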
--AVRO SERDE
hive> use thanooj;
OK
Time taken: 0.009 seconds
hive> set hive.exec.compress.output=true;
hive> SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
hive> CREATE TABLE cust_credit_avro (id int, name string, doa string, location string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS AVRO;
OK
Time taken: 0.093 seconds
hive> show create table cust_credit_avro;
OK
CREATE TABLE `cust_credit_avro`(
`id` int COMMENT '',
`name` string COMMENT '',
`doa` string COMMENT '',
`location` string COMMENT '')
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
'hdfs://localhost:9000/home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_avro'
TBLPROPERTIES (
'transient_lastDdlTime'='1459321615')
Time taken: 0.059 seconds, Fetched: 16 row(s)
hive> describe extended cust_credit_avro;
OK
id int
name string
doa string
location string
Detailed Table Information Table(tableName:cust_credit_avro, dbName:thanooj, owner:ubuntu, createTime:1459321615, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:null), FieldSchema(name:name, type:string, comment:null), FieldSchema(name:doa, type:string, comment:null), FieldSchema(name:location, type:string, comment:null)], location:hdfs://localhost:9000/home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_avro, inputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.avro.AvroSerDe, parameters:{serialization.format=,, field.delim=,, line.delim=
}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{transient_lastDdlTime=1459321615}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
Time taken: 0.11 seconds, Fetched: 7 row(s)
hive> insert overwrite table cust_credit_avro select * from cust_credit;
Query ID = ubuntu_20160330000846_a0e2bc20-9ceb-4276-9a3b-199ec5bdd4aa
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there is no reduce operator
Starting Job = job_1459320620757_0002, Tracking URL = http://ubuntu:8088/proxy/application_1459320620757_0002/
Kill Command = /home/ubuntu/softwares/hadoop-2.7.2/bin/hadoop job -kill job_1459320620757_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2016-03-30 00:08:54,040 Stage-1 map = 0%, reduce = 0%
2016-03-30 00:09:00,304 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.97 sec
MapReduce Total cumulative CPU time: 1 seconds 970 msec
Ended Job = job_1459320620757_0002
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://localhost:9000/home/ubuntu/softwares/hive-1.2.1/warehouse/thanooj.db/cust_credit_avro/.hive-staging_hive_2016-03-30_00-08-46_861_1164406953482225834-1/-ext-10000
Loading data to table thanooj.cust_credit_avro
Table thanooj.cust_credit_avro stats: [numFiles=1, numRows=12, totalSize=502, rawDataSize=0]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 1.97 sec HDFS Read: 5000 HDFS Write: 565 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 970 msec
OK
Time taken: 14.809 seconds
hive> select * from cust_credit_avro;
OK
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya
8 seeta 2015-09-12 midhila
9 lakshman 2015-11-12 ayodhya
10 bharata 2015-12-12 ayodhya
11 sathrugna 2015-12-12 ayodhya
12 hanuma 2015-11-12 ayodhya
Time taken: 0.133 seconds, Fetched: 12 row(s)
--hivevar
hive> show tables;
OK
cust_credit
cust_credit_ext
cust_credit_sec
Time taken: 0.32 seconds, Fetched: 3 row(s)
hive> set hivevar:tableName=cust_credit;
hive> select * from thanooj.${tableName};
OK
1 sriram 2015-10-12 ayodhya
2 seeta 2015-09-12 midhila
3 lakshman 2015-11-12 ayodhya
4 bharata 2015-12-12 ayodhya
5 sathrugna 2015-12-12 ayodhya
6 hanuma 2015-11-12 ayodhya
7 sriram 2015-10-12 ayodhya
8 seeta 2015-09-12 midhila
9 lakshman 2015-11-12 ayodhya
10 bharata 2015-12-12 ayodhya
11 sathrugna 2015-12-12 ayodhya
12 hanuma 2015-11-12 ayodhya
Time taken: 0.99 seconds, Fetched: 12 row(s)
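Note: hivevar variables can also be passed on the command line, which makes parameterized scripts easy to run (a sketch, not part of the original session):
ubuntu@ubuntu:~$ hive --hivevar tableName=cust_credit -e 'select * from thanooj.${tableName};'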
hive> exit;
ubuntu@ubuntu:~$ stop-yarn.sh
stopping yarn daemons
stopping resourcemanager
localhost: stopping nodemanager
localhost: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
no proxyserver to stop
ubuntu@ubuntu:~$ stop-dfs.sh
16/03/04 05:49:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
16/03/04 05:50:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ubuntu@ubuntu:~$
Comment from @thanoojgithub (author):
Yes, you will have to put the file in a directory and then create an external table on top of it. As per the documentation: an EXTERNAL TABLE points to any HDFS location for its storage, rather than being stored in the folder specified by the configuration property hive.metastore.warehouse.dir.

Even if you create an internal (managed) table, Hive by default creates a directory for it inside hive.metastore.warehouse.dir; the same behavior applies when creating an external table, except that the default directory is not used.
