Skip to content

Instantly share code, notes, and snippets.

@thanoojgithub
Last active February 15, 2016 09:21
Show Gist options
  • Select an option

  • Save thanoojgithub/972cbc9aea6307453af1 to your computer and use it in GitHub Desktop.

Select an option

Save thanoojgithub/972cbc9aea6307453af1 to your computer and use it in GitHub Desktop.
JSON file into Hive table using SerDe
hive> LIST jars;
hive-hcatalog-core-1.2.1.jar
hive> DELETE JAR hive-hcatalog-core-1.2.1.jar;
Deleted [hive-hcatalog-core-1.2.1.jar] from class path
hive> ADD JAR /home/ubuntu/hive-1.2.1/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar;
Added [/home/ubuntu/hive-1.2.1/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar] to class path
Added resources: [/home/ubuntu/hive-1.2.1/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar]
hive> drop table thanooj.EmpJson;
OK
Time taken: 0.02 seconds
hive> CREATE TABLE thanooj.EmpJson ( name STRING, salary FLOAT, mobile BIGINT, location STRING) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
OK
Time taken: 1.776 seconds
hive> LOAD DATA LOCAL INPATH '/home/ubuntu/input/EmpJson.txt' OVERWRITE INTO TABLE THANOOJ.EmpJson;
Loading data to table thanooj.EmpJson
Table thanooj.EmpJson stats: [numFiles=1, numRows=0, totalSize=233, rawDataSize=0]
OK
Time taken: 0.212 seconds
hive> select * from thanooj.EmpJson;
OK
sriram 5000.0 1234567890 ayodhya
seetha 4500.0 9876543210 midhila
lakshman 2000.0 5432167890 ayodhya
Time taken: 0.07 seconds, Fetched: 3 row(s)
json data:EmpJson.txt
---------------------
{"name": "sriram","salary": 5000,"mobile": 1234567890,"location": "ayodhya"}
{"name": "seetha","salary": 4500,"mobile": 9876543210,"location": "midhila"}
{"name": "lakshman","salary": 2000,"mobile": 5432167890,"location": "ayodhya"}
## complex type JSON::
---------------------
hive> CREATE TABLE thanooj.employeesCol ( name STRUCT<first:STRING,last:STRING>, salary FLOAT, mobile ARRAY<BIGINT>, address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
hive> LOAD DATA LOCAL INPATH '/home/ubuntu/input/empcol.txt' OVERWRITE INTO TABLE THANOOJ.employeesCol;
hive> set hive.cli.print.header=true;
hive> select name.first, name.last, salary, mobile[0], mobile[1], address.street, address.city, address.state, address.zip from thanooj.employeescol;
OK
first last salary _c3 _c4 street city state zip
John Doe 100000.0 1234567890 9876543210 1 Michigan Ave. Chicago IL 60600
Time taken: 0.061 seconds, Fetched: 1 row(s)
hive> select name.first, name.last, salary, mobile[0] as primaryMobile, mobile[1] as secondaryMobile, address.street, address.city, address.state, address.zip from thanooj.employeescol;
OK
first last salary primarymobile secondarymobile street city state zip
John Doe 100000.0 1234567890 9876543210 1 Michigan Ave. Chicago IL 60600
Time taken: 0.044 seconds, Fetched: 1 row(s)
hive>
JSON data : empcol.txt
----------------------
{"name": {"first":"John","last":"Doe"},"salary": 100000, "mobile": [1234567890, 9876543210], "address": {"street": "1 Michigan Ave.", "city": "Chicago", "state": "IL", "zip": 60600}}
@thanoojgithub
Copy link
Copy Markdown
Author

simple JSON data into Hive table using SerDe

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment