Last active
January 24, 2017 18:35
impala
1. Import the CSV file into HDFS:
export HADOOP_USER_NAME=impala
./hadoop fs -put /Users/gennadii/dev/performance_impala.csv hdfs://<host>:<port>/user/impala/performance_impala.csv
2. Create an Impala data source in IDEA using the Impala JDBC driver (add all the jars from the zip).
3. Create the connection:
jdbc:impala://<host>:<port>
4. Create the table (same name as the file):
create table performance_impala(
field_string_1 string,
field_integer_1 int,
field_double_1 double,
field_date_1 timestamp)
row format delimited
fields terminated by ',';
5. Load the data from the CSV file (use the absolute HDFS path):
LOAD DATA INPATH '/user/impala/performance_impala.csv' INTO TABLE performance_impala;
6. Check:
select * from performance_impala limit 10;
7. Skip the first (header) row:
alter table performance_impala set tblproperties('skip.header.line.count'='1');
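
Steps 1 and 4 to 7 above could also be collected in one shell script, roughly as sketched here. This is only a sketch under assumptions: it runs the SQL through impala-shell (with its -i and -q options) instead of the IDEA data source, it assumes hadoop and impala-shell are on the PATH, and the names IMPALAD, NAMENODE, LOCAL_CSV, DRY_RUN, run and main are hypothetical, introduced just for this example. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: IMPALAD, NAMENODE, LOCAL_CSV, DRY_RUN, run() and main() are
# hypothetical names for this example, not part of the original steps.
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print the commands, 0 = execute them
IMPALAD="${IMPALAD:-<host>:<port>}"     # impalad address for impala-shell (the port may differ from the JDBC one)
NAMENODE="${NAMENODE:-<host>:<port>}"   # HDFS namenode address
LOCAL_CSV="${LOCAL_CSV:-/Users/gennadii/dev/performance_impala.csv}"
TABLE=performance_impala
HDFS_PATH="/user/impala/$TABLE.csv"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

main() {
  export HADOOP_USER_NAME=impala
  # Step 1: copy the CSV file into HDFS
  run hadoop fs -put "$LOCAL_CSV" "hdfs://$NAMENODE$HDFS_PATH"
  # Step 4: create the table
  run impala-shell -i "$IMPALAD" -q "create table $TABLE (field_string_1 string, field_integer_1 int, field_double_1 double, field_date_1 timestamp) row format delimited fields terminated by ','"
  # Step 5: load the CSV from its absolute HDFS path
  run impala-shell -i "$IMPALAD" -q "LOAD DATA INPATH '$HDFS_PATH' INTO TABLE $TABLE"
  # Step 6: check
  run impala-shell -i "$IMPALAD" -q "select * from $TABLE limit 10"
  # Step 7: skip the header row
  run impala-shell -i "$IMPALAD" -q "alter table $TABLE set tblproperties('skip.header.line.count'='1')"
}

main
```

To actually execute the commands, set DRY_RUN=0 and supply real addresses, for example by exporting IMPALAD and NAMENODE before running the script.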