@EngineerLabShimazu
Last active October 26, 2022 06:36
Create a table in Athena from a csv file with header stored in S3.
CREATE EXTERNAL TABLE IF NOT EXISTS default.`table`
(
`id` int,
`name` string,
`timestamp` string,
`is_debug` boolean
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
'escapeChar'='\\',
'quoteChar'='\"',
'separatorChar'=','
)
LOCATION 's3://your-bucket-name-xxxxx/your/folder'
TBLPROPERTIES (
'skip.header.line.count'='1'
);
EngineerLabShimazu (Author) commented Nov 25, 2019

https://gist.github.com/GenkiShimazu/a9ffb30e886e9eeeb5bb3684718cc644#file-amazon_athena_create_table-ddl-L5
`timestamp` string

I would like to declare this as `timestamp` timestamp, but if the timestamp format in the data does not match, the query fails with an ERROR 🥺
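
One way to keep the column as `string` and still get a real timestamp is to convert at query time. This is a minimal sketch, assuming the values look like `2019-11-25 06:36:00`; the format pattern is an assumption and has to be adjusted to the actual data. `date_parse` converts the string, and wrapping it in `try` yields NULL for a malformed value instead of failing the whole query. The identifiers are double-quoted because `table` and `timestamp` are keywords in Athena SELECT statements.

```sql
-- Assumed input format: 'YYYY-MM-DD HH:MM:SS'. Change the pattern to match your file.
SELECT
    id,
    name,
    -- NULL when the value does not match the pattern
    try(date_parse("timestamp", '%Y-%m-%d %H:%i:%s')) AS event_time,
    is_debug
FROM "default"."table"
LIMIT 10;
```

Rows where `event_time` comes back NULL point to values that do not fit the assumed pattern.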

EngineerLabShimazu (Author) commented Nov 25, 2019

https://gist.github.com/GenkiShimazu/a9ffb30e886e9eeeb5bb3684718cc644#file-amazon_athena_create_table-ddl-L16
'skip.header.line.count'='1'

If the csv file has a header, this option keeps the header line from being read as data 💪
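
A quick sanity check that the header is really being skipped could look like the query below. It is only a sketch, and it assumes that no real data row contains the literal value `name` in the `name` column.

```sql
-- Counts rows whose name column holds the header text itself.
SELECT count(*) AS header_rows
FROM "default"."table"
WHERE name = 'name';
```

A non-zero count suggests the header line is still being read as a data row.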
