Skip to content

Instantly share code, notes, and snippets.

@lfdesousa
Last active August 13, 2018 07:01
Show Gist options
  • Save lfdesousa/94c1f66fd8387c548b9f77a49e36f791 to your computer and use it in GitHub Desktop.
Save lfdesousa/94c1f66fd8387c548b9f77a49e36f791 to your computer and use it in GitHub Desktop.
Week 4 work for DW - UOL
-- Week4
CREATE TABLE flights_ls(year smallint, month smallint, day smallint, carrier varchar(80) distkey, origin char(3), dest char(3), aricraft_code char(3), miles int, departures int, minutes int, seats int, passangers int, freight_pounds int);
COPY flights_ls
FROM 's3://us-west-2-aws-training/awsu-spl/spl17-redshift/static/data/flights-usa'
credentials 'aws_iam_role=...'
GZIP
DELIMITER ','
REMOVEQUOTES;
COMMIT;
SELECT COUNT(*) FROM flights_ls;
SELECT *
FROM flights_ls
ORDER BY random()
LIMIT 10;
SELECT carrier,SUM (departures) FROM flights_ls GROUP BY carrier ORDER BY 2 DESC LIMIT 10;
ANALYZE COMPRESSION flights_ls;
create table airports_ls(
airport_code char(3) sortkey,
airport varchar(100)
);
COPY airports_ls
FROM 's3://us-west-2-aws-training/awsu-spl/spl17-redshift/static/data/lookup_airports.csv'
credentials 'aws_iam_role=...'
IGNOREHEADER 1
DELIMITER ','
REMOVEQUOTES
TRUNCATECOLUMNS;
COMMIT;
SELECT * FROM airports_ls;
create table vegas_flights_ls
distkey (origin)
sortkey (origin)
as
select flights_ls.*, airport
from flights_ls
join airports_ls on origin = airport_code
where dest = 'LAS';
select airport, to_char(sum(passangers), '999,999,999') as passangers
from vegas_flights_ls
group by airport
order by sum(passangers) desc
limit 10;
@lfdesousa
Copy link
Author

lfdesousa commented Aug 13, 2018

By default copy enables automatic compression, please have a look at the documentation below:
https://docs.aws.amazon.com/redshift/latest/dg/c_Loading_tables_auto_compress.html

Regarding the supported types of compression, we also have very interesting information in:
https://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html

Bearer in mind that if we want to always have compression, that we can also change out table schema to define this upfront, the syntax is depicted below:

CREATE TABLE table_name (column_name 
data_type ENCODE encoding-type)[, ...] 

A word about the tools used to execute all of this code:
https://www.sql-workbench.eu/
On top of this make sure that you get the correct drivers, in this case using jdbc, so they need to be downloaded from:
https://docs.aws.amazon.com/redshift/latest/mgmt/configure-jdbc-connection.html#download-jdbc-driver

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment