Skip to content

Instantly share code, notes, and snippets.

@subelsky
Created April 18, 2014 17:39
Show Gist options
  • Save subelsky/11055667 to your computer and use it in GitHub Desktop.
Save subelsky/11055667 to your computer and use it in GitHub Desktop.
Quick SQL command to find large tables in redshift
-- based on http://stackoverflow.com/questions/21767780/how-to-find-size-of-database-schema-table-in-redshift
SELECT name AS table_name, ROUND((COUNT(*) / 1024.0),2) as "Size in Gigabytes"
FROM stv_blocklist
INNER JOIN
(SELECT DISTINCT id, name FROM stv_tbl_perm) names
ON names.id = stv_blocklist.tbl
GROUP BY name
ORDER BY "Size in Gigabytes" DESC
@digdeep
Copy link

digdeep commented Nov 24, 2014

Very useful. Thanks :)

@Promise-W
Copy link

pgshard=# SELECT name AS table_name, ROUND((COUNT(*) / 1024.0),2) as "Size in Gigabytes"
pgshard-# FROM stv_blocklist
pgshard-# INNER JOIN
pgshard-# (SELECT DISTINCT id, name FROM stv_tbl_perm) names
pgshard-# ON names.id = stv_blocklist.tbl
pgshard-# GROUP BY name
pgshard-# ORDER BY "Size in Gigabytes" DESC
pgshard-# ;
table_name | size in gigabytes
--------------------------------------------------------------------------+-------------------
archive_listing_per_day | 32.59
archive_listing_per_day_1 | 11.95
listing_1 | 0.86
(3 rows)

in this way,i have one question : the table archive_listing_per_day is copy of archive_listing_per_day_1,
but size of archive_listing_per_day is so amazing bigger than archive_listing_per_day_1 ? can you tell me why?

PS: archive_listing_per_day_1 is using command copy from amaze S3 to redshift,
and archive_listing_per_day is using select * into archive_listing_per_day from archive_listing_per_day_1;

@Promise-W
Copy link

forget ~ , ^_^,your sql is very useful,thank !

@bfleming-ciena
Copy link

bfleming-ciena commented Jul 15, 2016

Show the database name

SELECT name AS table_name, datname, ROUND((COUNT(*) / 1024.0),2) as "Size in Gigabytes"
FROM stv_blocklist
INNER JOIN
(SELECT DISTINCT id, name, datname, db_id FROM stv_tbl_perm
INNER JOIN
pg_database on stv_tbl_perm.db_id = pg_database.oid
) names
ON names.id = stv_blocklist.tbl
GROUP BY name, datname
ORDER BY "Size in Gigabytes" DESC

@vigneshprajapati
Copy link

Very useful!! Thanks @subelsky

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment