The included SQL and CloudFormation YAML snippets create AWS Glue tables for access logs from Application Load Balancers, which can be queried using Amazon Athena. The snippets are intended to be equivalent.
The tables that are created utilize partition projections, which allows for efficient, cost-effective querying of the log data in Amazon S3, without the need to manual create new partitions. Instead, the partitions are created based on the filesnames of the log files. Using the year, month, and day portions of the filenames, daily partitions are available when querying the log data.
Date-based partitions have a specific range in the configuration, which should be set to something appropriate for your use case.
When querying a table that includes the partitions, the SELECT
should include a WHERE
clause that filters data covered by the range of the partition. For example:
SELECT *
FROM __DATABASE_NAME__.__TABLE_NAME__
WHERE "date" >= date_format(current_date - interval '7' day, '%Y/%m/%d')
The date
field in the WHERE
clause matches the date
projection configured on the table. The name of this projection is arbitrary.
Note that while ELB access logs include a time
field, you must still use the date
projection fields to utilize partitioning. You can, for example, use both the date
and time
fields in a WHERE clause to access certain partitions, and also filter more specifically by the time.