@cptcanuck
Last active January 25, 2022 16:51
How to set up CloudTrail Lake

My Setup Guide

Step 1:

  • Click 'Create event data store'
  • Enter the name for the datastore - ex: LakeTrail-Organization-ManagementEvents
  • Set the retention period - the default is 7 years (2555 days)
  • If you only want the event data store to cover your current region (this doesn't really follow AWS best practices), check "Include only the current region"
    • You will need to flip to every region and enable the Lake manually if that's how you want to segment things
  • If you want to enable it for your whole organization, select "Enable for all accounts ..."
  • Add any tags you need, and click Next

Step 2:

  • Select the types of events you want to capture - the defaults are sane
  • Click Next

Step 3:

  • Review your configuration, and if you're happy, click Create event data store

This will take some time - the status will be 'Creation in progress' for a bit. Grab a sip of coffee.

Once the state is 'Enabled', everything is set up. You will need to wait a bit though - just like regular CloudTrail, the service takes many minutes to deliver logs, and it does not import existing Trail data.
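The same three console steps can probably be scripted. Below is a minimal sketch, assuming boto3 is installed and credentials are configured; the helper names `build_event_data_store_request` and `create_event_data_store` are made up for this example. It leaves out the event selectors, which I believe falls back to the console's default of management events, but treat that as an assumption.

```python
def build_event_data_store_request(name, retention_days=2555,
                                   all_regions=True, organization=False,
                                   tags=None):
    """Build keyword arguments for CloudTrail's CreateEventDataStore call."""
    request = {
        "Name": name,
        "RetentionPeriod": retention_days,    # Step 1: retention in days
        "MultiRegionEnabled": all_regions,    # False = current region only
        "OrganizationEnabled": organization,  # True = all accounts in the org
    }
    if tags:
        request["TagsList"] = [{"Key": k, "Value": v} for k, v in tags.items()]
    return request


def create_event_data_store(request, client=None):
    """Send the request; needs boto3 and AWS credentials when client is None."""
    if client is None:
        import boto3  # imported lazily so the builder works without boto3
        client = boto3.client("cloudtrail")
    return client.create_event_data_store(**request)
```

Calling `create_event_data_store(build_event_data_store_request("LakeTrail-Organization-ManagementEvents", organization=True))` would mirror the console choices above. The data store still takes a while to reach 'Enabled'.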

Querying the data

The query editor is ok - I wish they'd just used Athena, it's more functional and follows a more standard UI design

In the editor, you can start with something simple just to see how much data you have:

select count(*) from <insert ugly data store ID>

select * from <why isn't this just the name of the event store?>

Note - when you hit Run, the 'Command Output' table populates with a summary of the run. To get the actual results you need to click on the Query Results tab, which is to the left (why) of the Command Output tab.

You can also use one of the pre-canned queries by clicking on the Sample queries tab, and then clicking on the Query SQL (why not the query name?)

This will create a new tab in the editor view, populated with a query that should get you going.

Unfortunately, they didn't build the examples to use today's date so that they would just work. You need to update the dates yourself; if you forget, you'll be querying the past, which still gets you a Successful query even though you likely don't have data from 2021-07-22 (the default datetime).
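One way around the stale dates is to generate the query text outside the console. This is a rough sketch rather than anything the service provides: `recent_events_query` is a hypothetical helper, the columns are only an illustration, and it assumes the `eventTime > 'YYYY-MM-DD HH:MM:SS'` comparison style used by the sample queries.

```python
from datetime import datetime, timedelta, timezone


def recent_events_query(event_data_store_id, days_back=1, now=None):
    """Return a query covering the last `days_back` days up to `now` (UTC)."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=days_back)
    fmt = "%Y-%m-%d %H:%M:%S"
    return (
        "SELECT eventSource, eventName, COUNT(*) AS events "
        f"FROM {event_data_store_id} "          # the data store ID, not its name
        f"WHERE eventTime > '{start.strftime(fmt)}' "
        f"AND eventTime < '{now.strftime(fmt)}' "
        "GROUP BY eventSource, eventName"
    )
```

Paste the returned string into the editor, passing your own data store ID as the first argument.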

  • A really cool feature is the Results History - you can see all the queries that were run, and their results, so you don't have to run the same query again if you forgot to snag the output, or want to see what something returned yesterday vs today

Disabling service

  • Login as

IAM permissions required

I haven't been able to figure out everything, but at least:

  • ListEventDataStores
  • GetEventDataStore
  • GetQueryResults
  • StartQuery
  • DescribeQuery
  • CreateEventDataStore

Some Organizations permissions are needed too, but I haven't figured out which ones yet.

I need to figure out more, as the default 'ReadOnly' role doesn't have sufficient permissions to run queries.
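As a starting point, the actions listed above can be turned into a policy document. This is a sketch only: it assumes the standard `cloudtrail:` action prefix, the `Sid` and the helper name `lake_query_policy` are made up, and the list is known to be incomplete (the Organizations permissions are still missing).

```python
import json

# The actions listed above; the cloudtrail: prefix is added below.
LAKE_ACTIONS = [
    "ListEventDataStores",
    "GetEventDataStore",
    "GetQueryResults",
    "StartQuery",
    "DescribeQuery",
    "CreateEventDataStore",
]


def lake_query_policy(resource="*"):
    """Return an IAM policy document (as a dict) allowing the listed actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CloudTrailLakeBasics",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": [f"cloudtrail:{action}" for action in LAKE_ACTIONS],
                "Resource": resource,
            }
        ],
    }


print(json.dumps(lake_query_policy(), indent=2))
```

For a query-only role, dropping CreateEventDataStore from the list would be the obvious tightening.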

Notes/Issues/Thoughts

  • CloudTrail Lake does not create a trail in the CloudTrail service

    • This means that Config rules that validate that CloudTrail is enabled may not work
    • CloudTrail Insights will not work
    • The normal CloudTrail UI will not work for investigations
  • When you delete an Event Store, it sits in Pending Deletion status for some time, and you can do Actions -> Restore on it

  • CloudTrail Lake does not create an S3 bucket you can interact with in any way

  • There doesn't appear to be any way to see how much data you're storing

  • Unlike Athena, you cannot click on the field names to populate them in your query, but at least they're easy to copy/paste

  • It would be nice to have a button that updates date fields to match today's date, or the ability to use relative dates or a variable like $todays_date, so that a saved query just works

  • I can't find a way to validate that everything is enabled properly in organization accounts

    • Looking in CloudTrail, I can't find any events in the member accounts indicating that it's turned on
    • A workaround: run select count(*), recipientAccountId from <id> group by recipientAccountId to see all the accounts that sent CloudTrail data
  • Clicking the cog in the results and trying to turn columns on/off doesn't work

  • No way to export data to CSV

  • You can only have 5 Query tabs open in the UI - this is too few.

  • Seems like saved searches are not shared between users - stored as cookies? How do we build a library of useful searches for people to use? Having a Tab of 'Common Searches' we can populate with examples would be huge

  • How can I schedule queries against this data to do something like reporting? (e.g. run daily to find out which services hit throttling, to help us find problems)

  • Failed jobs only show status: failed. It would be nice if there were more detail about why
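For the CSV export and scheduled-query gaps noted above, the query API calls from the IAM list (StartQuery, DescribeQuery, GetQueryResults) look like enough to script something. This is a sketch I haven't run against a real account. It assumes a boto3 CloudTrail client is passed in, and that results come back as rows of single-entry `{column: value}` dicts. The helper names are hypothetical, and some API versions may also want an event data store argument on the describe and results calls.

```python
import csv
import io
import time


def rows_to_csv(query_result_rows):
    """Flatten rows shaped like [[{"col": "val"}, ...], ...] into CSV text."""
    out = io.StringIO()
    writer = None
    for row in query_result_rows:
        record = {}
        for cell in row:
            record.update(cell)  # each cell is a one-entry {column: value} dict
        if writer is None:
            # Column order comes from the first row.
            writer = csv.DictWriter(out, fieldnames=list(record))
            writer.writeheader()
        writer.writerow(record)
    return out.getvalue()


def run_query(client, statement, poll_seconds=2.0):
    """Start a query, wait for it to finish, and return all result rows."""
    query_id = client.start_query(QueryStatement=statement)["QueryId"]
    while True:
        status = client.describe_query(QueryId=query_id)["QueryStatus"]
        if status == "FINISHED":
            break
        if status in ("FAILED", "CANCELLED", "TIMED_OUT"):
            raise RuntimeError(f"query {query_id} ended with status {status}")
        time.sleep(poll_seconds)
    rows, token = [], None
    while True:
        kwargs = {"QueryId": query_id}
        if token:
            kwargs["NextToken"] = token  # follow pagination until exhausted
        page = client.get_query_results(**kwargs)
        rows.extend(page.get("QueryResultRows", []))
        token = page.get("NextToken")
        if not token:
            return rows
```

Run from cron or a scheduled Lambda, `rows_to_csv(run_query(client, "select ..."))` would give a daily report file. That would also cover the missing shared library of saved searches, since the queries could live in version control.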
