@jimathyp
Last active June 19, 2021 21:46
AWS Glue

Ran a crawler on bucket/folder/

Problem: the crawler created a table for every file in that folder (thousands)

~ $ aws-vault exec <role> -- aws athena list-data-catalogs
{
    "DataCatalogsSummary": [
        {
            "CatalogName": "AwsDataCatalog",
            "Type": "GLUE"
        }
    ]
}

https://awscli.amazonaws.com/v2/documentation/api/latest/reference/athena/list-data-catalogs.html

~ $ aws-vault exec <role> -- aws athena list-databases --catalog-name AwsDataCatalog
{
    "DatabaseList": [
        {
            "Name": "default"
        },
        {
            "Name": "raw"
        }
    ]
}

https://awscli.amazonaws.com/v2/documentation/api/latest/reference/athena/list-databases.html

aws athena get-database --catalog-name AwsDataCatalog --database-name raw
{
    "Database": {
        "Name": "raw"
    }
}
sudo apt install jq
https://stedolan.github.io/jq/tutorial/
aws glue get-tables --database-name 'raw' --page-size 10 --max-items 10 | jq '.TableList[].Name'
aws glue get-tables --database-name 'raw' --page-size 10 --max-items 100 | jq '.TableList[].Name'
aws glue delete-table --database-name .... --name ....
https://docs.aws.amazon.com/cli/latest/reference/glue/batch-delete-table.html
https://awscli.amazonaws.com/v2/documentation/api/latest/reference/glue/batch-delete-table.html
https://awscli.amazonaws.com/v2/documentation/api/latest/reference/glue/delete-table.html

aws glue batch-delete-table --database-name .... --tables-to-delete "..." "..." "..."
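
With thousands of tables, typing names into batch-delete-table by hand is not practical. The following is a rough sketch of one way to chain the commands above: list the table names, then delete them in batches. The function names `list_tables` and `delete_tables_in_batches` are made up for this note. The default batch size of 100 assumes the documented limit on `--tables-to-delete` per call. It uses the CLI's `--query` option instead of jq so it has no extra dependency. Treat it as a starting point and review the list of names before deleting anything.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the AWS CLI.

# Print every table name in a Glue database, one per line.
# --output text separates names with tabs, so convert tabs to newlines.
list_tables() {
  aws glue get-tables --database-name "$1" \
    --query 'TableList[].Name' --output text | tr '\t' '\n'
}

# Read table names from stdin (one per line) and delete them in batches.
# $1 = database name, $2 = batch size (assumed API limit: 100 per call).
delete_tables_in_batches() {
  database="$1"
  batch_size="${2:-100}"
  set --   # reuse the positional parameters to hold the current batch
  while IFS= read -r name; do
    [ -n "$name" ] || continue   # skip blank lines
    set -- "$@" "$name"
    if [ "$#" -ge "$batch_size" ]; then
      # </dev/null keeps the command from consuming the list on stdin
      aws glue batch-delete-table --database-name "$database" \
        --tables-to-delete "$@" </dev/null
      set --
    fi
  done
  # Delete whatever is left in the final, partial batch.
  if [ "$#" -gt 0 ]; then
    aws glue batch-delete-table --database-name "$database" \
      --tables-to-delete "$@" </dev/null
  fi
}

# Example (not run here): save the names, check them, then delete.
#   list_tables raw > tables.txt
#   delete_tables_in_batches raw < tables.txt
```

Writing the names to a file first gives a chance to inspect or filter them (for example with grep) so that only the unwanted crawler tables are removed.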
