#!/bin/bash
TABLE_NAME=$1
# Get id list
aws dynamodb scan --table-name "$TABLE_NAME" | grep ID | awk '{ print $2 }' > /tmp/truncate.list
# Delete from id list
cat /tmp/truncate.list | xargs -IID aws dynamodb delete-item --table-name "$TABLE_NAME" --key '{ "id": { "S": "ID" }}'
# Remove id list
rm /tmp/truncate.list
Hello,
Do you think this will work on a table with 100 million items? And how long would it take to delete them one by one?
This script first downloads all the keys and then deletes the items one by one. It can be made faster with the parallel option of the xargs command (-P), but the initial scan would still be sequential.
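To illustrate the xargs suggestion, here is a minimal sketch of the delete step run in parallel. The function name, the AWS_CMD override, the __KEY__ placeholder and the default of 8 workers are all made up for this example; it still assumes a string key named "id", as in the script above.

```bash
#!/bin/bash
# Sketch: delete the keys listed one per line in a file, several at a time.
# delete_in_parallel, AWS_CMD and __KEY__ are hypothetical names for illustration.
delete_in_parallel() {
  local table_name=$1 list_file=$2 parallelism=${3:-8}
  # -P runs up to $parallelism delete-item calls at once.
  # -I __KEY__ substitutes each input line for __KEY__ in the command.
  xargs -P "$parallelism" -I __KEY__ \
    "${AWS_CMD:-aws}" dynamodb delete-item \
      --table-name "$table_name" \
      --key '{ "id": { "S": "__KEY__" }}' < "$list_file"
}
```

Something like `delete_in_parallel "$TABLE_NAME" /tmp/truncate.list 16` would replace the cat | xargs line. Each delete is still one API call per item, so this only hides latency; it does not reduce the number of requests.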
For a large number of items, it would be better to write a program that uses a DynamoDB parallel scan and BatchWriteItem to delete multiple keys (up to 25) in one API call.
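A rough sketch of that idea, staying in bash: each background job scans one segment and deletes what it finds in batches of 25. The function names are invented for this example, and it assumes a table whose only key is a string attribute named "id" with values that contain no quotes, backslashes or whitespace. It also ignores any UnprocessedItems that batch-write-item returns, which a real program would need to retry.

```bash
#!/bin/bash
# Sketch only: assumes the sole key is a string attribute named "id".

# Build the --request-items JSON for up to 25 keys (the BatchWriteItem limit).
build_delete_request() {
  local table_name=$1; shift
  local entries="" key
  for key in "$@"; do
    entries+="${entries:+,}{\"DeleteRequest\":{\"Key\":{\"id\":{\"S\":\"$key\"}}}}"
  done
  printf '{"%s":[%s]}\n' "$table_name" "$entries"
}

# Scan one segment and delete its items in batches of 25.
delete_segment() {
  local table_name=$1 segment=$2 total_segments=$3
  local batch=() key
  while IFS= read -r key; do
    [ -n "$key" ] || continue
    batch+=("$key")
    if [ "${#batch[@]}" -eq 25 ]; then
      aws dynamodb batch-write-item \
        --request-items "$(build_delete_request "$table_name" "${batch[@]}")"
      batch=()
    fi
  done < <(aws dynamodb scan --table-name "$table_name" \
             --segment "$segment" --total-segments "$total_segments" \
             --projection-expression "#k" \
             --expression-attribute-names '{"#k":"id"}' \
             --query 'Items[].id.S' --output text | tr '\t' '\n')
  # Flush the last, partially filled batch.
  if [ "${#batch[@]}" -gt 0 ]; then
    aws dynamodb batch-write-item \
      --request-items "$(build_delete_request "$table_name" "${batch[@]}")"
  fi
}

# Run one background job per segment, then wait for all of them.
truncate_table() {
  local table_name=$1 total_segments=${2:-4} segment
  for ((segment = 0; segment < total_segments; segment++)); do
    delete_segment "$table_name" "$segment" "$total_segments" &
  done
  wait
}
```

`truncate_table "$TABLE_NAME" 8` would start eight segment workers. The right number of segments depends on the table's capacity, so treat the default of 4 as a placeholder.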
If downtime is not a problem, it should be possible to drop the table and recreate it later, but I have never tried this on such a large table before.
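For reference, a drop-and-recreate could look roughly like the sketch below. It is untested on large tables and makes some loud assumptions: the table has no secondary indexes, streams, TTL or tags that need to be preserved, and it is recreated with on-demand billing regardless of how it was configured before. The function name recreate_table is made up for this example.

```bash
#!/bin/bash
# Sketch: drop a table and recreate it with the same key schema.
# Assumes no secondary indexes and that on-demand billing is acceptable.
recreate_table() {
  local table_name=$1 key_schema attribute_definitions
  # Save the parts of the definition that create-table needs.
  key_schema=$(aws dynamodb describe-table --table-name "$table_name" \
                 --query 'Table.KeySchema' --output json) || return 1
  attribute_definitions=$(aws dynamodb describe-table --table-name "$table_name" \
                            --query 'Table.AttributeDefinitions' --output json) || return 1
  # Drop the table and block until it is really gone.
  aws dynamodb delete-table --table-name "$table_name" || return 1
  aws dynamodb wait table-not-exists --table-name "$table_name" || return 1
  # Recreate it empty and block until it is usable again.
  aws dynamodb create-table --table-name "$table_name" \
    --key-schema "$key_schema" \
    --attribute-definitions "$attribute_definitions" \
    --billing-mode PAY_PER_REQUEST || return 1
  aws dynamodb wait table-exists --table-name "$table_name"
}
```

The table is unavailable between delete-table and the final wait, which is the downtime mentioned above.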
Just FYI: I just released a simple CLI program that does exactly that: Use segments and parallel scan etc to be as fast as possible.
It is very early in development, but feedback is always appreciated.
@k3karthic I've made a small improvement so you can specify the primary key rather than relying on "ID": https://gist.github.com/toshke/d972b56c6273639ace5f62361e1ffac1
It requires jq to be installed, though.
I added on to yours to be able to delete only rows with a given Partition Key prefix and Sort Key
https://gist.github.com/michaelrios/05dbf08efeb2efab86f12013bcb1129f
aws dynamodb scan --table-name "$TABLE_NAME" | jq -c '.Items[]' | tr '\n' '\0' | \
xargs -0 -I{} aws dynamodb delete-item --table-name "$TABLE_NAME" --key '{}'
See --select SPECIFIC_ATTRIBUTES in aws dynamodb scan help to tune the key.
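One possible way to do that tuning is to look up the table's key attribute names and pass them to the scan as a projection, so that each scanned item contains only its key. The helper below is a sketch with a made-up name (key_only_scan_args); it assumes key attribute names without whitespace or quotes, and uses #k0, #k1 as placeholder aliases so that reserved words are not a problem.

```bash
#!/bin/bash
# Sketch: print the scan arguments (one per line) that limit the result
# to the table's key attributes.
key_only_scan_args() {
  local table_name=$1 names name i=0 projection="" aliases=""
  names=$(aws dynamodb describe-table --table-name "$table_name" \
            --query 'Table.KeySchema[].AttributeName' --output text) || return 1
  for name in $names; do
    projection+="${projection:+,}#k$i"
    aliases+="${aliases:+,}\"#k$i\":\"$name\""
    i=$((i + 1))
  done
  printf '%s\n' --projection-expression "$projection" \
                --expression-attribute-names "{$aliases}"
}

# Possible usage (bash 4+ for mapfile):
#   mapfile -t scan_args < <(key_only_scan_args "$TABLE_NAME")
#   aws dynamodb scan --table-name "$TABLE_NAME" "${scan_args[@]}" | jq -c '.Items[]'
```

With the projection in place, each line that jq prints is just the key object, which is the shape delete-item expects for --key.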