Last active March 31, 2021 18:04
How to count the number of tombstones per partition key in one or more sstables.
#!/bin/bash
#
# Counts number of tombstones per partition key in one or multiple sstables.
#
# Usage: ./tombstone-count.sh /var/lib/cassandra/data/mykeyspace/mytable/*-Data.db
#
# Sample output:
#   "40e6a9839bf44bdaa624cc53e96733fe" 8
#   "8e177ab222c14f868bcb6d2922b18d2b" 8
#   "28aaa9db0dad4ae78cabe8bcc25d14a3" 9
#   "8367c6c14d8e4ccdbd14e85d4a7d3b1f" 9
#   "ecaf2f2409b24fa990a18e79f05b4b30" 12
#   "3294ffc4dad44853b675dfdb34911576" 13
# (partition keys without any tombstone(s) are not printed).
# Get `jq` here: http://stedolan.github.io/jq/download/
# ltrim taken from http://stackoverflow.com/a/27158086/260805
# The various stages below:
#  1. List the sstable file(s) passed on the command line.
#  2. Convert each sstable to JSON with sstable2json.
#  3. Count tombstones (columns flagged "t") per partition key.
#  4. Convert from JSON to comma-separated "key, count" lines, dropping keys without tombstones.
#  5. Sum the counts of partition keys that appear more than once.
#  6. Sort by tombstone count, ascending (partition keys with the most tombstones come last).
ls "$@" \
  | xargs --verbose -L 1 sstable2json \
  | jq '.[] | {key: .key, length: [.columns[] | select(.[3]=="t")] | length }' \
  | awk -F: 'function ltrim(s) { sub(/^[ \t\r\n]+/, "", s); return s } /"key"/ {key=$2;} /"length"/ && $2>0 {print ltrim(key), ltrim($2);}' \
  | awk -F, '!($1 in myarr) { myarr[$1]=0 } {myarr[$1] += $2;} END {for(i in myarr) print i, myarr[i];}' \
  | sort -n -k 2
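
To see what stages 4 to 6 do without a Cassandra node at hand, the two awk commands and the sort can be run against hand-written input. The sketch below is only an illustration: `sample_jq_output` and `summarize` are hypothetical helper names, and the sample keys and counts are made up to resemble the pretty-printed objects the jq stage would emit. The awk and sort commands themselves are copied unchanged from the script.

```shell
#!/bin/bash
# Hypothetical stand-in for the output of the jq stage: one object per row,
# pretty-printed the way jq prints by default. The key "aaaa" appears twice
# (as it would if it were present in two sstables); "bbbb" has no tombstones.
sample_jq_output() {
  cat <<'EOF'
{
  "key": "aaaa",
  "length": 2
}
{
  "key": "bbbb",
  "length": 0
}
{
  "key": "aaaa",
  "length": 3
}
{
  "key": "cccc",
  "length": 1
}
EOF
}

# Stages 4-6 of the script, unchanged:
#  - first awk: remember the key, print `"key", count` when count > 0
#  - second awk: split on the comma and sum the counts per key
#  - sort: order numerically by the count in the second column
summarize() {
  awk -F: 'function ltrim(s) { sub(/^[ \t\r\n]+/, "", s); return s } /"key"/ {key=$2;} /"length"/ && $2>0 {print ltrim(key), ltrim($2);}' \
    | awk -F, '!($1 in myarr) { myarr[$1]=0 } {myarr[$1] += $2;} END {for(i in myarr) print i, myarr[i];}' \
    | sort -n -k 2
}

sample_jq_output | summarize
```

With this input the two "aaaa" rows are merged into a single line with a count of 5, "bbbb" is dropped because it has no tombstones, and "cccc" sorts first because it has the lowest count. The first awk splits on ":", so a partition key that itself contains a colon would not survive this stage intact.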
Find an updated version for Cassandra 3.0.x at https://gist.github.com/fholzer/d6b7f1ce98906b5730cae67c179e0dd2