@cdimartino
Last active December 17, 2015 17:19
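#!/bin/bash
# Extract the row keys from every Cassandra SSTable under a data directory,
# gzip them in parallel, and sync the results to S3.
# Abort with a usage message when the data directory argument is missing.
cass_data_dir=${1:?"usage: $0 CASSANDRA_DATA_DIR"}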
backup_dir=/var/lib/cassandra/key_extract
s3_upload_loc=s3://spokes.data-axle.infogroup.com/admin/place_directory/index_rebuild/cass_extract_reindex-201306041450/
#processors=`cat /proc/cpuinfo | grep processor | wc -l`
processors=8
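# Count running sstablekeys processes. -f matches the full command line;
# without it pgrep compares only the process name, which never contains a path.
process_count() {
    pgrep -f /usr/lib/cassandra/bin/sstablekeys | wc -l
}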
for keyspace in $(ls "$cass_data_dir"); do
    i=0
    mkdir -p "$backup_dir"
    #rm -f $backup_dir/$keyspace*
    # Data files only, skipping segment files. Word splitting here assumes
    # that the paths contain no whitespace.
    for file in $(find "$cass_data_dir/$keyspace" -maxdepth 2 -name "*Data.db" ! -name "*segment*"); do
        mkdir -p "$backup_dir/$keyspace"
        echo "Processing $file"
        outfile=$backup_dir/$keyspace/$keyspace-$(hostname)-$i.gz
        # Advance the index before the skip check so each file keeps its own
        # output name; otherwise one existing file would skip every later file.
        i=$((i + 1))
        if [ -f "$outfile" ]; then
            continue
        fi
        # Wait for a free slot, then extract the keys in the background.
        while [ "$(process_count)" -ge "$processors" ]; do
            sleep 1
        done
        (/usr/lib/cassandra/bin/sstablekeys "$file" 2>/dev/null | gzip > "$outfile") &
    done
done
# Block until every background extraction started above has finished.
wait
echo "Syncing $backup_dir to $s3_upload_loc"
s3cmd sync "$backup_dir/" "$s3_upload_loc"
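
The throttling loop above polls the process table once a second. As an alternative, here is a minimal sketch that lets `xargs -P` cap the number of concurrent extractions. It assumes GNU `find` and `xargs`. `extract_keys_parallel` is a hypothetical helper, not part of the original script, and it names each output after the input file's basename rather than a running index, which is also an assumption.

```bash
# Hypothetical helper (not in the original script).
# Reads NUL-delimited file names on stdin and runs
#   EXTRACT_CMD FILE | gzip > OUT_DIR/<basename of FILE>.gz
# for each one, with at most MAX_JOBS running at the same time.
extract_keys_parallel() {
    local extract_cmd="$1" out_dir="$2" max_jobs="$3"
    mkdir -p "$out_dir"
    # Inside sh -c: $1 = extract command, $2 = output dir, $3 = input file.
    xargs -0 -r -n 1 -P "$max_jobs" sh -c '
        out="$2/$(basename "$3").gz"
        # Skip inputs that already have an output file.
        [ -f "$out" ] || "$1" "$3" 2>/dev/null | gzip > "$out"
    ' sh "$extract_cmd" "$out_dir"
}

# Possible usage with the variables from the script above:
#   find "$cass_data_dir" -maxdepth 3 -name "*Data.db" ! -name "*segment*" -print0 |
#       extract_keys_parallel /usr/lib/cassandra/bin/sstablekeys "$backup_dir" "$processors"
```

Because `xargs` returns only after all of its children have exited, no separate wait loop is needed before the `s3cmd sync` step. Basename-based output names stay stable across re-runs, but they would collide if two keyspaces held files with the same name.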