Skip to content

Instantly share code, notes, and snippets.

@qxj
Created February 23, 2016 10:29
Show Gist options
  • Save qxj/e0c940dc8f44de832b48 to your computer and use it in GitHub Desktop.
Save qxj/e0c940dc8f44de832b48 to your computer and use it in GitHub Desktop.
Clean temporary files generated by hive under hdfs:///tmp
#!/usr/bin/env bash
#
# Copyright (C) 2016 Julian Qian
#
# @file clean_hadoop_tmp.sh
# @author Julian Qian <[email protected]>
# @created 2016-02-23 17:45:55
#
hadoop=hadoop
expired_days=7
now=$(date +%s)
$hadoop fs -du /tmp/hive-* | awk '{print $3}' | while read line; do
file=${line##*/}
date=${file:5:10}
if [[ -n $date ]]; then
days=$(( ($now - $(date +%s -d $date))/60/60/24 ))
if (( $days > $expired_days )); then
dir=$line
echo "remove expired log $dir ..."
sudo -u hdfs $hadoop fs -rm -r -skipTrash $dir
fi
fi
done
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment