@msilvey
Created July 12, 2013 19:03
A loop that removes staging directories older than six hours. It is part of the workaround for the bug tracked here: https://issues.apache.org/jira/browse/MAPREDUCE-5351
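#!/bin/bash
# Remove job staging directories that are more than six hours old.
HADOOPBIN="/usr/bin/hadoop"
STAGING="/user/root/.staging"
SIXHOURSAGO=$(( $(date +%s) - 21600 ))

# Each listing line holds: permissions, replication, owner, group, size, date, time, path.
"$HADOOPBIN" fs -ls "$STAGING/" | while IFS= read -r line; do
  FILE=$(echo "$line" | awk '{print $8}')
  # Skip the "Found N items" header and anything not below the staging directory.
  case "$FILE" in "$STAGING"/?*) ;; *) continue ;; esac
  JOBDATE=$(echo "$line" | awk '{print $6" "$7}')
  JOBTS=$(date --date="$JOBDATE" +%s) || continue
  if [ "$SIXHOURSAGO" -gt "$JOBTS" ]; then
    # Redirect stdin so the delete cannot consume the rest of the listing.
    "$HADOOPBIN" fs -rm -r "$FILE" < /dev/null
  fi
done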
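The age check in the loop can be pulled out into a small function, which makes it possible to try it on a sample listing line without touching HDFS. This is only a minimal sketch, assuming GNU `date` and the usual eight-column `hadoop fs -ls` output; the function name `is_older_than` and the sample line (including its job ID) are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Succeeds (exit 0) when the modification time in a listing line
# is older than the given cutoff (epoch seconds).
is_older_than() {
  local line="$1" cutoff="$2" stamp ts
  # Columns 6 and 7 hold the date and time, e.g. "2013-07-12 19:03".
  stamp=$(echo "$line" | awk '{print $6" "$7}')
  # Lines without a timestamp (such as the "Found N items" header) never qualify.
  [ "$stamp" != " " ] || return 1
  # Lines whose columns 6 and 7 are not a parseable date never qualify either.
  ts=$(date --date="$stamp" +%s 2>/dev/null) || return 1
  [ "$cutoff" -gt "$ts" ]
}

# Illustrative listing line; the job ID is invented.
SAMPLE='drwx------   - root supergroup          0 2013-07-12 19:03 /user/root/.staging/job_0001'
CUTOFF=$(( $(date +%s) - 21600 ))
if is_older_than "$SAMPLE" "$CUTOFF"; then
  echo "older than six hours"
fi
```

Keeping the check separate from the delete means the cutoff logic can be verified on its own before any `-rm -r` is run.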
@prash0704

Hi, I need your help with the above script.

I used your script to help with the same issue, which I have right now, but my lead wants the path hard-coded so that the script can never end up deleting the whole filesystem in the future.
Can you please help?

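```
NOW=$(date +%s)
ONEWEEKAGO=$(( NOW - 604800 ))
HADOOPBIN="/usr/bin/hdfs"
HDFSPATH="/mnt/hadoop/user/user"
TMPPATH="/appdata/tmp"

# One entry per user directory; plain ls avoids the "total N" header of ls -l.
for y in $(ls "$HDFSPATH"); do
  "$HADOOPBIN" dfs -ls "/user/$y/.staging/" | while IFS= read -r i; do
    FILE=$(echo "$i" | awk '{print $8}')
    # Hard-coded guard: only paths strictly below this user's .staging directory qualify.
    # This also skips the "Found N items" header, where FILE is empty.
    if [[ "$FILE" != "/user/$y/.staging/"?* ]]; then
      echo "no match"
      continue
    fi
    JOBDATE=$(echo "$i" | awk '{print $6" "$7}')
    JOBTS=$(date --date="$JOBDATE" +%s) || continue
    if [[ $ONEWEEKAGO -gt $JOBTS ]]; then
      # The path lives in HDFS, so test it with "hdfs dfs -test", not the local [ -f ].
      if "$HADOOPBIN" dfs -test -d "$FILE" < /dev/null; then
        # "hdfs fs" is not a valid subcommand; use "hdfs dfs" for the delete as well.
        "$HADOOPBIN" dfs -rm -r "$FILE" < /dev/null
      fi
    fi
  done
done
```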

Here I am trying to delete the old staging directories for all users.
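One way to hard-code the safety check asked about above is to put the path test in a function that refuses anything not strictly below a user's `.staging` directory. What follows is a minimal sketch rather than a tested fix for this cluster: the function name `is_staging_path` and the user name `alice` are illustrative assumptions, and the actual delete is left as a comment.

```shell
#!/bin/bash
# Hypothetical guard, not from the original gist.
# Succeeds only for paths of the form /user/<name>/.staging/<entry>.
is_staging_path() {
  local path="$1"
  # <name> and <entry> must each be a single non-empty path component.
  local re='^/user/[^/]+/\.staging/[^/]+$'
  # Conservatively reject anything containing "..", so a path can never climb out.
  case "$path" in
    *..*) return 1 ;;
  esac
  [[ "$path" =~ $re ]]
}

# Try the guard on a valid entry, the bare staging directory, the root, and an empty string.
for candidate in "/user/alice/.staging/job_0001" "/user/alice/.staging/" "/" ""; do
  if is_staging_path "$candidate"; then
    echo "ok:   $candidate"   # safe to pass to: hdfs dfs -rm -r "$candidate"
  else
    echo "skip: $candidate"
  fi
done
```

Because an empty string, `/`, and the bare `.staging` directory are all rejected, a malformed listing line cannot turn into a delete of anything above a single staging entry.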