Skip to content

Instantly share code, notes, and snippets.

@JoyBrad
Last active August 24, 2022 07:41
Show Gist options
  • Save JoyBrad/c1442196b48e5efcad138125bc080463 to your computer and use it in GitHub Desktop.
Save JoyBrad/c1442196b48e5efcad138125bc080463 to your computer and use it in GitHub Desktop.
Shell script to fix weird characters in folder/filenames and remove duplicate entries
#!/bin/bash
# This script was created for fixing zip files downloaded from using (https://github.com/dulldusk/phpfm/)
# This is a known issue but not fixed yet (https://github.com/dulldusk/phpfm/issues/27)
# The problem I have faced so far is that the php program replaces all _ characters with some unicode character ž and that
# all folders are accompanied by extra files with the same name (as mentioned in the issue page also)
# Requires: bash, 7z, zip
# Usage: make it executable ofcourse, and
# ./fix-duplicate-names-in-zip.sh test.zip
# and chillax. It will perform required actions on folders/files inside the zip itself.
# For safer side, backup your zip before running it.
# You can modify the script according to your need if you observe some other problem
# with the zip.
renamelist=renamelist.txt
>| $renamelist
rencount=0
replaceChar(){
# replaces all 'ž' (/u009e) characters from folder/file names with '_' character
filelist=`7z -slt l $1 | grep -oP "(?<=Path = ).+" | tail -n +2 | grep $'\u009e' | sort`
IFS=$'\n'
for f in $filelist
do
fbasename=`basename $f`
fdirpath=`dirname $f`
newpath=''
newname=''
if echo "$fbasename" | grep $'\u009e'; then
rencount=$((rencount+1))
newname=`echo $fbasename | sed 's/\xc2\x9e/_/g'`
if [ ! -z "$fdirpath" ] && [ "$fdirpath" != "." ]; then
newpath=`echo $fdirpath | sed 's/\xc2\x9e/_/g'` #works with 7z
newpath="$newpath/"
fi
echo $newpath$fbasename >> $renamelist
echo $newpath$newname >> $renamelist
fi
done
renameContent=`cat $renamelist`
if [ ! -z "$renameContent" ]; then
echo -e "\e[7;38;5;34mRenaming files and folders to fix ž characters..\e[0m"
7z rn $1 @$renamelist
sleep 1
else
echo -e "\e[7;38;5;34mNothing to rename..\e[0m"
return
fi
if 7z -slt l $1 | grep -oP "(?<=Path = ).+" | tail -n +2 | grep $'\u009e'; then
replaceChar $1
fi
}
replaceChar $1
echo -e "\e[7;38;5;34mTotal $rencount files/folders renamed..\e[0m"
# exit
removeRedund(){
# removes extra files with same names accompanying folders
#works fine with either of below
filelist=`zipinfo -1 $1 | sort` #preserves order
# filelist=`7z -slt l $1 | grep -oP "(?<=Path = ).+" | tail -n +2 | sort`
echo -e "\e[7;38;5;34mRemoving duplicate entries of files named same as folder names recursively..\e[0m"
IFS=$'\n'
for file in $filelist
do
fdir=`dirname $file`
flist=`echo "$filelist" | grep $fdir | sort`
if [ ! -z "$fdir" ]; then
echo -en "\r\033[KChecking $file... "
fi
for fname in $flist
do
if [[ "$fname" =~ "${file}/" ]]; then
zip -d $1 "$file" -UN=UTF8
break
fi
done
done
}
removeRedund $1
echo ""
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment