@us10096698
Created October 28, 2013 03:06
Japanese Kanji File Encode Converter (from CP932/ISO-2022-JP-1/EUC-JP to UTF-8 with LF)
#!/bin/sh
#
# Japanese Kanji File Encode Converter (to UTF-8)
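# Usage: $ sh ./enc_converter.sh "<FilePath>"
# <FilePath> is a glob pattern; quote it so that the script, not the calling shell, expands it.
# A "~" inside quotes is not expanded, so use $HOME instead.
# Ex) sh ./enc_converter.sh "$HOME/test/*.txt"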
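# Expected output of the 'file' command for each encoding
# (these strings vary between versions of 'file'; adjust them if yours prints something different)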
cp932="Non-ISO extended-ASCII text"
jis="ASCII text"
euc="ISO-8859 text"
utf8="UTF-8 Unicode text"
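# Converter Function
# $1 = file path, $2 = source encoding, $3 = target encoding
conv() {
    echo "[$1] convert from [$2] to [$3]"
    # Convert into a temporary file first so a failed iconv never clobbers the original.
    if iconv -f "$2" -t "$3" < "$1" > "$1.tmp"; then
        # Strip carriage returns (CRLF -> LF), then replace the original.
        tr -d '\r' < "$1.tmp" > "$1.tmp2" && mv "$1.tmp2" "$1"
    else
        echo "[$1] conversion failed; file left unchanged" >&2
    fi
    rm -f "$1.tmp" "$1.tmp2"
}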
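# Main Loop
# $1 is left unquoted on purpose so that the shell expands the glob pattern.
for f in $1
do
    # Skip anything that is not a regular file (including an unmatched pattern).
    [ -f "$f" ] || continue
    # -b omits the file name so it cannot accidentally match a pattern below.
    str=$(file -b "$f")
    case $str in
        # cp932 must be tested first: its description also contains "ASCII text".
        *"$cp932"* ) conv "$f" "CP932" "UTF-8" ;;
        *"$jis"* ) conv "$f" "ISO-2022-JP-1" "UTF-8" ;;
        *"$euc"* ) conv "$f" "EUC-JP" "UTF-8" ;;
        # Already UTF-8: run it through conv anyway to normalize CRLF to LF.
        *"$utf8"* ) conv "$f" "UTF-8" "UTF-8" ;;
        * ) echo "[$f] unsupported encoding ($str)" ;;
    esac
done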
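
Before running the converter on real files, it may help to confirm that the local iconv handles the two steps the script relies on (re-encoding and CR stripping). The following is only a minimal smoke test, not part of the original script: the sample file name and the variable names (tmpdir, sample, expected, result, cr_count) are made up for illustration, and it assumes your iconv build supports EUC-JP.

```shell
#!/bin/sh
# Hypothetical smoke test: round-trip a known string through EUC-JP with CRLF endings.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"

# UTF-8 bytes of the three kanji for "Japanese language", written as octal
# escapes so this test does not depend on the encoding of the script file itself.
expected=$(printf '\346\227\245\346\234\254\350\252\236')

# Build an EUC-JP sample file that ends with CRLF.
printf '%s\r\n' "$expected" | iconv -f UTF-8 -t EUC-JP > "$sample"

# The same two steps as conv(): decode to UTF-8, then drop carriage returns.
iconv -f EUC-JP -t UTF-8 < "$sample" | tr -d '\r' > "$sample.utf8"

# Read the converted text back and count any carriage returns that survived.
result=$(cat "$sample.utf8")
cr_count=$(tr -cd '\r' < "$sample.utf8" | wc -c | tr -d ' ')

if [ "$result" = "$expected" ] && [ "$cr_count" = "0" ]; then
    echo "round trip OK"
else
    echo "round trip FAILED" >&2
fi

rm -rf "$tmpdir"
```

If the round trip fails, check which encoding names your iconv accepts (for example with iconv -l) before trusting the converter on files you cannot easily restore.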