Adding a BOM to UTF-8 encoded text files
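- Syncing a Cocos2D project between macOS and Windows
When the source files of a Cocos2D project are edited with Xcode on macOS, they are saved as UTF-8 without a BOM, and Visual Studio on Windows then reports compile-time errors for them. Adding a BOM to these files resolves the errors and does not cause any problems when compiling in Xcode.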
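Whether a given file already carries the signature can be checked by looking at its first three bytes, which are EF BB BF for UTF-8 with BOM. The following is a minimal sketch; `has_utf8_bom` is a hypothetical helper name and is not part of the scripts on this page.

```bash
#!/bin/bash
# has_utf8_bom is a hypothetical helper, not one of the original scripts.
# It succeeds (exit status 0) when the file starts with the bytes EF BB BF.
has_utf8_bom() {
    local first3
    # Dump the first three bytes as hex and remove the spaces od inserts.
    first3=$(head -c 3 -- "$1" | od -An -tx1 | tr -d ' \n')
    [ "$first3" = "efbbbf" ]
}

# Usage: report the status of every file given on the command line.
for f in "$@"
do
    if has_utf8_bom "$f"
    then
        echo "$f: UTF-8 signature present"
    else
        echo "$f: no UTF-8 signature"
    fi
done
```

This only inspects the first three bytes; it does not verify that the rest of the file is valid UTF-8.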
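This tool is a set of bash scripts and must be run from a bash command line. Make sure the following programs are installed:
- file command -- a standard Unix program that is usually preinstalled
- uconv command -- a command-line tool from ICU (International Components for Unicode)
  - sudo apt install icu-devtools on Ubuntu
  - sudo yum install icu on CentOS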
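The prerequisites can be checked before running anything else. This is a small sketch; `missing_commands` is a hypothetical helper name introduced here for illustration.

```bash
#!/bin/bash
# missing_commands is a hypothetical helper: it prints every command name
# from its arguments that cannot be found on the current PATH.
missing_commands() {
    local cmd
    for cmd in "$@"
    do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

missing=$(missing_commands file uconv)
if [ -n "$missing" ]
then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Please install the following commands first:" $missing >&2
fi
```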
From a bash command line, run
add-bom-for-files-in-folder.sh path-of-files-to-convert
where path-of-files-to-convert is a directory. Every file under that directory that is encoded as UTF-8 without signature is converted to UTF-8 with signature.
This script, find-file-with-encoding.sh, lists the files that have a given encoding. Run find-file-with-encoding.sh -h to see its usage.
#!/bin/bash
# ENCODING=UTF-8Unicodetext
TARGET=.
usage(){
    echo "Find all files with the specified encoding in a directory and all of its subdirectories."
    echo ""
    echo "$0"
    echo "    -h --help            show this message and exit"
    echo "    -e --encoding        encoding to look for    -e=$ENCODING"
    echo "    -l --list-encodings  list every encoding found, with one example file each"
    echo "    -t --target-dir      directory to check      -t=$TARGET"
    echo ""
}
parse_arg() {
    # Options take the form -x=value or --long=value.
    while [ "$1" != "" ]; do
        # Split at the first "=" so that values may themselves contain "=".
        PARAM=${1%%=*}
        VALUE=""
        if [[ "$1" == *=* ]]; then
            VALUE=${1#*=}
        fi
        case $PARAM in
            -h | --help)
                usage
                exit
                ;;
            -e | --encoding)
                ENCODING=$VALUE
                ;;
            -l | --list-encodings)
                LIST="1"
                ;;
            -t | --target-dir)
                TARGET=$VALUE
                ;;
            *)
                echo "ERROR: unknown parameter \"$PARAM\""
                usage
                exit 1
                ;;
        esac
        shift
    done
}
get_type () {
    # Ask file(1) to describe the content, then remove all whitespace so the
    # description becomes a single token such as "UTF-8Unicodetext".
    local info
    info=$(file - < "$1" | cut -d: -f2)
    echo "${info//[[:space:]]/}"
}
declare -A EMap
# Example type strings as returned by get_type (whitespace removed):
# UTF-8Unicode(withBOM)text
# UTF-8Unicodetext
# ASCIItext
# ISO-8859text
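find() {
    local file TYPE
    for file in "$1"/*
    do
        # An unmatched glob stays literal; skip it.
        [ -e "$file" ] || continue
        if [ -d "$file" ]
        then
            # Recurse into non-empty subdirectories with the same encoding filter.
            if [ -n "$(ls -A "$file")" ]
            then
                find "$file" "$2"
            fi
        else
            TYPE=$(get_type "$file")
            # Remember one example file per type for --list-encodings.
            [ -n "$TYPE" ] && EMap[$TYPE]="$file"
            # Print the file only when a filter is set and the type contains it.
            if [ -n "$2" ] && [[ "$TYPE" == *"$2"* ]]
            then
                echo "$file"
            fi
        fi
    done
}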
main(){
    parse_arg "$@"
    find "$TARGET" "$ENCODING"
    # With -l, also print every encoding seen and one example file for each.
    if [ -n "$LIST" ]
    then
        echo ""
        echo "All encodings:"
        for i in "${!EMap[@]}"
        do
            echo "$i: ${EMap[$i]}"
        done
    fi
}
main "$@"
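The script above walks the directory tree with its own recursive shell function. As an alternative, the traversal can be delegated to the standard find command, which also copes with file names that contain spaces. The sketch below is not a replacement for the script: it only checks for the three signature bytes and does not look at the encoding reported by file. `list_files_without_bom` is a hypothetical name.

```bash
#!/bin/bash
# list_files_without_bom is a hypothetical helper, not one of the original scripts.
# It prints every regular file below the directory $1 whose first three bytes
# are not the UTF-8 signature EF BB BF.
list_files_without_bom() {
    local f first3
    # "command find" bypasses any shell function that happens to be named find.
    command find "$1" -type f -print0 |
    while IFS= read -r -d '' f
    do
        first3=$(head -c 3 -- "$f" | od -An -tx1 | tr -d ' \n')
        [ "$first3" = "efbbbf" ] || printf '%s\n' "$f"
    done
}
```

A `-name` test could be added to the find call to restrict the search to source files.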
This script, convert-to-utf8-with-signature.sh, converts the specified file (File) from the source encoding (SourceEncoding, UTF-8 without BOM by default) to UTF-8 with signature.
#!/bin/bash
# echo $*
function get_type () {
INFO=`file - < "$1" | cut -d: -f2`
TYPE=`echo $INFO | cut -d, -f2`
TYPE=`echo $TYPE | sed 's,^ *,,; s, *$,,'`
echo "$TYPE"
}
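function trim_string () {
    # Strip leading and trailing whitespace using parameter expansion only.
    local result=$1
    result="${result#"${result%%[![:space:]]*}"}"
    result="${result%"${result##*[![:space:]]}"}"
    echo "$result"
}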
function print_with_spaces () {
echo "-$1-"
}
function test_trim() {
STR=" Hello World! "
print_with_spaces "$STR"
STR=$(echo "$STR" | sed 's,^ *,,; s, *$,,') # this line does the actual trimming
print_with_spaces "$STR"
exit 0
}
function do_print_type () {
echo " $1"
}
function print_type () {
    FILE=$1
    # Use the type passed as the second argument, trimmed of surrounding spaces.
    TYPE=$(echo "$2" | sed 's,^ *,,; s, *$,,')
    echo "$FILE type: -$TYPE-" 1>&2
    if [ "$TYPE" = "ASCII text" ]
    then
        do_print_type "ascii file"
    elif [ "$TYPE" = "UTF-8 Unicode (with BOM) text" ]
    then
        do_print_type "utf-8 with BOM"
    elif [ "$TYPE" = "UTF-8 Unicode text" ]
    then
        do_print_type "utf-8 without BOM"
    elif [ "$TYPE" = "ISO-8859 text" ]
    then
        do_print_type "GB2312"
    else
        do_print_type "========== unknown type: $TYPE"
    fi
}
# test_trim
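if [ $# -lt 1 ]
then
    echo "Usage: $0 File..."
    exit 1
fi
# Source encoding. Choosing it through a SourceEncoding argument is disabled
# for now, so every argument is treated as a file to convert.
Src="UTF-8"
# if [ $# -ge 2 ]
# then
# Src=$2
# fi
for File in "$@"
do
    echo "converting $File from $Src"
    # Replace the original only when uconv succeeded, so that a failed
    # conversion cannot destroy the file.
    uconv -f "$Src" -t UTF-8 --add-signature "$File" -o "$File.new" && mv "$File.new" "$File"
done
# The script ends here; the code after this exit is never reached.
exit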
if [ $# -eq 0 ]
then
echo usage $0 files ...
exit 1
fi
for file in "$@"
do
# echo "# Processing: $file" 1>&2
if [ ! -f "$file" ]
then
echo Not a file: "$file" 1>&2
exit 1
fi
TYPE=`get_type "$file"`
# echo "$file type: -$TYPE-" 1>&2
print_type "$file" "$TYPE"
if echo "$TYPE" | grep -q '(with BOM)'
then
:
# echo "# $file already has BOM, skipping." 1>&2
else
:
# echo 1>&2
# ( mv "${file}" "${file}"~ && uconv -f utf-8 -t utf-8 --add-signature < "${file}~" > "${file}" ) || ( echo Error processing "$file" 1>&2 ; exit 1)
fi
done
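The conversion above relies on uconv. For files that are already valid UTF-8, the signature can also be prepended with standard tools alone. The following sketch makes that assumption explicit: it does no encoding conversion at all, and `add_utf8_bom` is a hypothetical helper name. Like the script above, it writes to a `.new` file first.

```bash
#!/bin/bash
# add_utf8_bom is a hypothetical helper. It assumes the file already contains
# valid UTF-8 and only prepends the three signature bytes; it does not convert
# from any other encoding the way uconv does.
add_utf8_bom() {
    local file=$1 first3 tmp
    first3=$(head -c 3 -- "$file" | od -An -tx1 | tr -d ' \n')
    # Nothing to do when the signature is already there.
    [ "$first3" = "efbbbf" ] && return 0
    tmp="$file.new"
    # Write the BOM (octal escapes for EF BB BF) followed by the original
    # content, then replace the original only if that succeeded.
    if { printf '\357\273\277'; cat -- "$file"; } > "$tmp"
    then
        mv -- "$tmp" "$file"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

Calling it twice on the same file is harmless, because the second call finds the signature and returns without touching the file.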
This script, add-bom-for-files-in-folder.sh, combines the two independent scripts above behind a simpler interface.
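#!/bin/bash
if [ $# -lt 1 ]
then
    echo "Usage: $0 path-of-files-to-convert"
    exit 1
fi
# Read one path per line so that file names containing spaces survive, and skip blank lines.
find-file-with-encoding.sh -e=UTF-8Unicodetext -t="$1" | while IFS= read -r f
do
    [ -n "$f" ] && convert-to-utf8-with-signature.sh "$f"
done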
- Notes on syncing a Cocos2D project
After a new file is added to the project, it may also have to be added to the project file (the .vcxproj file under VS). If it is missing there, link-time errors like the following usually appear:
Error LNK1120 2 unresolved externals land E:\develop\proj\land\proj.win32\Release.win32\client.exe 1
Error LNK2001 unresolved external symbol "public: static void __cdecl DialogNewbieGuide::Dialog(class cocos2d::Node *)" (?Dialog@DialogNewbieGuide@@SAXPAVNode@cocos2d@@@Z) client E:\develop\proj\land\proj.win32\DialogClubMain.obj 1
In that case, find the file that contains these symbols (DialogNewbieGuide in this example) and add it to the project.
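One way to locate the file that defines such a symbol is to search the sources for the class name followed by `::`. This is only a sketch: `find_symbol_sources` is a hypothetical helper name, and the `Classes` directory in the example call is an assumption about the project layout.

```bash
#!/bin/bash
# find_symbol_sources is a hypothetical helper: it lists the .cpp files below
# the directory $2 that mention the class name $1 followed by "::".
find_symbol_sources() {
    grep -rl --include='*.cpp' -- "$1::" "$2"
}

# Example call; "Classes" is an assumed source directory name.
# find_symbol_sources DialogNewbieGuide Classes
```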
- Script files
When copying text from this page into files, use Unix line endings (LF) where possible. With Windows line endings (CRLF), running a script may fail with errors like these:
/mnt/d/portable/_bin/add-bom-for-files-in-folder.sh: line 2: $'\r': command not found
/mnt/d/portable/_bin/add-bom-for-files-in-folder.sh: line 10: syntax error: unexpected end of file
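If a script has already been saved with CRLF line endings, the carriage returns can be detected and removed afterwards. A minimal sketch follows; `has_crlf` and `strip_cr` are hypothetical helper names. Note that `strip_cr` removes every carriage return in the file, not only those at line ends.

```bash
#!/bin/bash
# has_crlf and strip_cr are hypothetical helpers, not part of the original scripts.

# Succeeds when the file contains at least one carriage return.
has_crlf() {
    grep -q $'\r' -- "$1"
}

# Removes all carriage returns. The cleaned text is written back into the
# original file (instead of renaming over it) so that its permissions,
# including the executable bit, stay as they were.
strip_cr() {
    tr -d '\r' < "$1" > "$1.lf" && cat -- "$1.lf" > "$1" && rm -f -- "$1.lf"
}
```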
Another error that has been reported:
./find-file-with-encoding.sh: line 84: cprogramtext,UTF-8Unicodetext: value too great for base (error token is "8Unicodetext")
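This is the message bash prints when an array subscript is evaluated as an arithmetic expression, which is what happens when EMap is not an associative array. One likely cause, assuming the script itself is unchanged, is a bash older than 4.0 (for example the /bin/bash shipped with macOS), where declare -A is not supported. The following sketch shows a version check that could be placed near the top of the script; `supports_assoc_arrays` is a hypothetical helper name.

```bash
#!/bin/bash
# supports_assoc_arrays is a hypothetical helper. It takes a bash major
# version number and succeeds when that version has "declare -A" (bash 4.0+).
supports_assoc_arrays() {
    [ "$1" -ge 4 ]
}

if ! supports_assoc_arrays "${BASH_VERSINFO[0]}"
then
    echo "bash ${BASH_VERSION} is too old: associative arrays need bash 4.0 or newer" >&2
    # A real script would stop here, for example with "exit 1".
fi
```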