first I create a file with unicode characters. my terminal is set to utf8, so the file is stored as utf8:
$ echo "日本" > utf8-test.txt
$ cat utf8-test.txt
日本
$ file utf8-test.txt
utf8-test.txt: UTF-8 Unicode text
then I convert it to utf32:
$ iconv -f utf8 -t utf32 utf8-test.txt -o utf32-test.txt
$ cat utf32-test.txt
���e,g
$ iconv -f utf32 utf32-test.txt -t utf8
日本
$ xxd utf32-test.txt
00000000: fffe 0000 e565 0000 2c67 0000 0a00 0000 .....e..,g......
$ ls -al utf8-test.txt utf32-test.txt
-rw-rw-r-- 1 user user 16 Apr 19 10:25 utf32-test.txt
-rw-rw-r-- 1 user user 7 Apr 19 10:25 utf8-test.txt
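To read the xxd dump above: the file is UTF-32 little-endian with a leading byte order mark, so every code point takes four bytes, least significant byte first. That accounts for the 16 bytes (BOM plus three code points, including the newline). A minimal sketch for decoding such a dump by hand, assuming little-endian input; `utf32le_codepoints` is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical helper: print each 4-byte little-endian unit read from stdin as a code point.
utf32le_codepoints() {
  od -An -v -tx1 |        # hex bytes only, no offsets, no line collapsing
    tr -s ' ' '\n' |      # one byte per line
    awk 'NF { b[n++ % 4] = $1
              if (n % 4 == 0) printf "U+%s%s%s%s\n", b[3], b[2], b[1], b[0] }'
}

# The 16 bytes from the xxd dump of utf32-test.txt, written as octal escapes.
printf '\377\376\000\000\345e\000\000,g\000\000\n\000\000\000' | utf32le_codepoints
```

This prints U+0000feff (the BOM), U+000065e5 (日), U+0000672c (本) and U+0000000a (the newline).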
then I add it to git and push it to github
$ git add utf8-test.txt utf32-test.txt
$ git commit -m "test"
$ git push
when I fetch the raw content, the files behave as expected
$ curl -s https://raw.githubusercontent.com/tripodsan/hlxtest/master/.github/utf8-test.txt
日本
$ curl -s --output - https://raw.githubusercontent.com/tripodsan/hlxtest/master/.github/utf32-test.txt
���e,g
$ curl -s --output - https://raw.githubusercontent.com/tripodsan/hlxtest/master/.github/utf32-test.txt | iconv -f utf32 -t utf8
日本
but when I fetch the files via the API, the utf32 one comes back corrupted:
$ curl -s https://api.github.com/repos/tripodsan/hlxtest/contents/.github/utf8-test.txt | jq -r .content
5pel5pysCg==
$ curl -s https://api.github.com/repos/tripodsan/hlxtest/contents/.github/utf8-test.txt | jq -r .content | openssl base64 -d
日本
$ curl -s https://api.github.com/repos/tripodsan/hlxtest/contents/.github/utf32-test.txt | jq -r .content | openssl base64 -d
���e,g
$ curl -s https://api.github.com/repos/tripodsan/hlxtest/contents/.github/utf32-test.txt | jq -r .content | openssl base64 -d | iconv -f utf32 -t utf8
iconv: illegal input sequence at position 0
$ curl -s https://api.github.com/repos/tripodsan/hlxtest/contents/.github/utf32-test.txt | jq -r .content | openssl base64 -d | xxd
00000000: efbf bdef bfbd 0000 efbf bd65 0000 2c67 ...........e..,g
00000010: 0000 0a00 0000 ......
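The sequence `ef bf bd` in this dump is the UTF-8 encoding of U+FFFD, the Unicode replacement character. The 22 bytes are exactly what you would get if the original 16 bytes were decoded as UTF-8 and each byte that is not valid there (`ff`, `fe`, `e5`) were replaced by U+FFFD. That is an inference from the output, not confirmed API behavior. A rough sketch for checking a decoded payload for this kind of damage; `count_fffd` is a hypothetical helper name:

```shell
# Hypothetical helper: count byte-aligned occurrences of EF BF BD
# (U+FFFD encoded as UTF-8) in the data read from stdin.
count_fffd() {
  od -An -v -tx1 |        # hex bytes only, no offsets, no line collapsing
    tr '\n' ' ' |         # join the od lines
    tr -s ' ' |           # single spaces between bytes
    grep -o 'ef bf bd' |  # one output line per match
    wc -l |
    tr -d ' '
}

# The 22 bytes returned by the API for utf32-test.txt, as octal escapes.
printf '\357\277\275\357\277\275\000\000\357\277\275e\000\000,g\000\000\n\000\000\000' | count_fffd

# The intact utf8-test.txt content for comparison.
printf '\346\227\245\346\234\254\n' | count_fffd
```

The first call prints 3 and the second prints 0. In the pipeline above, `count_fffd` could take the place of `xxd` after `openssl base64 -d`. A file that legitimately contains U+FFFD would also be counted, so a nonzero result is only a hint.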