Last active
December 27, 2015 01:29
Does uniq treat circled numbers as identical?
$ cat n.txt
①
②
☀
$ uniq n.txt
①
$ sort n.txt
①
②
☀
$ tac n.txt | sort
☀
②
①
$
$ cat n.txt
①
②
$ LANG=ja_JP.utf8 uniq -c n.txt
2 ①
$ LANG=C uniq -c n.txt
1 ①
1 ②
$ uniq --version
uniq (GNU coreutils) 8.20
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Richard M. Stallman and David MacKenzie.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 13.04
Release: 13.04
Codename: raring
$
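The two uniq -c runs above differ only in the locale, so one possible workaround is to force byte-wise comparison. The following is a minimal sketch, not a tested fix for every setup: it uses LC_ALL=C rather than LANG=C, because LC_ALL takes precedence over both LANG and LC_COLLATE. The file name n_demo.txt is made up for this example.

```shell
# Recreate the two-line input (hypothetical file name).
# The octal escapes are the UTF-8 bytes of U+2460 and U+2461.
printf '\342\221\240\n\342\221\241\n' > n_demo.txt

# LC_ALL overrides LANG and LC_COLLATE, so uniq compares raw bytes here
# and should report the two lines separately.
LC_ALL=C uniq -c n_demo.txt

# Dump the bytes to confirm that the two lines really are different.
od -An -tx1 n_demo.txt
```

Setting the variable on the command line only affects that one invocation, so the rest of the session keeps its Japanese locale.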
After fetching the source with apt-get source locales and looking at locales/ja_JP, it appears that any code point not listed between LC_COLLATE and END LC_COLLATE is treated as identical to the others.
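One rough way to check this observation is to extract the LC_COLLATE section of a locale source file and search it for a code point symbol. The sketch below assumes the file names code points as symbols such as <U2460>. The helper in_collate and the file sample_locale are hypothetical, and the file contents are made up so the example runs without the real source.

```shell
# Tiny stand-in for a locale source file (contents are made up for illustration).
cat > sample_locale <<'EOF'
LC_CTYPE
<U2460>
END LC_CTYPE
LC_COLLATE
<U3042>
END LC_COLLATE
EOF

# in_collate FILE SYMBOL
# Succeeds if SYMBOL occurs between the LC_COLLATE and END LC_COLLATE lines of FILE.
in_collate() {
    sed -n '/^LC_COLLATE/,/^END LC_COLLATE/p' "$1" | grep -qF "$2"
}

for sym in '<U3042>' '<U2460>'; do
    if in_collate sample_locale "$sym"; then
        echo "$sym: listed in LC_COLLATE"
    else
        echo "$sym: not listed in LC_COLLATE"
    fi
done
```

Against the real source, the call would look like in_collate locales/ja_JP '<U2460>'. A miss is only a hint, since a real locale file may cover a code point through a range or a copy directive instead of naming it literally.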