Skip to content

Instantly share code, notes, and snippets.

@wen-long
Last active August 23, 2024 18:14
Show Gist options
  • Save wen-long/856eeab4a589c8b19283ab81e7ba48b9 to your computer and use it in GitHub Desktop.
Save wen-long/856eeab4a589c8b19283ab81e7ba48b9 to your computer and use it in GitHub Desktop.
learn hash
hash 重要吗?
Rust杂谈 HashMap性能比不上Java? https://www.bilibili.com/video/BV1RTiceCEr6/
https://github.com/rurban/smhasher
可以作为索引, 如果有专门讲 hash 的书没有提到 smhasher, 要么过时要么不专业
务必 clone 代码自己在自己的机器上测试并得出结论, 不要过于关心其他人的结果
smhasher 的 Summary 竟然将 rapidhash 放在首位, 这与我的实际测试不符
吹牛逼和傲慢在 hash 算法这个领域也有出现
在我的机器上, xxh3 性能是 rapidhash 的两倍. 运行在单核虚拟机, 宿主机宣称为 AMD Ryzen 7950x
homebrew 原提供的 xxhash -Os 参数不能提供最佳性能,已经修复为 -O3
详见 https://github.com/Homebrew/homebrew-core/pull/181691
```
$ xxhsum -b
xxhsum 0.8.1 by Yann Collet
compiled as 64-bit x86_64 autoVec little endian with GCC 11.2.0
Sample of 100 KB...
1#XXH32 : 102400 -> 99518 it/s ( 9718.6 MB/s)
3#XXH64 : 102400 -> 143061 it/s (13970.8 MB/s)
5#XXH3_64b : 102400 -> 610795 it/s (59647.9 MB/s)
11#XXH128 : 102400 -> 604307 it/s (59014.3 MB/s)
processor : 0
vendor_id : AuthenticAMD
cpu family : 25
model : 1
model name : AMD EPYC-Milan Processor
stepping : 1
microcode : 0x1000065
cpu MHz : 4499.996
cache size : 512 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr wbnoinvd arat avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm arch_capabilities
bugs : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass srso
bogomips : 8999.99
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
--- Testing xxh3 "xxHash v3, 64-bit" GOOD
[[[ Speed Tests ]]]
Bulk speed test - 262144-byte keys
Alignment 7 - 14.657 bytes/cycle - 41934.22 MiB/sec @ 3 ghz
Alignment 6 - 14.724 bytes/cycle - 42124.75 MiB/sec @ 3 ghz
Alignment 5 - 14.683 bytes/cycle - 42008.81 MiB/sec @ 3 ghz
Alignment 4 - 14.846 bytes/cycle - 42475.55 MiB/sec @ 3 ghz
Alignment 3 - 14.635 bytes/cycle - 41871.10 MiB/sec @ 3 ghz
Alignment 2 - 14.678 bytes/cycle - 41993.26 MiB/sec @ 3 ghz
Alignment 1 - 14.573 bytes/cycle - 41692.77 MiB/sec @ 3 ghz
Alignment 0 - 17.671 bytes/cycle - 50557.43 MiB/sec @ 3 ghz
Average - 15.058 bytes/cycle - 43082.23 MiB/sec @ 3 ghz
--- Testing rapidhash "rapidhash v1" GOOD
[[[ Speed Tests ]]]
Bulk speed test - 262144-byte keys
Alignment 7 - 8.065 bytes/cycle - 23075.42 MiB/sec @ 3 ghz
Alignment 6 - 8.061 bytes/cycle - 23062.59 MiB/sec @ 3 ghz
Alignment 5 - 8.066 bytes/cycle - 23076.17 MiB/sec @ 3 ghz
Alignment 4 - 8.107 bytes/cycle - 23194.39 MiB/sec @ 3 ghz
Alignment 3 - 8.069 bytes/cycle - 23084.34 MiB/sec @ 3 ghz
Alignment 2 - 8.072 bytes/cycle - 23094.01 MiB/sec @ 3 ghz
Alignment 1 - 8.070 bytes/cycle - 23089.31 MiB/sec @ 3 ghz
Alignment 0 - 8.101 bytes/cycle - 23177.19 MiB/sec @ 3 ghz
Average - 8.076 bytes/cycle - 23106.68 MiB/sec @ 3 ghz
```
另一台 Ryzen5 PRO 4650G
```
./SMHasher --test=Speed xxh3
--- Testing xxh3 "xxHash v3, 64-bit" GOOD
[[[ Speed Tests ]]]
Bulk speed test - 262144-byte keys
Alignment 7 - 13.483 bytes/cycle - 38574.27 MiB/sec @ 3 ghz
Alignment 6 - 13.476 bytes/cycle - 38556.37 MiB/sec @ 3 ghz
Alignment 5 - 13.406 bytes/cycle - 38354.21 MiB/sec @ 3 ghz
Alignment 4 - 13.398 bytes/cycle - 38331.72 MiB/sec @ 3 ghz
Alignment 3 - 13.452 bytes/cycle - 38486.57 MiB/sec @ 3 ghz
Alignment 2 - 13.452 bytes/cycle - 38487.14 MiB/sec @ 3 ghz
Alignment 1 - 13.427 bytes/cycle - 38414.08 MiB/sec @ 3 ghz
Alignment 0 - 16.680 bytes/cycle - 47721.07 MiB/sec @ 3 ghz
Average - 13.847 bytes/cycle - 39615.68 MiB/sec @ 3 ghz
./SMHasher --test=Speed rapidhash
--- Testing rapidhash "rapidhash v1" GOOD
[[[ Speed Tests ]]]
Bulk speed test - 262144-byte keys
Alignment 7 - 7.747 bytes/cycle - 22165.33 MiB/sec @ 3 ghz
Alignment 6 - 7.742 bytes/cycle - 22150.55 MiB/sec @ 3 ghz
Alignment 5 - 7.753 bytes/cycle - 22182.48 MiB/sec @ 3 ghz
Alignment 4 - 7.767 bytes/cycle - 22222.63 MiB/sec @ 3 ghz
Alignment 3 - 7.816 bytes/cycle - 22360.94 MiB/sec @ 3 ghz
Alignment 2 - 7.808 bytes/cycle - 22338.03 MiB/sec @ 3 ghz
Alignment 1 - 7.805 bytes/cycle - 22329.86 MiB/sec @ 3 ghz
Alignment 0 - 7.819 bytes/cycle - 22371.62 MiB/sec @ 3 ghz
Average - 7.782 bytes/cycle - 22265.18 MiB/sec @ 3 ghz
```
学术用语 pseudorandom function (PRF)
XXH3 诞生帖
https://fastcompression.blogspot.com/2019/03/presenting-xxh3.html
https://news.ycombinator.com/item?id=19402602
阅读比较新的 hash 算法作者本人发布的帖子
被认可和广泛采纳的 hash 可能宣传该 hash 如何 beat 了其他 hash
指令集
对于大/小输入的区别
hash 的吞吐和延迟
追求“多快好省”的 hash 算法, 因为存在又慢又差的 hash 算法
SipHash 诞生记, 从 Hash DoS 攻击讲起, 当时太多的软件基础不能抵抗该攻击
[29C3: Hash-flooding DoS reloaded: attacks and defenses (EN)](https://www.youtube.com/watch?v=Vdrab3sB7MU)
SipHash 早期文档 https://web.archive.org/web/20190207205346/https://www.131002.net/siphash/
SipHash 没有被 go 代码维护者足够 like, golang 标准库不会提供 SipHash
https://github.com/golang/go/issues/19659
有人提议 golang map 使用 SipHash 以加强安全, 因维护者观点不同被拒绝
> 1. golang map 的用户(开发者)是安全的第一责任人, key 如果使用用户输入是 map 用户的责任
> 2. 当前没有对 golang map Hash-flooding 的 POC
https://github.com/golang/go/issues/9365
尽管上面两条情况没有发生变化, go 还是修改了 aeshash 实现来避免攻击, 是因为心虚吗? 既然要修改 aeshash, 为何不使用 SipHash?
https://github.com/golang/go/commit/91059de095703ebc4ce6b8bad7a0a40dedeef7dc
hash 实现可以充分使用高级指令集, golang map 内部 hash 算法会使用 aes 指令(不是用于加密)
https://github.com/tildeleb/aeshash
内部 hash 可以实现的很混乱, 毕竟 key 无需暴露和传递
怎么就算 cryptographic hash function?
linux 内核文档 https://docs.kernel.org/security/siphash.html 提到 SipHash is a cryptographically secure PRF
不同于 golang 对 SipHash 无端的抵触, linux 内核决定用 SipHash 替换 "all of these hashing functions"
https://lwn.net/Articles/711167/
https://github.com/BLAKE3-team/BLAKE3
hash 的演进方向
1. beat 竞争对手, 加入 smhasher
2. 发帖演讲宣传
3. 公布正式文档
4. 提供便于使用的二进制, 固定实现, fingerprint functions
5. 采用高级指令集
6. 并行化
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment