@rob-p
Created November 10, 2025 17:33
2bit rank perf
```
n = 100000
Ranker                                               bits |                   1t |                   6t |
                                                          | latncy   loop stream | latncy   loop stream |
Ranker<Plain128, TrivialSB, WideSimdCount2, false>  4.00b |   13.2    3.4    3.6 |    2.3    0.9    0.7 |
Ranker<Plain256, TrivialSB, SimdCountSlice, false>  3.01b |   18.6    8.8    9.6 |    3.8    1.6    1.7 |
Ranker<Plain512, TrivialSB, SimdCountSlice, false>  2.51b |   20.9   13.6   14.4 |    3.5    2.2    2.3 |
Ranker<Plain512, SB8, U128Popcnt3, true>            2.26b |   24.3   21.5   21.4 |    4.2    3.4    3.5 |
Ranker<Plain512, SB8, SimdCountSlice, false>        2.26b |   23.4   18.4   19.0 |    4.1    3.1    3.2 |
Ranker<FullBlock, NoSB, U64PopcntSlice, false>      4.01b |   21.0   12.7   12.9 |    3.5    2.1    2.2 |
Ranker<QuartBlock, NoSB, SimdCount7, false>         4.01b |   11.1    2.8    2.9 |    1.9    0.5    0.5 |
Ranker<PentaBlock, TrivialSB, SimdCount7, false>    3.21b |   14.0    4.9    5.6 |    2.3    0.8    1.0 |
Ranker<HexaBlock, TrivialSB, WideSimdCount2, false> 2.68b |   15.3    6.9    7.3 |    2.7    1.1    1.2 |
Rank9                                               2.51b |    4.0    1.7    1.7 |    0.7    0.3    0.6 |
RSQVector<RSSupportPlain>                           2.27b |    6.9    5.2    5.7 |    1.2    0.9    1.0 |

n = 1000000000
Ranker                                               bits |                   1t |                   6t |
                                                          | latncy   loop stream | latncy   loop stream |
Ranker<Plain128, TrivialSB, WideSimdCount2, false>  4.00b |  109.7   27.0   19.7 |   44.2   11.3   14.2 |
Ranker<Plain256, TrivialSB, SimdCountSlice, false>  3.00b |  118.7   50.2   14.8 |   45.0   21.9   11.3 |
Ranker<Plain512, TrivialSB, SimdCountSlice, false>  2.50b |  124.8   71.4   15.8 |   63.9   21.9   11.3 |
Ranker<Plain512, SB8, U128Popcnt3, true>            2.25b |  119.9  102.0   21.9 |   63.5   45.2    6.6 |
Ranker<Plain512, SB8, SimdCountSlice, false>        2.25b |  119.6   84.9   19.6 |   47.1   29.1    7.4 |
Ranker<FullBlock, NoSB, U64PopcntSlice, false>      4.00b |  113.7   57.8   13.4 |   54.3   19.6    7.5 |
Ranker<QuartBlock, NoSB, SimdCount7, false>         4.00b |   98.9   20.7    8.9 |   47.4    8.2    8.0 |
Ranker<PentaBlock, TrivialSB, SimdCount7, false>    3.20b |  107.0   28.5    7.8 |   53.8    9.9    7.7 |
Ranker<HexaBlock, TrivialSB, WideSimdCount2, false> 2.67b |  105.6   35.5    9.0 |   43.8   11.2    5.2 |
Rank9                                               2.50b |   92.5   17.6   17.5 |   40.3    8.9    8.6 |
RSQVector<RSSupportPlain>                           2.25b |  106.3   44.7   11.1 |   50.8   15.2    8.4 |
```
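The table does not spell out the query being timed. As a reading aid, here is a minimal sketch that assumes "2bit rank" means a rank query over a sequence of 2-bit symbols: given a position, count how many times each of the four symbols occurs before it. The function names `pack_2bit` and `rank_2bit` and the four-symbols-per-byte, low-bits-first layout are hypothetical choices for illustration. This is a naive linear scan, not the code that was benchmarked.

```python
# Hypothetical reference sketch; NOT the benchmarked implementation.
# Assumption: symbols are 2-bit values (0..3), packed four per byte,
# with the first symbol in the lowest two bits of each byte.

def pack_2bit(symbols):
    """Pack a list of symbols in {0, 1, 2, 3} into bytes, four per byte."""
    out = bytearray((len(symbols) + 3) // 4)
    for i, s in enumerate(symbols):
        out[i // 4] |= (s & 3) << (2 * (i % 4))
    return bytes(out)


def rank_2bit(packed, pos):
    """Return [c0, c1, c2, c3]: occurrences of each symbol in positions [0, pos)."""
    counts = [0, 0, 0, 0]
    for i in range(pos):
        # Extract the i-th 2-bit symbol from its byte.
        sym = (packed[i // 4] >> (2 * (i % 4))) & 3
        counts[sym] += 1
    return counts
```

For example, `rank_2bit(pack_2bit([0, 1, 2, 3, 3]), 5)` returns `[1, 1, 1, 2]`. The scan above costs time linear in `pos`. Names such as `U64PopcntSlice` and `SimdCount7` suggest that the structures in the table instead combine stored counts with popcount or SIMD counting inside a block, but that is an inference from the names, not something the table states.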
processor : 0
vendor_id : AuthenticAMD
cpu family : 25
model : 1
model name : AMD EPYC 7313 16-Core Processor
stepping : 1
microcode : 0xa0011de
cpu MHz : 2994.336
cache size : 512 KB
physical id : 0
siblings : 16
core id : 0
cpu cores : 16
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 invpcid_single hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd amd_ppin brs arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca
bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips : 5988.67
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]