A GNU C / Asm implementation is here: https://gist.github.com/vsivsi/8511aca1bac528f49fbb45a636afa4b5
NOTE! This must be run on an Intel processor that supports both AVX512F and AVX512DQ.
To test: go test -count 1 -timeout 15m -run '^TestMask$' gist.github.com/vsivsi/fff8618ace4b02eb410dd8792779bf32
This should fail with something like:
% go test -count 1 -timeout 15m -run '^TestMask$' gist.github.com/vsivsi/fff8618ace4b02eb410dd8792779bf32
--- FAIL: TestMask (0.44s)
maskcheck_test.go:16: Failed for iteration: 0 with return value: 69247732
maskcheck_test.go:16: Failed for iteration: 1 with return value: 42154190
maskcheck_test.go:16: Failed for iteration: 2 with return value: 43317668
maskcheck_test.go:16: Failed for iteration: 3 with return value: 41575878
maskcheck_test.go:16: Failed for iteration: 4 with return value: 38750998
maskcheck_test.go:16: Failed for iteration: 5 with return value: 42730788
maskcheck_test.go:16: Failed for iteration: 6 with return value: 41721686
maskcheck_test.go:16: Failed for iteration: 7 with return value: 40051735
maskcheck_test.go:16: Failed for iteration: 8 with return value: 40951907
maskcheck_test.go:16: Failed for iteration: 9 with return value: 41829708
maskcheck_test.go:16: Failed for iteration: 10 with return value: 33994560
maskcheck_test.go:16: Failed for iteration: 11 with return value: 31676633
maskcheck_test.go:16: Failed for iteration: 12 with return value: 32128705
maskcheck_test.go:16: Failed for iteration: 13 with return value: 41683214
maskcheck_test.go:16: Failed for iteration: 14 with return value: 38697082
maskcheck_test.go:16: Failed for iteration: 15 with return value: 35869348
maskcheck_test.go:16: Failed for iteration: 16 with return value: 37629503
maskcheck_test.go:16: Failed for iteration: 17 with return value: 32931377
maskcheck_test.go:16: Failed for iteration: 18 with return value: 44475243
maskcheck_test.go:16: Failed for iteration: 19 with return value: 42445948
FAIL
FAIL gist.github.com/vsivsi/fff8618ace4b02eb410dd8792779bf32 0.744s
FAIL
Disabling async preemption makes the test pass: GODEBUG=asyncpreemptoff=1 go test -count 1 -timeout 15m -run '^TestMask$' gist.github.com/vsivsi/fff8618ace4b02eb410dd8792779bf32
% GODEBUG=asyncpreemptoff=1 go test -count 1 -timeout 15m -run '^TestMask$' gist.github.com/vsivsi/fff8618ace4b02eb410dd8792779bf32
ok gist.github.com/vsivsi/fff8618ace4b02eb410dd8792779bf32 0.858s
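Since the pass/fail outcome above depends on whether GODEBUG=asyncpreemptoff=1 is set, it can help to log that setting alongside the test result. The following is only a minimal sketch and is not part of the linked gist: the helper name asyncPreemptOff is hypothetical. It looks only for the literal value 1 used in the command above, and if the key appears more than once it lets the last occurrence win.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// asyncPreemptOff is a hypothetical helper (not from the gist). It reports
// whether a GODEBUG-style string of comma-separated key=value pairs contains
// asyncpreemptoff=1. If the key appears more than once, the last one wins.
func asyncPreemptOff(godebug string) bool {
	off := false
	for _, kv := range strings.Split(godebug, ",") {
		key, value, ok := strings.Cut(strings.TrimSpace(kv), "=")
		if ok && key == "asyncpreemptoff" {
			off = value == "1"
		}
	}
	return off
}

func main() {
	// Print the setting the current process was started with.
	fmt.Println("async preemption disabled:", asyncPreemptOff(os.Getenv("GODEBUG")))
}
```

Calling the helper from the test and logging its result would make it clear in the output which of the two runs above was being reproduced.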