Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save wolfram77/e87a1f0e8bec0cce084241f884d760c3 to your computer and use it in GitHub Desktop.
Save wolfram77/e87a1f0e8bec0cce084241f884d760c3 to your computer and use it in GitHub Desktop.
Misra-Gries algorithm for finding heavy hitters in a list of numbers, using vector instructions : CODE

I was trying out the Misra-Gries algorithm for finding heavy hitters in a list of numbers, using vector instructions. But let me fill in some context first. I am trying to minimize the memory needed by Louvain/Leiden algorithms for community detection, and hopefully a bit of performance too. Currently the algorithms use a full-size per-thread hashtables, with each thread using |V| space for storing the associated weights for the hashtable. However, we could instead store a small hashtable, using the Misra-Gries algorithm, in the cache - obviously this might affect the performance (in terms of quality of the returned communities).

It was cool to go through step-by-step and be able to minimize the number of lines of generated machine code (with minimal conditional jumps). When you understand the vector instructions, you can write them in a higher-level logic without resorting to writing the instructions yourself, as these can be quite complicated. We let the compiler do its thing, but guide it with a short for-loops so it can actually do it. Note, setting flags with a -1 is more efficient due to how vector instructions work.

The code, with the generated instructions on x86 architecture is attached below. Compiler flags as -O3 and -mavx. May the vectors be with you!

Date: 27th October 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment