This short post explains how to write a high-performance matrix multiplication program on modern processors. In this tutorial I use a single core of a Skylake-client CPU with AVX2, but the principles also apply to other processors with different instruction sets (such as AVX-512).
Matrix multiplication is a mathematical operation that defines the product of two matrices.
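
To make the operation concrete: for an m×k matrix A and a k×n matrix B, entry C[i][j] of the product is the sum over p of A[i][p]·B[p][j]. The following is a minimal reference sketch in plain Python, assuming matrices stored as row-major lists of lists. The function name `matmul_naive` is hypothetical and does not come from this post. It is meant as a correctness baseline to check an optimized kernel against, not as a fast implementation.

```python
def matmul_naive(a, b):
    """Return the product of a (m x k) and b (k x n) as a list of lists.

    Hypothetical helper for illustration only; assumes non-empty,
    rectangular, row-major inputs.
    """
    m, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("inner dimensions must match")
    n = len(b[0])

    # Start from an all-zero m x n result.
    c = [[0.0] * n for _ in range(m)]

    # i-p-j loop order: the innermost loop walks a row of b and a row of c,
    # so both are read contiguously in a row-major layout.
    for i in range(m):
        for p in range(k):
            a_ip = a[i][p]
            for j in range(n):
                c[i][j] += a_ip * b[p][j]
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    print(matmul_naive(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```

The triple loop performs m·n·k multiply-add operations, which is the work an optimized kernel has to spread across vector registers and caches.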
This script is modeled after tee (see [man tee][2]) and works on Linux, macOS, Cygwin, and WSL/WSL2.
It's like your normal copy and paste commands, but unified and able to sense when you want it to be chainable.
This project started as an answer to the Stack Overflow question: [How can I copy the output of a command directly into my clipboard?][3]