AMX is a SIMD extension for x86-64 that provides hardware-accelerated matrix operations using tile registers. It's designed for high-performance matrix multiplication in AI/ML workloads.
Key Features:
- 8 tile registers (TMM0-TMM7)
- Each tile: up to 16 rows × 64 bytes
- Native support for INT8, BF16, and FP16 operations