Experimental investigation of whether trained neural networks implicitly learn weight matrices that decompose into hypercomplex-like algebraic structures (quaternion, Clifford algebra, Kronecker-factored).
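One concrete probe for the Kronecker-factored case is the Van Loan–Pitsianis nearest-Kronecker-product decomposition: rearrange the weight matrix into blocks so that an exact Kronecker product becomes a rank-1 matrix, then take a rank-1 SVD. A minimal numpy sketch; the function name and the singular-value-ratio score are illustrative choices, not a method fixed by these notes:

```python
import numpy as np

def nearest_kronecker(W, shape_a, shape_b):
    """Van Loan-Pitsianis: best Frobenius-norm approximation W ~ A (x) B.

    Rearranges W so each row is the vectorization of one (m2 x n2)
    block; an exact Kronecker product then rearranges to a rank-1
    matrix, so the leading SVD component gives the optimal factors.
    """
    (m1, n1), (m2, n2) = shape_a, shape_b
    assert W.shape == (m1 * m2, n1 * n2)
    # Block rearrangement: row (i*n1 + j) holds vec of block (i, j).
    R = W.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    A = np.sqrt(s[0]) * U[:, 0].reshape(m1, n1)
    B = np.sqrt(s[0]) * Vt[0].reshape(m2, n2)
    # Fraction of Frobenius norm captured by the rank-1 term:
    # close to 1.0 means W is close to a single Kronecker product.
    score = s[0] / np.linalg.norm(s)
    return A, B, score
```

Applied to each trained weight matrix (e.g. GPT-2 attention or MLP projections), a score near 1.0 would indicate near-Kronecker structure; random dense matrices score well below that.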
Baseline GPT-2 small perplexity: 29.88 (WikiText-2 test, standard strided evaluation)
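For reproducibility, strided evaluation is conventionally a sliding window in which each window scores only its non-overlapping tail tokens, so every scored token gets up to max_len - 1 tokens of context. A sketch of that aggregation, with `nll_fn` a hypothetical stand-in for a model call returning the summed negative log-likelihood of the last `n_target` tokens of a window (the 1024/512 defaults follow common GPT-2-small practice; the exact stride behind the 29.88 figure is not stated in these notes):

```python
import numpy as np

def strided_perplexity(nll_fn, tokens, max_len=1024, stride=512):
    """Sliding-window perplexity over one long token sequence.

    Each window covers max_len tokens, but only the tokens not yet
    scored by a previous window contribute to the loss, avoiding the
    double counting a naive overlapping evaluation would introduce.
    """
    total_nll, total_tokens = 0.0, 0
    prev_end = 0
    for begin in range(0, len(tokens), stride):
        end = min(begin + max_len, len(tokens))
        n_target = end - prev_end  # tokens not scored in earlier windows
        total_nll += nll_fn(tokens[begin:end], n_target)
        total_tokens += n_target
        prev_end = end
        if end == len(tokens):
            break
    return np.exp(total_nll / total_tokens)
```

A model that assigns uniform probability over a vocabulary of size V yields perplexity exactly V under this scheme, which makes a convenient sanity check before plugging in real model losses.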