from typing import List
from torch import Tensor  # Tensor is assumed to come from PyTorch here

# This code doesn't run as-is, and isn't intended to.
# The goal of this code is to explain how attention mechanisms work, in code.
# It is deliberately not vectorized to make it clearer.
def attention(self, X_in: List[Tensor]):
    query = [None] * self.sequence_length
    key = [None] * self.sequence_length
    value = [None] * self.sequence_length
    # For every token, project the previous layer's output into
    # query, key, and value vectors using the learned weight matrices.
    for i in range(self.sequence_length):
        query[i] = self.Q @ X_in[i]  # @ is matrix-vector multiplication
        key[i] = self.K @ X_in[i]
        value[i] = self.V @ X_in[i]
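
# For readers who want a version that actually runs, here is a minimal
# self-contained sketch in plain NumPy. It keeps the same per-token loop as
# above and fills in the remaining steps of scaled dot-product attention:
# scoring each query against every key, softmaxing the scores, and taking
# the weighted sum of the values. The function name, weight shapes, and
# random inputs below are stand-ins chosen for illustration, not part of
# the original snippet.
import numpy as np

def attention_sketch(X_in, Q, K, V):
    """Single-head attention over a list of token vectors, one token at a time."""
    n = len(X_in)
    d_k = K.shape[0]  # dimension of the key vectors, used to scale the scores

    # Project every token into its query, key, and value vectors.
    query = [Q @ x for x in X_in]
    key = [K @ x for x in X_in]
    value = [V @ x for x in X_in]

    out = []
    for i in range(n):
        # Score token i's query against every key (scaled dot product).
        scores = np.array([query[i] @ key[j] for j in range(n)]) / np.sqrt(d_k)
        # Softmax turns the scores into weights that are positive and sum to 1.
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        # Token i's output is the attention-weighted sum of all value vectors.
        out.append(sum(w * v for w, v in zip(weights, value)))
    return out

# Hypothetical shapes, chosen purely for the example.
rng = np.random.default_rng(0)
d_model, d_k, seq_len = 8, 4, 5
Q = rng.normal(size=(d_k, d_model))
K = rng.normal(size=(d_k, d_model))
V = rng.normal(size=(d_k, d_model))
X_in = [rng.normal(size=d_model) for _ in range(seq_len)]
print(attention_sketch(X_in, Q, K, V)[0])  # output vector for the first token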