Single-head attention
import numpy as np
from scipy.special import softmax


def single_head_attention(X, beta_q, beta_k, beta_v, omega_q, omega_k, omega_v):
    # Project the input columns into queries, keys, and values
    query = beta_q + omega_q @ X
    key = beta_k + omega_k @ X
    value = beta_v + omega_v @ X
    # Dot-product similarity between every key and every query
    dp = key.T @ query
    # Scale by the square root of the query dimension to keep the softmax well-behaved
    scaled_dp = dp / np.sqrt(query.shape[0])
    # Normalize each column of scores into attention weights
    attention_weights = softmax(scaled_dp, axis=0)
    # Each output column is a weighted sum of the value vectors
    attention_output = value @ attention_weights
    return attention_output, attention_weights
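
Below is a minimal usage sketch, assuming the columns of X are the tokens and that the query, key, and value dimensions all equal the input dimension D; the shapes and variable names here are illustrative and not part of the original gist.

# Usage sketch (assumed shapes; D-dimensional tokens stored as columns of X)
np.random.seed(0)
D, N = 4, 3
X = np.random.randn(D, N)

# Biases are column vectors; weight matrices map D -> D (assumed, for illustration)
beta_q, beta_k, beta_v = (np.random.randn(D, 1) for _ in range(3))
omega_q, omega_k, omega_v = (np.random.randn(D, D) for _ in range(3))

output, weights = single_head_attention(X, beta_q, beta_k, beta_v, omega_q, omega_k, omega_v)
print(output.shape)          # (4, 3): one output column per input token
print(weights.sum(axis=0))   # each column of attention weights sums to 1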