Overview:
- Problem: Large models are expensive to retrain, and it's hard to correct individual facts they have learned.
- In a nutshell: MEND generates a weight delta that is decomposed into low-rank factors, much like a LoRA update (see the sketch after this list).
- "local, reliable, and general"
- "Local" means unrelated output is not changed. "Reliable" means the model takes the desired corrections. "General" meaning variations on similar questions which would need correction also are corrected.
- Works even on very large models.
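
To make "low-rank weight delta" concrete, here is a minimal sketch of my own (not the paper's code) of a rank-1 edit applied to one linear layer; the dimensions and step size are made-up placeholders.

```python
# A full weight delta for a d_out x d_in layer has d_out * d_in entries;
# a rank-1 delta is just an outer product of two vectors, i.e. a LoRA-style
# update with rank r = 1.
import torch

d_out, d_in = 1024, 1024
W = torch.randn(d_out, d_in)        # pretrained layer weight (placeholder)

u = torch.randn(d_in)               # "input-side" vector
delta = torch.randn(d_out)          # "output-side" vector

W_edit = torch.outer(delta, u)      # rank-1 delta, shape d_out x d_in
W_new = W - 1e-3 * W_edit           # apply with a small step size (assumed)

assert torch.linalg.matrix_rank(W_edit) == 1
```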
Differences from Prior Art:
- ENN encodes editability into the parameters of the model itself; MEND provides editability through a separate, independent editor model. (ENN is closer to fine-tuning? MEND closer to LoRA?)
- KE uses the raw edit example as input and produces a single rank-1 mask and rank-1 offset over the fine-tuning gradient. MEND instead maps the model's own gradients into model edits (rough signature contrast below).
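
Rough signature-level contrast as I read it; the names are mine, not either paper's API.

```python
# Schematic only -- my paraphrase of the two editing interfaces.

def ke_edit(editor, raw_edit_example, finetune_grad):
    """KE: the editor reads the raw edit example (x_e, y_e) and emits a
    rank-1 mask and rank-1 offset that are applied to the fine-tuning
    gradient to form the weight update."""
    mask, offset = editor(raw_edit_example)   # both rank-1
    return mask * finetune_grad + offset

def mend_edit(editor, finetune_grad):
    """MEND: the editor takes the fine-tuning gradient itself as input and
    maps it directly to the layer's parameter edit."""
    return editor(finetune_grad)
```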
Open Questions:
- Are they generating a different weight edit on a per-token basis? Why?
- How are they applying the rank-1 model edit for each token in the input sequence?
- If they're producing a layer's parameter edit as an output, why do they care about producing the low-rank embedding of the deltas? Accumulate and apply them.
- Are they only applying the deltas at test time so that they can keep the low-rank deltas? And are they only keeping the low-rank deltas to avoid having to do parameter updates for all the weights?
Formulation:
- A base model is a differentiable function that maps an input x and parameters θ to an output y.
- A MEND model maps base model parameters θ, an edit input x_e, an edit label y_e, a loss function l_e, and optional parameters to a new set of model parameters.
- The input to a MEND network g is the fine-tuning gradient ∇_{W_l} l_e(x_e, y_e, θ) at layer l, and the output is the layer's parameter edit, which the paper calls ∇̃_{W_l} (see the sketch after this list).
- "MEND outputs a rank-1 model edit for each token in the input and output sequence."
Aside:
- The inputs to the MEND network (the per-token gradient terms) can differ by orders of magnitude, so they are normalized before being fed to the editor.
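
One way to handle that scale mismatch (my guess at the idea, not necessarily the paper's exact scheme) is to standardize each input dimension before it reaches the editor network.

```python
import torch

def normalize(x, eps=1e-6):
    """Zero-mean, unit-variance per feature across the token/batch dimension."""
    return (x - x.mean(dim=0, keepdim=True)) / (x.std(dim=0, keepdim=True) + eps)

# Features spanning several orders of magnitude, as the gradient terms might.
grads = torch.randn(8, 512) * torch.logspace(-4, 2, 512)
print(normalize(grads).std(dim=0)[:3])   # roughly 1.0 after normalization
```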