Overview:
- Problem: Large models are expensive to retrain, and it's hard to correct individual facts they have learned.
- In a nutshell: MEND generates a weight delta that is decomposed into low-rank factors, much like a LoRA update (see the sketch after this list).
- "local, reliable, and general"
- "Local" means unrelated output is not changed. "Reliable" means the model takes the desired corrections. "General" meaning variations on similar questions which would need correction also are corrected.
- Works even on very large models.
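
To make "low-rank weight delta" concrete, here is a minimal sketch of my own (not the paper's code) of a rank-1 edit applied to one linear layer; the dimensions and step size are made-up placeholders.

```python
# A full weight delta for a d_out x d_in layer has d_out * d_in entries;
# a rank-1 delta is just an outer product of two vectors, i.e. a LoRA-style
# update with rank r = 1.
import torch

d_out, d_in = 1024, 1024
W = torch.randn(d_out, d_in)        # pretrained layer weight (placeholder)

u = torch.randn(d_in)               # "input-side" vector
delta = torch.randn(d_out)          # "output-side" vector

W_edit = torch.outer(delta, u)      # rank-1 delta, shape d_out x d_in
W_new = W - 1e-3 * W_edit           # apply with a small step size (assumed)

assert torch.linalg.matrix_rank(W_edit) == 1
```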
Differences from Prior Art:
- ENN encodes editability into the parameters of the model itself; MEND provides editability through a separate, independent editor model. (ENN is closer to fine-tuning? MEND closer to LoRA?)
- KE uses the raw edit example as input and produces a single rank-1 mask and rank-1 offset over the fine-tuning gradient. MEND instead maps the model's own gradients into model edits (rough signature contrast below).
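
Rough signature-level contrast as I read it; the names are mine, not either paper's API.

```python
# Schematic only -- my paraphrase of the two editing interfaces.

def ke_edit(editor, raw_edit_example, finetune_grad):
    """KE: the editor reads the raw edit example (x_e, y_e) and emits a
    rank-1 mask and rank-1 offset that are applied to the fine-tuning
    gradient to form the weight update."""
    mask, offset = editor(raw_edit_example)   # both rank-1
    return mask * finetune_grad + offset

def mend_edit(editor, finetune_grad):
    """MEND: the editor takes the fine-tuning gradient itself as input and
    maps it directly to the layer's parameter edit."""
    return editor(finetune_grad)
```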
Open Questions:
- Are they generating a different weight edit on a per-token basis? Why?
- How are they applying the rank-1 model edit for each token in the input sequence?
- If they're producing a layer's parameter edit as an output, why do they care about producing the low-rank embedding of the deltas? Accumulate and apply them.
- Are they only applying the deltas at test time so that they can keep the low-rank deltas? And are they only keeping the low-rank deltas to avoid having to do parameter updates for all the weights?
Formulation:
- A base model is a differentiable function that maps an input x and parameters θ to an output y.
- A MEND model maps base model parameters θ, an edit input x_e, an edit label y_e, a loss function l_e, and optional parameters to a new set of model parameters.
- The input to a MEND network g is the fine-tuning gradient ∇_{W_l} l_e(x_e, y_e, θ) at layer l, and the output is the layer's parameter edit, which the paper calls ∇̃_{W_l} (see the sketch after this list).
- "MEND outputs a rank-1 model edit for each token in the input and output sequence."
Aside:
- The inputs to the MEND network (the per-token gradient terms) can differ by orders of magnitude, so they are normalized before being fed to the editor.
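
One way to handle that scale mismatch (my guess at the idea, not necessarily the paper's exact scheme) is to standardize each input dimension before it reaches the editor network.

```python
import torch

def normalize(x, eps=1e-6):
    """Zero-mean, unit-variance per feature across the token/batch dimension."""
    return (x - x.mean(dim=0, keepdim=True)) / (x.std(dim=0, keepdim=True) + eps)

# Features spanning several orders of magnitude, as the gradient terms might.
grads = torch.randn(8, 512) * torch.logspace(-4, 2, 512)
print(normalize(grads).std(dim=0)[:3])   # roughly 1.0 after normalization
```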