Recall the standard difference-in-differences setup:
$$y_i = \gamma_0 + \gamma_1 post_i treat_i + \gamma_2 treat_i + \gamma_3 post_i + e_i$$
This is equivalent to estimating a separate equation for each group, first the control group ($treat_i = 0$) and then the treatment group ($treat_i = 1$):
$$y_i = \gamma_0 + \gamma_3 post_i + e_i$$
$$y_i = (\gamma_0 + \gamma_2) + (\gamma_1 + \gamma_3) post_i + e_i$$
Or,
$$y_i = \beta_0^r + \beta_1^r post_i + e_i$$
where $r \in \{C, T\}$ denotes the control and treatment groups, so that $\beta_1^C = \gamma_3$ and $\beta_1^T = \gamma_1 + \gamma_3$. The DD estimate is defined as
$$\gamma_1 = \beta_1^T - \beta_1^C$$
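The equivalence above can be checked numerically. The following is a minimal sketch on simulated data: the coefficient values, sample size, and helper names (`ols`, `slope_on_post`) are assumptions made for illustration, not part of the original setup. It fits the pooled regression and the two group-wise regressions and compares the interaction coefficient with the difference in slopes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Simulated indicators; all coefficient values below are hypothetical.
treat = rng.integers(0, 2, size=n)
post = rng.integers(0, 2, size=n)
y = 1.0 + 2.0 * post * treat + 0.5 * treat + 0.3 * post + rng.normal(size=n)


def ols(X, y):
    """Return least-squares coefficients for y regressed on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


# Pooled regression: constant, post*treat, treat, post (same order as gamma_0..gamma_3).
X_pooled = np.column_stack([np.ones(n), post * treat, treat, post])
gamma_hat = ols(X_pooled, y)


def slope_on_post(mask):
    """Slope on post from regressing y on a constant and post within one group."""
    X_group = np.column_stack([np.ones(mask.sum()), post[mask]])
    return ols(X_group, y[mask])[1]


beta1_T = slope_on_post(treat == 1)
beta1_C = slope_on_post(treat == 0)

# The interaction coefficient and the difference in group slopes should coincide.
print(gamma_hat[1], beta1_T - beta1_C)
```

The two printed numbers agree up to floating-point error because the pooled model is saturated in $post_i$ and $treat_i$: both approaches reproduce the four cell means.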
However, the true forms of the regressions are
$$y_i = \beta_0^r + \beta_1^r post_i + \beta_2^r t + \epsilon_i$$
where $\beta_2^r t$ captures time-varying unobserved effects. If $t$ is omitted, these effects are absorbed into the error $e_i$, and $post_i$ becomes endogenous because it is correlated with $t$. Estimating the short regression then gives the usual omitted variable bias:
$$E(\hat\beta_1^r|post_i) = \beta_1^r + \beta_2^r \cdot \frac{Cov(post_i, t)}{Var(post_i)}$$
And so the expectation of the DD estimator becomes
$$E(\hat\gamma_1 | post_i) = E(\hat\beta_1^T - \hat\beta_1^C|post_i) = (\beta_1^T - \beta_1^C) + \left[\beta_2^T \cdot \frac{Cov(post_i, t)}{Var(post_i)} - \beta_2^C \cdot \frac{Cov(post_i, t)}{Var(post_i)}\right]$$
Now the clever part. Assume that $\beta_2^T = \beta_2^C$, that is, the time-varying unobserved effects affect both treatment and control groups equally. This is the common trends assumption! Then the bracketed bias term cancels and $E(\hat\gamma_1|post_i)$ reduces to
$$E(\hat\gamma_1|post_i) = \beta_1^T - \beta_1^C = \gamma_1$$
So, under the common trends assumption the DD estimator is unbiased for $\gamma_1$. In other words, common trends is what identifies the difference-in-differences effect.
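The cancellation can also be illustrated with a small simulation. This is a sketch under stated assumptions: the parameter values, the ten-period panel with treatment starting at period 5, and the function names `short_slope` and `dd_estimate` are all hypothetical choices, not taken from the references. Each group's outcome follows the true regression with a time trend, but the estimated regression omits $t$.

```python
import numpy as np


def short_slope(post, y):
    """OLS slope on post when y is regressed on a constant and post (t omitted)."""
    X = np.column_stack([np.ones_like(post, dtype=float), post])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]


def dd_estimate(beta2_T, beta2_C, noise_sd=0.0, n_units=200, n_periods=10, seed=0):
    """DD estimate from group-wise short regressions on simulated data.

    Hypothetical true values: beta_1^T = 3 and beta_1^C = 1, so gamma_1 = 2.
    """
    rng = np.random.default_rng(seed)
    t = np.tile(np.arange(n_periods), n_units).astype(float)
    post = (t >= n_periods // 2).astype(float)  # treatment period starts halfway
    beta1 = {"T": 3.0, "C": 1.0}
    beta2 = {"T": beta2_T, "C": beta2_C}
    slopes = {}
    for r in ("C", "T"):
        noise = rng.normal(0.0, noise_sd, size=t.size)
        y = 0.5 + beta1[r] * post + beta2[r] * t + noise
        slopes[r] = short_slope(post, y)
    return slopes["T"] - slopes["C"]


# Common trends: the omitted-variable terms cancel, so the estimate is close to gamma_1 = 2.
print(dd_estimate(beta2_T=0.4, beta2_C=0.4, noise_sd=1.0))

# Diverging trends: the estimate is shifted by (beta_2^T - beta_2^C) * Cov(post, t) / Var(post).
print(dd_estimate(beta2_T=0.6, beta2_C=0.4, noise_sd=1.0))
```

In this design $Cov(post_i, t)/Var(post_i)$ equals the gap between the mean post-period and mean pre-period time index, which is $7 - 2 = 5$, so the second case carries a bias of $0.2 \cdot 5 = 1$ on top of sampling noise.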
Gruber et al (2023) appendix 5
Wooldridge