Here is a link to an alternative derivation based on the regularization perspective:
https://www.overleaf.com/read/zjxtnpyjncqb
I am not sure it is 100% correct, but it helped me better understand what was happening. Maybe someone else will find it useful too.
Nathan Jacobs