import torch

def get_jacobian(net, x, noutputs):
    # Tile the input so that each copy in the batch yields one row of the Jacobian.
    x = x.squeeze()
    x = x.detach().repeat(noutputs, 1)
    x.requires_grad_(True)
    y = net(x)
    # Row i of the identity matrix selects output i of batch element i.
    y.backward(torch.eye(noutputs))
    return x.grad
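
A quick sanity check of the trick above, as a minimal sketch rather than part of the original gist: for a plain linear layer, the Jacobian with respect to the input is the weight matrix, so the batched result can be compared against it. The helper name `batched_jacobian` and the layer sizes are assumptions chosen for illustration.

```python
import torch

def batched_jacobian(net, x, noutputs):
    # Same idea as get_jacobian above: tile the input, one copy per output.
    x = x.squeeze().detach().repeat(noutputs, 1)
    x.requires_grad_(True)
    y = net(x)
    # The identity matrix picks output i from batch element i.
    y.backward(torch.eye(noutputs))
    return x.grad

torch.manual_seed(0)
net = torch.nn.Linear(3, 2)   # hypothetical toy network: 3 inputs, 2 outputs
x = torch.randn(3)
J = batched_jacobian(net, x, noutputs=2)

# For y = W x + b, dy/dx is W, so J should equal the layer's weight.
matches = torch.allclose(J, net.weight)
```

Note that the forward pass here runs on a batch of size `noutputs`, which is where the memory cost of this approach comes from.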
        
      
RylanSchaeffer commented Apr 29, 2020 via email
I came across this page about a year ago. It's a really nice trick, but it's a pity that it needs to forward pass a large batch, which puts a huge strain on my GPU memory. Recently I found an interesting way to bypass this problem, a year after I first ran into it: https://github.com/ChenAo-Phys/pytorch-Jacobian
If I'm understanding this correctly, this code will forward pass noutputs times just to compute the Jacobian once (but do it in a vectorized way)... The 1.5.0 autograd Jacobian computation seems to compute the output once but then for-loops over it and calls backward one output at a time (see @rjeli's first comment), which will for sure be slow... Both tradeoffs seem suboptimal.
Anyone know if there's an update on this? Or is pytorch really not meant to compute jacobians?
@justinblaber, autodiff computes either Jacobian-vector products (forward mode) or vector-Jacobian products (reverse mode). The Jacobian is a full matrix, and there's no easy way to recover it from a single product. Either you perform multiple backward passes, using a different standard basis vector on each pass, or you blow up the batch size and do one massive backward pass. There's no way around this.
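
The "multiple backward passes" option described above can be sketched as follows. This is an illustrative version, not code from the thread: the helper name `jacobian_by_rows` and the test function `f` are hypothetical. Each pass multiplies a standard basis vector into the Jacobian from the left, which yields one row.

```python
import torch

def jacobian_by_rows(f, x):
    # One reverse-mode pass per output: pass i computes e_i^T J, i.e. row i.
    x = x.detach().requires_grad_(True)
    y = f(x)
    rows = []
    for i in range(y.numel()):
        e_i = torch.zeros_like(y)
        e_i.view(-1)[i] = 1.0
        # retain_graph=True keeps the graph alive for the next pass.
        (row,) = torch.autograd.grad(y, x, grad_outputs=e_i, retain_graph=True)
        rows.append(row.reshape(-1))
    return torch.stack(rows)

def f(x):
    # Hypothetical test function: 2 inputs, 2 outputs.
    return torch.stack([x[0] * x[1], x[1] ** 2])

J = jacobian_by_rows(f, torch.tensor([2.0, 3.0]))
```

Compared with the batched trick, this does a single forward pass but needs as many backward passes as there are outputs.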
How about this experimental API for the Jacobian: https://pytorch.org/docs/stable/_modules/torch/autograd/functional.html#jacobian
Is it any good?
I took a look and:
    for j in range(out.nelement()):
        vj = _autograd_grad((out.reshape(-1)[j],), inputs, retain_graph=True, create_graph=create_graph)
It's just for-looping over the output and computing the gradient one by one (i.e. each row of the jacobian one by one). This will for sure be slow as hell if you have a lot of outputs. I actually think it's a tad bit deceiving that they advertise this functionality, because really the functionality just isn't there.
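
For reference, calling the API under discussion looks roughly like this. It is a minimal sketch assuming PyTorch 1.5.0 or later, where `torch.autograd.functional.jacobian` is available; the test function `f` is hypothetical.

```python
import torch
from torch.autograd.functional import jacobian

def f(x):
    # Hypothetical test function: 2 inputs, 2 outputs.
    return torch.stack([x[0] * x[1], x[1] ** 2])

x = torch.tensor([2.0, 3.0])
# One row per output, one column per input.
J = jacobian(f, x)
```

The call is convenient, but as noted above, the implementation quoted here still loops over the outputs internally, so the cost grows with the number of outputs.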
And to be honest, I originally wanted the Jacobian to do some Gauss-Newton-style optimization, but I've since discovered that the optim.LBFGS optimizer (now built into PyTorch) might work well for my problem. I think it even has some line-search machinery built in. So for now I don't think I need the Jacobian anymore.
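
A minimal sketch of what using `torch.optim.LBFGS` can look like, assuming a small least-squares problem (the data `A`, `b`, and `w_true` are made up for illustration). The optimizer requires a closure because it re-evaluates the loss several times per step, and `line_search_fn="strong_wolfe"` enables its built-in line search.

```python
import torch

torch.manual_seed(0)
# Hypothetical least-squares problem: recover w_true from noiseless data.
A = torch.randn(20, 3)
w_true = torch.tensor([1.0, -2.0, 0.5])
b = A @ w_true

w = torch.zeros(3, requires_grad=True)
opt = torch.optim.LBFGS([w], line_search_fn="strong_wolfe")

def closure():
    # Recompute the loss and its gradient each time the optimizer asks.
    opt.zero_grad()
    loss = ((A @ w - b) ** 2).sum()
    loss.backward()
    return loss

for _ in range(5):
    opt.step(closure)

final_loss = closure().item()
```

No explicit Jacobian is formed here; the optimizer only needs gradients of the scalar loss.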