```python
import torch

def jacobian(y, x, create_graph=False):
    jac = []
    flat_y = y.reshape(-1)
    grad_y = torch.zeros_like(flat_y)
    for i in range(len(flat_y)):
        # select one output element at a time with a one-hot grad_y,
        # so each autograd.grad call yields one row of the Jacobian
        grad_y[i] = 1.
        grad_x, = torch.autograd.grad(flat_y, x, grad_y, retain_graph=True, create_graph=create_graph)
        jac.append(grad_x.reshape(x.shape))
        grad_y[i] = 0.
    return torch.stack(jac).reshape(y.shape + x.shape)

def hessian(y, x):
    # the Hessian is the Jacobian of the Jacobian; create_graph=True keeps
    # the graph of the first pass alive so it can be differentiated again
    return jacobian(jacobian(y, x, create_graph=True), x)

def f(x):
    return x * x * torch.arange(4, dtype=torch.float)

x = torch.ones(4, requires_grad=True)
print(jacobian(f(x), x))
print(hessian(f(x), x))
```
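For reference: since f(x)_i = c_i * x_i^2 with c = (0, 1, 2, 3), the Jacobian is ∂f_i/∂x_j = 2 c_i x_i δ_ij, so at x = (1, 1, 1, 1) the first print should give diag(0, 2, 4, 6). The second derivatives are ∂²f_i/∂x_j ∂x_k = 2 c_i δ_ij δ_ik, so the printed Hessian of shape (4, 4, 4) is zero everywhere except the entries H[i, i, i] = 2 c_i.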
I did manage to get the code to run now. I made a "simplification" that broke it. Your function `f` is:

```python
def f(x):
    return x * x * torch.arange(4, dtype=torch.float)
```

while mine was:

```python
def f(x):
    return x * x
```

I've since fixed it to:

```python
def f(x):
    return x * x * torch.ones_like(x)
```

and it works like a charm. @apaszke any idea why that is the case?
You can switch `torch.ones_like(x)` to `1` and it still works...
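One plausible explanation (my guess from reading the code, not confirmed in this thread): `jacobian` reuses a single `grad_y` buffer and modifies it in place. With `create_graph=True`, the backward of a bare `x * x` saves `grad_y` itself as an operand of the resulting `2 * x * grad_y`, so the next in-place write `grad_y[i] = 0.` bumps its version counter and the second differentiation inside `hessian` fails. Multiplying by a constant (`torch.arange(4, ...)`, `torch.ones_like(x)`, or even the scalar `1`) inserts an extra multiplication, so the tensor saved for double-backward is the fresh product `grad_y * c` rather than `grad_y` itself. A minimal sketch of the suspected failure mode (the exact error is an assumption):

```python
import torch

x = torch.ones(4, requires_grad=True)
grad_y = torch.zeros(4)
grad_y[0] = 1.
# vjp of y = x * x: grad_x = 2 * x * grad_y; with create_graph=True,
# autograd saves grad_y itself for a later double-backward
grad_x, = torch.autograd.grad(x * x, x, grad_y, create_graph=True)
grad_y[0] = 0.  # in-place write bumps grad_y's version counter
# differentiating grad_x again should now raise "one of the variables
# needed for gradient computation has been modified by an inplace operation"
torch.autograd.grad(grad_x.sum(), x)
```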
Hello Adam! How could I give credit to you if I use this code? Could it be a docstring in the documentation, a paper citation, or something else?
Now the function `torch.autograd.functional.jacobian` can do the same thing, I think:
```python
import torch

def jacobian(y, x, create_graph=False):
    jac = []
    flat_y = y.reshape(-1)
    grad_y = torch.zeros_like(flat_y)
    for i in range(len(flat_y)):
        grad_y[i] = 1.
        # note: create_graph is hardcoded to True here; the original gist
        # passes the create_graph argument through instead
        grad_x, = torch.autograd.grad(flat_y, x, grad_y, retain_graph=True, create_graph=True)
        jac.append(grad_x.reshape(x.shape))
        grad_y[i] = 0.
    return torch.stack(jac).reshape(y.shape + x.shape)

def hessian(y, x):
    return jacobian(jacobian(y, x, create_graph=True), x)

def f(xx):
    matrix = torch.tensor([[0.2618, 0.2033, 0.7280, 0.8618],
                           [0.1299, 0.6498, 0.6675, 0.0527],
                           [0.3006, 0.9691, 0.0824, 0.8513],
                           [0.7914, 0.2796, 0.3717, 0.9483]], requires_grad=True)
    # y = M x, so the Jacobian of f is just the matrix M
    return torch.einsum('ji, i -> j', (matrix, xx))

if __name__ == "__main__":
    x = torch.arange(4, dtype=torch.float, requires_grad=True)
    print(jacobian(f(x), x))
    print(torch.autograd.functional.jacobian(f, x).numpy())
```
Output:

```
tensor([[0.2618, 0.2033, 0.7280, 0.8618],
        [0.1299, 0.6498, 0.6675, 0.0527],
        [0.3006, 0.9691, 0.0824, 0.8513],
        [0.7914, 0.2796, 0.3717, 0.9483]], grad_fn=<ViewBackward>)
[[0.2618 0.2033 0.728  0.8618]
 [0.1299 0.6498 0.6675 0.0527]
 [0.3006 0.9691 0.0824 0.8513]
 [0.7914 0.2796 0.3717 0.9483]]
```
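Since `f` is the linear map x ↦ Mx, both implementations should simply recover the matrix, so they can be checked against each other; a minimal sketch, assuming the `jacobian`, `f`, and `x` defined in the code above:

```python
manual = jacobian(f(x), x).detach()
builtin = torch.autograd.functional.jacobian(f, x)
print(torch.allclose(manual, builtin))  # expected: True
```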
Hi,
I want to find the Hessian matrix of the loss function of a pre-trained neural network with respect to the network's parameters. How can I use this method to do that? Can someone please share an example? Thanks.
Hi,
I am looking for the same thing. Were you able to figure out how to do it?
I think this has now been added to recent versions of torch's autograd module. Maybe look at the examples here.
Right, I checked it. When I use this method, I get multiple errors. I am looking for an example or similar code to see how the implementation is done.
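For what it's worth, here is a minimal sketch of one way to get a Hessian with respect to a single parameter tensor using `torch.autograd.functional.hessian` (available since PyTorch 1.5). The network, data, and shapes are made-up stand-ins, and the functional re-implementation of the forward pass is an assumption about how your model is structured:

```python
import torch
import torch.nn as nn

# stand-in "pre-trained" network and data (hypothetical, for illustration)
model = nn.Sequential(nn.Linear(3, 4), nn.Tanh(), nn.Linear(4, 1))
inputs, targets = torch.randn(8, 3), torch.randn(8, 1)
loss_fn = nn.MSELoss()

def loss_wrt_w0(w0):
    # re-run the forward pass with w0 substituted for the first layer's
    # weight, so the loss becomes a function of that one tensor
    h = torch.tanh(inputs @ w0.t() + model[0].bias)
    out = h @ model[2].weight.t() + model[2].bias
    return loss_fn(out, targets)

w0 = model[0].weight.detach()
H = torch.autograd.functional.hessian(loss_wrt_w0, w0)
print(H.shape)  # torch.Size([4, 3, 4, 3]), i.e. w0.shape + w0.shape
```

A Hessian over all parameters at once needs the parameters flattened into one vector first; this sketch sidesteps that by treating one weight tensor at a time.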
Hello, I am relatively new to PyTorch and came across your Hessian function. It is much more elegant than some Hessian code from an academic paper that I am trying to reproduce. I've put together a toy example, but keep getting an error.
I've been scouring the docs and googling, but for the life of me I can't figure out what I'm doing wrong. Any help you could offer would be greatly appreciated!
Here is my code:
The output with anomaly detection turned on is here:
Finally, I'm running my code in a Google Colab notebook with PyTorch 1.4, if that makes a difference.
Thanks!