import keras.backend as K
import numpy as np

def gaussian_nll(ytrue, ypreds):
    """Keras implementation of the multivariate Gaussian negative log-likelihood loss.

    This implementation assumes a diagonal covariance matrix.

    Parameters
    ----------
    ytrue: tf.tensor of shape [n_samples, n_dims]
        ground truth values
    ypreds: tf.tensor of shape [n_samples, n_dims*2]
        predicted mu and logsigma values (e.g. by your neural network)

    Returns
    -------
    neg_log_likelihood: float
        negative log-likelihood averaged over samples

    This loss can then be used as a target loss for any Keras model, e.g.:

    model.compile(loss=gaussian_nll, optimizer='Adam')
    """
    n_dims = int(int(ypreds.shape[1]) / 2)
    mu = ypreds[:, 0:n_dims]
    logsigma = ypreds[:, n_dims:]

    # -0.5 * sum_i (y_i - mu_i)^2 / sigma_i^2  (squared-error term)
    mse = -0.5 * K.sum(K.square((ytrue - mu) / K.exp(logsigma)), axis=1)
    # -sum_i log(sigma_i), i.e. -log|Sigma|^(1/2) for a diagonal covariance
    sigma_trace = -K.sum(logsigma, axis=1)
    # -(D/2) * log(2*pi), the normalization constant
    log2pi = -0.5 * n_dims * np.log(2 * np.pi)

    log_likelihood = mse + sigma_trace + log2pi

    return K.mean(-log_likelihood)
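For reference, the quantity the code computes for each sample is the negative log-density of a Gaussian with diagonal covariance:

$$-\log p(y \mid \mu, \sigma) = \frac{1}{2}\sum_{i=1}^{D}\frac{(y_i-\mu_i)^2}{\sigma_i^2} + \sum_{i=1}^{D}\log\sigma_i + \frac{D}{2}\log 2\pi$$

Up to sign, the three terms are the mse, sigma_trace, and log2pi variables above; the sums over the D dimensions appear because a diagonal covariance makes the density factorize over independent dimensions.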
Hi. Sorry for the inconvenience. Could you provide a complete example of a neural network using such a cost function? I could not understand how to obtain the mean and variance parameters from the network. Thanks in advance for your attention, Gledson.
Hi Gledson!
Take a look at this small network: https://gist.github.com/sergeyprokudin/bb66fff8c672f8caab6bbb1056c7bd20
You can compile this model with
model.compile(loss=gaussian_nll, optimizer='Adam')
and use its predict_prob function to get mean and variance estimates.
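(The linked gist isn't reproduced here, but a minimal sketch of such a model could look like this; the input size, layer widths, and the predict_prob helper below are illustrative assumptions, not the exact code from the linked gist:)

```python
import numpy as np
from keras.models import Model
from keras.layers import Input, Dense

n_dims = 1                                  # dimensionality of the regression target (assumed)
x_input = Input(shape=(10,))                # 10 input features (assumed)
h = Dense(64, activation='relu')(x_input)
# A single head of width 2*n_dims: the first half is mu, the second half
# is log(sigma), matching the layout gaussian_nll expects.
y_output = Dense(2 * n_dims, activation='linear')(h)

model = Model(x_input, y_output)
model.compile(loss=gaussian_nll, optimizer='Adam')

def predict_prob(model, x):
    """Split raw network outputs into mean and variance estimates."""
    ypreds = model.predict(x)
    n = ypreds.shape[1] // 2
    mu = ypreds[:, :n]
    sigma = np.exp(ypreds[:, n:])           # undo the log
    return mu, sigma ** 2                   # mean and variance
```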
Hope this helps,
Sergey
Hello sergeyprokudin, thank you very much. I have another doubt: don't you use softmax to predict a multiclass output?
Can I do it with softmax and softplus?
mean = Dense(n_outputs, activation='softmax')(x)
sigma = Dense(n_outputs, activation='softplus')(x)
model = Model(x_input, [mean, sigma])
I'm afraid you are confusing regression and classification tasks. If you are interested in classification, you don't need the Gaussian negative log-likelihood loss defined in this gist: you can use the standard categorical crossentropy loss (https://www.tensorflow.org/api_docs/python/tf/keras/losses/CategoricalCrossentropy) and softmax activations to get valid class probabilities that sum to 1. You don't need to model sigmas separately, as (in theory) your softmax outputs already provide you with confidence estimates. In practice, however, you might want to calibrate them (see https://arxiv.org/abs/1706.04599 for a discussion of the topic).
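As a concrete illustration of post-hoc calibration, here is a minimal sketch of the temperature scaling method from that paper; the validation logits and labels below are synthetic placeholders, and the network producing them is assumed, not shown:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def nll_at_temperature(t, logits, y):
    """Mean negative log-likelihood of labels y under softmax(logits / t)."""
    z = logits / t
    z = z - z.max(axis=1, keepdims=True)    # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()

# Hypothetical held-out validation set: replace with your model's logits/labels.
rng = np.random.default_rng(0)
val_logits = 3.0 * rng.normal(size=(500, 10))   # deliberately over-confident
val_y = rng.integers(0, 10, size=500)

# Fit a single scalar temperature by minimizing validation NLL.
res = minimize_scalar(nll_at_temperature, bounds=(0.05, 20.0),
                      args=(val_logits, val_y), method='bounded')
temperature = res.x
# At test time, divide logits by `temperature` before applying softmax.
```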
Hello @sergeyprokudin, how are you? In fact I would like to use a Gaussian layer in my classification model, which could calculate mean and variance. Is this possible? Thank you very much.
The Gaussian distribution is defined over a continuous domain, while in classification you typically want to model the parameters of some categorical distribution. What would be the implied interpretation of mean and variance in your case?
Yes, I now understand your explanation. In this case, could I consider the mean to be the softmax value?
Best regards.
The class with the maximum probability value is the mode of the corresponding categorical probability distribution, not its mean, which is undefined in this case. Hope this helps!
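For illustration (model and x_test are placeholders for a trained classifier with a softmax output):

```python
probs = model.predict(x_test)            # shape [n_samples, n_classes]; rows sum to 1
predicted_class = probs.argmax(axis=1)   # the mode of each categorical distribution
confidence = probs.max(axis=1)           # probability mass assigned to that mode
```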
Okay, now I understand. My doubts were clarified. Thank you very much for the information.
Best Regards.
Hi,
why do you use sum in this piece of code?
sigma_trace = -K.sum(logsigma, axis=1)
Hi, may I know how to solve this error?
"ValueError: Dimensions must be equal, but are 128 and 64 for '{{node gaussian_nll/sub}} = Sub[T=DT_FLOAT](Cast, gaussian_nll/strided_slice)' with input shapes: [?,128,128,3], [?,64,128,3]."
Thanks for sharing your code!
There is a little error: in mse you need to divide by K.exp(2*logsigma) instead of K.exp(logsigma).
Oops, it's fine as it is, I had misread the brackets: K.square((ytrue - mu) / K.exp(logsigma)) already divides the squared error by K.exp(2*logsigma). Perfect, sorry for the noise!