Created November 12, 2011 15:44
Neural Network Cost Function
function [J grad] = nnCostFunction(nn_params, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, ...
                                   X, y, lambda)
%NNCOSTFUNCTION Implements the neural network cost function for a two layer
%neural network which performs classification
%   [J grad] = NNCOSTFUNCTION(nn_params, input_layer_size, hidden_layer_size, ...
%   num_labels, X, y, lambda) computes the cost and gradient of the neural
%   network. The parameters for the neural network are "unrolled" into the
%   vector nn_params and need to be converted back into the weight matrices.
%
%   The returned parameter grad should be an "unrolled" vector of the
%   partial derivatives of the neural network.
%
% Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
% for our 2 layer neural network
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));
% Setup some useful variables
m = size(X, 1);
% You need to return the following variables correctly
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));
% ====================== YOUR CODE HERE ======================
% Instructions: You should complete the code by working through the
%               following parts.
%
% Part 1: Feedforward the neural network and return the cost in the
%         variable J. After implementing Part 1, you can verify that your
%         cost function computation is correct by verifying the cost
%         computed in ex4.m
%
% Part 2: Implement the backpropagation algorithm to compute the gradients
%         Theta1_grad and Theta2_grad. You should return the partial derivatives of
%         the cost function with respect to Theta1 and Theta2 in Theta1_grad and
%         Theta2_grad, respectively. After implementing Part 2, you can check
%         that your implementation is correct by running checkNNGradients
%
%         Note: The vector y passed into the function is a vector of labels
%               containing values from 1..K. You need to map this vector into a
%               binary vector of 1's and 0's to be used with the neural network
%               cost function.
%
%         Hint: We recommend implementing backpropagation using a for-loop
%               over the training examples if you are implementing it for the
%               first time.
%
% Part 3: Implement regularization with the cost function and gradients.
%
%         Hint: You can implement this around the code for
%               backpropagation. That is, you can compute the gradients for
%               the regularization separately and then add them to Theta1_grad
%               and Theta2_grad from Part 2.
%
% Part 1: feedforward pass and regularized cost
X = [ones(m, 1) X];          % add the bias column to the inputs
y = eye(num_labels)(y,:);    % one-hot encode the labels (chained indexing is Octave-only syntax)
a1 = X;
z2 = a1 * Theta1';
a2 = sigmoid(z2);
n = size(a2, 1);
a2 = [ones(n,1) a2];         % add the bias unit to the hidden layer
z3 = a2 * Theta2';
a3 = sigmoid(z3);
% The bias columns (first column of each Theta) are excluded from regularization
regularization = (lambda/(2*m)) * (sum(sum((Theta1(:,2:end)).^2)) + sum(sum((Theta2(:,2:end)).^2)));
J = ((1/m) * sum(sum((-y .* log(a3))-((1-y) .* log(1-a3))))) + regularization;
% Part 2: backpropagation, vectorized over all m training examples
delta_3 = a3 - y;
delta_2 = (delta_3 * Theta2(:,2:end)) .* sigmoidGradient(z2);
delta_cap2 = delta_3' * a2;
delta_cap1 = delta_2' * a1;
% Part 3: regularized gradients; the bias columns are then un-regularized
Theta1_grad = ((1/m) * delta_cap1) + ((lambda/m) * (Theta1));
Theta2_grad = ((1/m) * delta_cap2) + ((lambda/m) * (Theta2));
Theta1_grad(:,1) -= ((lambda/m) * (Theta1(:,1)));   % "-=" is Octave-only syntax
Theta2_grad(:,1) -= ((lambda/m) * (Theta2(:,1)));
% -------------------------------------------------------------
% =========================================================================
% Unroll gradients
grad = [Theta1_grad(:) ; Theta2_grad(:)];
end
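The instructions in the file suggest verifying the gradients with checkNNGradients, a course-provided script that is not part of this gist. As a rough illustration of the idea only, the sketch below compares an analytic gradient against a central-difference estimate. The cost function `costFunc`, the test point `theta`, and the step `epsilon` are stand-ins chosen for this example, not values from the course code.

```octave
% Stand-in cost with a known gradient (assumption for illustration):
% J(theta) = sum(theta.^2), so dJ/dtheta = 2 * theta.
costFunc = @(t) sum(t .^ 2);
theta    = [0.5; -1.2; 2.0];
analytic = 2 * theta;

% Central-difference estimate of each partial derivative.
epsilon = 1e-4;
numgrad = zeros(size(theta));
for i = 1:numel(theta)
    perturb = zeros(size(theta));
    perturb(i) = epsilon;
    numgrad(i) = (costFunc(theta + perturb) - costFunc(theta - perturb)) / (2 * epsilon);
end

% A small relative difference suggests the analytic gradient is correct.
relative_diff = norm(numgrad - analytic) / norm(numgrad + analytic);
```

To apply the same check to nnCostFunction, `costFunc` would wrap a call to it with fixed data and the first output `J` would be compared against the returned `grad`; this requires the course helpers `sigmoid` and `sigmoidGradient` on the path.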
Author denzilc commented Jul 29, 2012 via email
You should refer to the class. This code was made for last year's class.
The Coursera forums will be more appropriate. You should check them out.
Regards,
Denzil
On Mon, Jul 30, 2012 at 12:03 AM, Shishir01 < ***@***.*** > wrote:
If you have any ideas about using conjugate gradient descent to optimize the values of Theta1 and Theta2, please share your approach.
Not anything specific at the moment.
On Monday, July 30, 2012, Shishir01 wrote:
If you have any ideas about using conjugate gradient descent to optimize the
values of Theta1 and Theta2, please share your approach.
Regards,
Denzil
How can I draw a performance plot with this function?
y = eye(num_labels)(y,:);
This line gives me an error in MATLAB. How can I correct it?
ey = eye(num_labels);
y = ey(y,:);
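The two-step form works in both environments because MATLAB does not allow indexing directly into the result of a function call, whereas Octave does. A small worked example, with labels and `num_labels` made up for illustration:

```octave
num_labels = 4;          % assumed number of classes for this example
y = [2; 4; 1; 2];        % made-up labels in the range 1..num_labels

ey = eye(num_labels);    % store the identity matrix first
Y  = ey(y, :);           % row i of Y is the one-hot encoding of y(i)
```

Each row of `Y` has a single 1 in the column given by the corresponding label, which is the binary form the cost function expects.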
If y = [1;1;0;0;0;0;0;0;1;1], then
ey = eye(num_labels);
y = ey(y,:);
will fail, because 0 is not a valid row index.
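One possible workaround, assuming the labels run from 0 to K-1 rather than the 1..K range the function documents, is to shift them by one before indexing. This remapping is an assumption about the caller's data and is not part of the original code; `num_labels = 2` is likewise chosen only for this example.

```octave
y = [1;1;0;0;0;0;0;0;1;1];
num_labels = 2;          % assumed: two classes, labelled 0 and 1

ey = eye(num_labels);
Y  = ey(y + 1, :);       % row 1 of ey encodes label 0, row 2 encodes label 1
```

With this shift, column 1 of the network output corresponds to label 0, so any prediction code would need to subtract 1 again when mapping the most active output unit back to a label.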