Implementing an Artificial Neural Network in Pure Java (No external dependencies)
/**
 *
 * @author Deus Jeraldy
 * @Email: [email protected]
 * BSD License
 */
// np.java -> https://gist.github.com/Jeraldy/7d4262db0536d27906b1e397662512bc

import java.util.Arrays;

public class NN {

    public static void main(String[] args) {
        // XOR training data: four input pairs and their labels
        double[][] X = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        double[][] Y = {{0}, {1}, {1}, {0}};

        int m = 4;        // number of training examples
        int nodes = 400;  // hidden-layer size

        // Transpose so that each column holds one training example
        X = np.T(X);
        Y = np.T(Y);

        // Parameters: random weights, zero biases (note: one bias column per example)
        double[][] W1 = np.random(nodes, 2);
        double[][] b1 = new double[nodes][m];
        double[][] W2 = np.random(1, nodes);
        double[][] b2 = new double[1][m];

        for (int i = 0; i < 4000; i++) {
            // Forward propagation
            // LAYER 1
            double[][] Z1 = np.add(np.dot(W1, X), b1);
            double[][] A1 = np.sigmoid(Z1);

            // LAYER 2
            double[][] Z2 = np.add(np.dot(W2, A1), b2);
            double[][] A2 = np.sigmoid(Z2);

            double cost = np.cross_entropy(m, Y, A2);
            // costs.getData().add(new XYChart.Data(i, cost));

            // Back propagation
            // LAYER 2
            double[][] dZ2 = np.subtract(A2, Y);
            double[][] dW2 = np.divide(np.dot(dZ2, np.T(A1)), m);
            double[][] db2 = np.divide(dZ2, m);

            // LAYER 1
            // NOTE: (1 - A1^2) is the tanh derivative; for sigmoid activations
            // the corresponding term would be A1 * (1 - A1).
            double[][] dZ1 = np.multiply(np.dot(np.T(W2), dZ2), np.subtract(1.0, np.power(A1, 2)));
            double[][] dW1 = np.divide(np.dot(dZ1, np.T(X)), m);
            double[][] db1 = np.divide(dZ1, m);

            // Gradient descent update (learning rate 0.01)
            W1 = np.subtract(W1, np.multiply(0.01, dW1));
            b1 = np.subtract(b1, np.multiply(0.01, db1));
            W2 = np.subtract(W2, np.multiply(0.01, dW2));
            b2 = np.subtract(b2, np.multiply(0.01, db2));

            if (i % 400 == 0) {
                print("==============");
                print("Cost = " + cost);
                print("Predictions = " + Arrays.deepToString(A2));
            }
        }
    }

    // Small console helper so the class compiles with no external dependencies
    private static void print(String s) {
        System.out.println(s);
    }
}
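For reference, evaluating the trained network is just the forward-propagation half of the loop above, without the gradient updates. The sketch below is my own reconstruction of how such a test could be wired up against the same np helper class linked at the top; the predict method and the Xtest name are introduced here for illustration and are not part of the gist.

    // Hypothetical helper (not in the gist): run a forward pass with the
    // trained parameters. Xtest is expected in the same layout as the
    // training X after np.T(): 2 rows (features), one column per example.
    static double[][] predict(double[][] Xtest, double[][] W1, double[][] b1,
                              double[][] W2, double[][] b2) {
        double[][] Z1 = np.add(np.dot(W1, Xtest), b1); // layer 1 pre-activation
        double[][] A1 = np.sigmoid(Z1);                // layer 1 activation
        double[][] Z2 = np.add(np.dot(W2, A1), b2);    // layer 2 pre-activation
        return np.sigmoid(Z2);                         // shape [1][columns of Xtest]
    }

Note that b1 and b2 were allocated with one column per training example (m = 4), so this forward pass is tied to batch positions rather than being a function of each input pair alone; that detail is relevant to the behaviour described in the comments below.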
However, the NN hasn't learned what I thought it would. When I test it with four input pairs, no matter which values I give it, it outputs the pattern it learned from the training data:
test input=[[1.0, 0.0, 1.0, 1.0], [0.0, 1.0, 1.0, 1.0]]
Cost = 0.02185745791739911
Test Prediction = [[0.003408446482782547, 0.9872463542830693, 0.9452257141301419, 0.014738669159170737]]
test input=[[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
Cost = 0.03372416671312117
Test Prediction = [[0.002095908947609844, 0.9235729681325365, 0.998439571534914, 0.05041614843514693]]
So I think the NN has "learned" to reproduce the static pattern from the training data rather than to perform an XOR operation on a given input pair.
When I test with three pairs, it outputs the static results for the first three training examples:
test input=[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
Cost = 0.0018916002564269888
Test Prediction = [[0.00336438364815533, 0.9975507610865536, 0.9982574181545878]]
I can make every prediction wrong simply by inverting the test inputs relative to the training data.
It doesn't seem like the X inputs make any difference, only the Y values do.
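One possible explanation (my own guess, not something stated in the gist): b1 and b2 are allocated as [nodes][m] and [1][m], i.e. a separate bias column for every training example, and db1/db2 are likewise per-column, so each batch position learns its own bias. At test time the output in column j is then driven largely by the bias trained for training example j, regardless of what X values sit in that column. In a conventional layer the bias is per unit and is broadcast across the batch. Below is a minimal sketch of that alternative using plain arrays; the layer1 method name and signature are hypothetical, not from the gist or np.java.

    // Hypothetical illustration: layer-1 pre-activations with a per-unit
    // bias broadcast across the batch, instead of per-example bias columns.
    static double[][] layer1(double[][] W1, double[][] X, double[] b1) {
        int nodes = W1.length;   // hidden units
        int m = X[0].length;     // batch size (any number of columns)
        double[][] Z1 = new double[nodes][m];
        for (int i = 0; i < nodes; i++) {
            for (int j = 0; j < m; j++) {
                double sum = b1[i];                  // same bias for every column
                for (int k = 0; k < X.length; k++) {
                    sum += W1[i][k] * X[k][j];       // weighted sum of the inputs
                }
                Z1[i][j] = sum;
            }
        }
        return Z1;
    }

With a per-unit bias like this, the prediction for a pair depends only on that pair, so a single (x1, x2) input could be evaluated regardless of batch size or column position.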