Neural Network in JavaScript
Friday, August 5th, 2016
Neural Networks are useful for modeling labeled data. Once the weights have been optimized for a specific data set, the network can predict an output for new input data of the same shape. The same idea extends to predicting several classes for a given combination of inputs, although this post covers a single binary output.
Neural Network Layout
A Neural Network consists of an input layer, at least one hidden layer and an output layer. This post covers the implementation of a binary classifier with many inputs leading to one output. The hidden layer consists of units that apply a sigmoid activation to the weighted sum of their inputs, and the output unit does the same with the hidden activations. Computing an output value for a set of given input values in this way is known as forward propagation.
The difference between the actual output and the output computed with the current weights is the error. The cost function condenses these errors over the whole data set into a single number, and a well-trained network has a very low cost by the time training completes.
We train the Neural Network by finding the derivative of the cost function with respect to the weights and subtracting this gradient, scaled by a learning rate, from the current weights to get updated weights. With a suitable learning rate this causes the cost to decrease over many iterations. Updating the weights repeatedly in this way is known as Gradient Descent. After training, the Neural Network produces an approximate output for a set of inputs.
We will use matrices to compute the forward propagation and to update the weights. Each math operation in the Neural Network is structured as a separate function: sigmoid, forwardPropagation, sigmoid_Derivative, costFunction, costFunction_Derivative, gradientDescent, train_network and predict_result.
Neural Network
var Neural_Network = function() {
    this.MathJS = require('mathjs'); // matrix operations
    this.q = require('q');           // promises
};
This is the constructor for the implementation. It loads the two npm modules that the other functions rely on.
Sigmoid
Neural_Network.prototype.sigmoid = function(z) {
    // Wrap a plain number in a 1x1 matrix so the element-wise operators below apply.
    var scope = {
            z: (typeof(z) === "number") ? this.MathJS.matrix([[z]]) : z
        },
        sigmoid;
    scope.ones = this.MathJS.ones(scope.z.size()[0], scope.z.size()[1]);
    sigmoid = this.MathJS.eval('(ones+(e.^(z.*-1))).^-1', scope); // 1/(1+e^(-z)), element-wise
    return sigmoid;
};
This function implements the sigmoid, where 'z' is a matrix containing the sum of each input multiplied by its weight. To perform matrix operations the 'mathjs' npm module is used. The sigmoid function squashes every element of 'z' into the range between 0 and 1: large negative values end up close to 0, large positive values end up close to 1, and 0 maps to 0.5.
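To see this behaviour on plain numbers, here is a minimal sketch that does not use 'mathjs'. The helper name sigmoidScalar is hypothetical and is not part of the module above.

```javascript
// Hypothetical scalar helper (not part of the module above): 1 / (1 + e^(-z)).
function sigmoidScalar(z) {
  return 1 / (1 + Math.exp(-z));
}

// Large negative inputs approach 0, zero maps to 0.5,
// and large positive inputs approach 1.
var samples = [-10, -1, 0, 1, 10].map(sigmoidScalar);
console.log(samples);
```

Every value in the printed list lies strictly between 0 and 1, and the values rise as the input rises.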
Forward Propagation
Neural_Network.prototype.forwardPropagation = function(X) {
    // Use the given input if there is one, otherwise fall back to the training inputs.
    // (A Matrix object is always truthy, so 'matrix(X) || this.x' could never fall back.)
    var input = (X !== undefined) ? this.MathJS.matrix(X) : this.x,
        y_result;
    this.z2 = this.MathJS.multiply(input, this.W1);   // weighted sums entering the hidden layer
    this.a2 = this.sigmoid(this.z2);                  // hidden layer activations
    this.z3 = this.MathJS.multiply(this.a2, this.W2); // weighted sums entering the output unit
    y_result = this.sigmoid(this.z3);                 // network output
    return y_result;
};
This is an implementation of the forward propagation algorithm, which computes the output value for any given combination of input values. It also stores the intermediate values 'z2', 'a2' and 'z3' on the instance so that the cost function derivative can reuse them. In the Neural Network this function can be used to predict a new value after training is complete, or its output can be used during training to find the error.
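The two matrix products and two sigmoid calls can also be written out on plain nested arrays, which makes the shapes easier to follow. This is only a sketch: matMul, sigmoidAll and forward are hypothetical helpers, and the weights are made up for illustration rather than taken from a trained network.

```javascript
// Hypothetical helpers on plain nested arrays; the article's code uses 'mathjs' instead.
function matMul(A, B) {
  return A.map(function (row) {
    return B[0].map(function (_, j) {
      return row.reduce(function (sum, a, k) { return sum + a * B[k][j]; }, 0);
    });
  });
}

function sigmoidAll(M) {
  return M.map(function (row) {
    return row.map(function (z) { return 1 / (1 + Math.exp(-z)); });
  });
}

// X: samples x inputs, W1: inputs x hidden, W2: hidden x 1.
function forward(X, W1, W2) {
  var a2 = sigmoidAll(matMul(X, W1)); // hidden activations
  return sigmoidAll(matMul(a2, W2));  // one output per sample
}

// Made-up weights, chosen only to show the shapes involved.
var W1 = [[0.5, -0.5, 0.1], [0.3, 0.8, -0.2]]; // 2 inputs -> 3 hidden units
var W2 = [[1.0], [-1.0], [0.5]];               // 3 hidden units -> 1 output
var out = forward([[1, 0], [0, 1]], W1, W2);
console.log(out); // a 2 x 1 array of values strictly between 0 and 1
```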
Sigmoid Derivative
Neural_Network.prototype.sigmoid_Derivative = function(z) {
    var scope = {
            z: z
        },
        sigmoid_Derivative;
    scope.ones = this.MathJS.ones(z.size()[0], z.size()[1]);
    sigmoid_Derivative = this.MathJS.eval('(e.^(z.*-1))./(ones+(e.^(z.*-1))).^2', scope); // e^(-z)/(1+e^(-z))^2, element-wise
    return sigmoid_Derivative;
};
This is the implementation of the sigmoid derivative. It is used by the cost function derivative, which in turn is used to update the weights of the Neural Network.
Cost Function
Neural_Network.prototype.costFunction = function() {
    var J;
    var scope = {};
    this.y_result = this.forwardPropagation(this.x);
    scope.y_result = this.y_result;
    scope.y = this.y;
    // Half the squared error of every sample, summed and divided by the number of samples.
    J = this.MathJS.sum(this.MathJS.eval('0.5*((y-y_result).^2)', scope))/this.x.size()[0];
    return J;
};
This is the cost function for the Neural Network. It returns half of the mean squared difference between the predicted outputs and the values in the 'y' matrix, so a lower value means the predictions are closer to the data set.
Cost Function Derivative
Neural_Network.prototype.costFunction_Derivative = function() {
    this.y_result = this.forwardPropagation(this.x);
    var scope = {};
    scope.y_result = this.y_result;
    scope.y = this.y;
    scope.diff = this.MathJS.eval('-(y-y_result)', scope);
    scope.sigmoid_Derivative_z3 = this.sigmoid_Derivative(this.z3);
    // Error signal at the output layer.
    var del_3 = this.MathJS.eval('diff.*sigmoid_Derivative_z3', scope);
    var dJdW2 = this.MathJS.multiply(this.MathJS.transpose(this.a2), del_3);
    // Propagate the error signal back through W2 to the hidden layer.
    scope.arrA = this.MathJS.multiply(del_3, this.MathJS.transpose(this.W2));
    scope.arrB = this.sigmoid_Derivative(this.z2);
    var del_2 = this.MathJS.eval('arrA.*arrB', scope);
    var dJdW1 = this.MathJS.multiply(this.MathJS.transpose(this.x), del_2);
    return [dJdW1, dJdW2];
};
This is the implementation of the cost function derivative. It computes the derivative of the cost function with respect to the weights in the first layer, 'dJdW1', and in the second layer, 'dJdW2'. Using these gradients the weights can be updated during Gradient Descent. The gradients are summed over all samples and, unlike the cost, are not divided by the number of rows, so they are larger than the gradient of the reported cost by that constant factor. The learning rate absorbs the difference.
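One way to gain confidence in a derivative like this is to compare it with a finite-difference estimate. The sketch below does that for a deliberately tiny case: one input, one hidden unit, one output and one sample, so every matrix becomes a plain number. All function names here are hypothetical, and the cost is left undivided because there is only one sample.

```javascript
// Hypothetical scalar versions of the article's quantities.
function sig(z) { return 1 / (1 + Math.exp(-z)); }
function sigPrime(z) { var s = sig(z); return s * (1 - s); } // equals e^(-z) / (1 + e^(-z))^2

// Cost for a single sample.
function cost(x, y, w1, w2) {
  var yHat = sig(sig(x * w1) * w2);
  return 0.5 * Math.pow(y - yHat, 2);
}

// Same chain of steps as costFunction_Derivative, written with scalars.
function analyticGradients(x, y, w1, w2) {
  var z2 = x * w1, a2 = sig(z2), z3 = a2 * w2, yHat = sig(z3);
  var del3 = -(y - yHat) * sigPrime(z3);
  var dJdW2 = a2 * del3;
  var del2 = del3 * w2 * sigPrime(z2);
  var dJdW1 = x * del2;
  return [dJdW1, dJdW2];
}

// Central finite differences as an independent estimate of the same slopes.
function numericGradients(x, y, w1, w2) {
  var h = 1e-5;
  return [
    (cost(x, y, w1 + h, w2) - cost(x, y, w1 - h, w2)) / (2 * h),
    (cost(x, y, w1, w2 + h) - cost(x, y, w1, w2 - h)) / (2 * h)
  ];
}

var analytic = analyticGradients(1.5, 1, 0.4, -0.7);
var numeric = numericGradients(1.5, 1, 0.4, -0.7);
console.log(analytic, numeric); // the two pairs should agree to several decimal places
```

If the two pairs disagree noticeably, the analytic derivative is the first place to look for a mistake.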
Gradient Descent
Neural_Network.prototype.gradientDescent = function() {
    var gradient = new Array(2),
        cost,
        scope = {},
        deferred = this.q.defer(),
        i = 0;
    console.log('Training ...\n');
    while (1) {
        gradient = this.costFunction_Derivative();
        scope.W1 = this.W1;
        scope.W2 = this.W2;
        scope.rate = this.learningRate;
        scope.dJdW1 = gradient[0];
        scope.dJdW2 = gradient[1];
        // Step each weight matrix against its gradient.
        this.W2 = this.MathJS.eval('W2 - rate*dJdW2', scope);
        this.W1 = this.MathJS.eval('W1 - rate*dJdW1', scope);
        cost = this.costFunction();
        // Stop once the cost falls below 1/threshold (1/e^6 when no threshold is given).
        if (cost < (1 / (this.threshold || this.MathJS.exp(6)))) {
            deferred.resolve();
            break;
        }
        if (i % 100 === 0) {
            console.log({'iteration': i, 'cost': cost}); // log the cost to help diagnose how learning is progressing
        }
        i++;
    }
    return deferred.promise;
};
Gradient Descent updates the weights and lowers the overall cost of the Neural Network. Each weight matrix is updated by subtracting the product of the learning rate and its gradient. The function runs in a loop until the cost has dropped below an acceptable bound, which is 1 divided by the threshold. The loop is synchronous and has no iteration limit, so the promise is resolved only once the bound is reached, and the function does not return if the cost never gets that low.
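The same loop is easier to follow on a one-dimensional problem. The sketch below minimises a made-up cost J(w) = (w - 3)^2 and stops when the cost drops below a bound. The function descend and its maxIterations argument are hypothetical; the iteration cap in particular is an addition for safety that the loop above does not have.

```javascript
// Hypothetical 1-D example: minimise J(w) = (w - 3)^2, whose derivative is 2 * (w - 3).
function descend(rate, bound, maxIterations) {
  var w = 0, iterations = 0;
  while ((w - 3) * (w - 3) >= bound && iterations < maxIterations) {
    w = w - rate * 2 * (w - 3); // step against the gradient
    iterations++;
  }
  return { w: w, cost: (w - 3) * (w - 3), iterations: iterations };
}

var result = descend(0.1, 1e-6, 10000);
console.log(result); // w ends up close to 3 and the cost ends up below the bound
```

With a learning rate that is too large for this cost, for example 1.5, every step overshoots by more than it corrects and the cost grows instead of shrinking, which is why the cap is useful in practice.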
Train Network
Neural_Network.prototype.train_network = function(learningRate, threshold, X, Y) {
    var message;
    this.x = this.MathJS.matrix(X);
    this.y = this.MathJS.matrix(Y);
    this.threshold = threshold;
    if (this.y.size()[0] !== this.x.size()[0]) {
        message = 'Please change the size of the input matrices so that X and Y have the same number of rows.';
    } else if (this.y.size()[1] > 1) {
        message = 'Please change the size of the input matrix Y so that it has only one column, because this is a single-output classifier.';
    }
    if (message) {
        console.log('\n' + message);
        // Return a rejected promise so that callers chaining .then() always receive a promise.
        return this.q.reject(new Error(message));
    }
    this.inputLayerSize = this.x.size()[1];
    this.outputLayerSize = 1;
    this.hiddenLayerSize = this.x.size()[1] + 1;
    this.learningRate = learningRate || 0.5;
    // Start from random weights between -5 and 5.
    this.W1 = this.MathJS.random(this.MathJS.matrix([this.inputLayerSize, this.hiddenLayerSize]), -5, 5);
    this.W2 = this.MathJS.random(this.MathJS.matrix([this.hiddenLayerSize, this.outputLayerSize]), -5, 5);
    return this.gradientDescent();
};
This function sets up all the default variables and starts the Gradient Descent. Its parameters are the learning rate (0.5 if omitted), the threshold that sets the cost bound, and the training data set (X, Y). The hidden layer gets one unit more than there are inputs, and both weight matrices start with random values between -5 and 5.
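The threshold parameter is easy to misread, because gradientDescent compares the cost against 1 divided by the threshold rather than against the threshold itself. The small sketch below mirrors that test on its own. The helper name costBound is hypothetical.

```javascript
// Hypothetical helper mirroring the stopping test in gradientDescent:
// cost < 1 / (threshold || e^6).
function costBound(threshold) {
  return 1 / (threshold || Math.exp(6));
}

console.log(costBound(undefined)); // roughly 0.0025
console.log(costBound(1000));      // 0.001
```

A larger threshold therefore means a stricter bound and a longer training run.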
Predict Result
Neural_Network.prototype.predict_result = function(X) {
    var y_result = this.forwardPropagation(X);
    return y_result;
};
Finally, once the Neural Network has been trained, it can predict an output for any combination of input values, provided each row has the same number of values as the rows used in training.
Sample Usage
var Neural_Network = require('advanced-neural-network');
var nn = new Neural_Network();
nn.train_network(0.9, undefined /*optional threshold value*/, [
    [1, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 1]
], [
    [1],
    [0],
    [1],
    [1],
    [0],
    [1],
    [1],
    [0]
]).then(function() {
    // then() needs a function; calling console.log(...) directly would run it before then() is invoked.
    console.log(nn.predict_result([[1, 0, 0, 1, 0, 1]]));
});
/*Output
Training ...
{ iteration: 0, cost: 1.383523290363864 }
{ iteration: 100, cost: 0.04008406998951956 }
{ iteration: 200, cost: 0.016181475081737937 }
{ iteration: 300, cost: 0.009841798424077541 }
{ iteration: 400, cost: 0.0069985481625215226 }
{ iteration: 500, cost: 0.005402782030422182 }
{ iteration: 600, cost: 0.00438707375793734 }
{ iteration: 700, cost: 0.003686178233980667 }
{ iteration: 800, cost: 0.003174502735338863 }
{ iteration: 900, cost: 0.0027851304470238596 }
{ iteration: 1000, cost: 0.0024792318930790076 }
{ _data: [ [ 0.030592746473324182 ] ],
_size: [ 1, 1 ],
_datatype: undefined }
*/
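Since this is a binary classifier, the predicted value still has to be turned into a class. A minimal sketch, assuming the number has already been read out of the returned matrix and assuming a cut-off of 0.5, which this post does not specify. The helper name toLabel is hypothetical.

```javascript
// Hypothetical helper: turn the network's output into a 0/1 label.
// The 0.5 cut-off is an assumption, not something the article specifies.
function toLabel(probability) {
  return probability >= 0.5 ? 1 : 0;
}

console.log(toLabel(0.030592746473324182)); // 0
```

For the sample above this gives class 0, which matches the label of the two training rows with the same input values.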