To implement a deep neural network in Python, we will use the Keras library, since training a network involves a large amount of heavy numerical computation. Before looking at the implementation, let's understand the libraries that Keras builds on.
Theano:
Theano is an open-source numerical computation library for Python. It lets you express fast numerical computations in Python syntax and can run on both the CPU and the GPU. (Using the GPU pays off when the workload consists of many computationally heavy tasks that can run in parallel.)
TensorFlow:
TensorFlow is another open-source numerical computation library. It also runs on both the CPU and the GPU. It was originally developed by researchers and engineers from the Google Brain team within Google’s AI organization.
Both of these libraries are mainly used for research purposes, i.e. to build a deep network from scratch, which requires writing many lines of code. For practical purposes, the Keras library is used instead.
Keras:
The Keras library wraps Theano or TensorFlow. It lets the user build a deep neural network with very few lines of code.
Let’s see how to install these three libraries.
Note:
Please note that TensorFlow only supports the 64-bit version of Python 3.5 or higher. If you are using a 32-bit version, please upgrade your Anaconda Python installation before installing the libraries.
The simplest way to install the libraries is:
Open the Anaconda prompt and run the following commands:
pip install theano
pip install tensorflow
pip install keras
If any of these commands gives an error, you can use the following method instead.
- Create a new environment with Anaconda and Python 3.5:
conda create -n tensorflow python=3.5 anaconda
- Activate the environment:
activate tensorflow
- After this you can install Theano, TensorFlow and Keras:
conda install theano
conda install mingw libpython
pip install tensorflow
pip install keras
- Update the packages:
conda update --all
- Run Spyder:
spyder
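After installation, it can be useful to confirm that the three packages are visible to the Python interpreter you are actually running (for example, the one inside the activated environment). The following is a minimal sketch using only the standard library. The helper name `check_installed` is hypothetical and introduced here for illustration. It only checks that each package can be located and does not verify that it imports without errors.

```python
import importlib.util


def check_installed(packages=("theano", "tensorflow", "keras")):
    """Return {package_name: True/False} without importing the packages."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in packages
    }


# Report which of the three libraries can be found by this interpreter.
status = check_installed()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'NOT found'}")
```

If a package is reported as not found, the most likely cause is that it was installed into a different environment than the one currently active.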
Now that the libraries are installed, let’s understand the basic flow of implementing a deep neural network using Keras. Please note that this section discusses supervised classification.
a. Import the essential libraries.
b. Import the dataset.
c. Select the required and relevant features from the dataset.
d. Encode the categorical variables, if any.
e. Split the dataset into the training set and the test set.
f. Implement feature scaling: this step corresponds to the ‘Normalizing the inputs’ step discussed earlier and is strongly recommended because it makes learning faster.
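The preprocessing steps above can be sketched with NumPy alone. This is an illustration under stated assumptions, not a prescribed pipeline: the dataset is synthetic, the column names (`age`, `balance`, `country`) and the 80/20 split are hypothetical, and in practice scikit-learn's `OneHotEncoder`, `train_test_split` and `StandardScaler` are commonly used for steps d to f.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dataset: two numeric features, one categorical
# feature and a binary label (stands in for steps a to c).
n_samples = 200
age = rng.uniform(18, 70, size=n_samples)
balance = rng.uniform(0, 100_000, size=n_samples)
country = rng.choice(["France", "Spain", "Germany"], size=n_samples)
y = (age + balance / 2_000 > 60).astype(int)

# (d) One-hot encode the categorical column, dropping the first level
#     so that the encoded columns are not redundant.
levels = np.unique(country)  # sorted unique categories
one_hot = (country[:, None] == levels[None, 1:]).astype(float)
X = np.column_stack([age, balance, one_hot])

# (e) Shuffle the rows, then hold out 20% of them as the test set.
order = rng.permutation(n_samples)
n_test = int(0.2 * n_samples)
test_idx, train_idx = order[:n_test], order[n_test:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

# (f) Feature scaling: compute mean and standard deviation on the
#     training set only, then apply the same transform to the test set.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std
```

Computing the scaling statistics from the training set only keeps information from the test set out of the training process.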
a. Import the Keras library, along with the Sequential model class and the Dense layer class. (Sequential is used to initialize the network and Dense is used to add layers to the ANN.)
b. Initialize the network by creating a Sequential object, which will serve as the classifier.
c. Use the ‘add’ method to add the input and hidden layers. The add method takes a layer as its argument, and this layer can be created with ‘Dense’. Dense takes several arguments, described below:
- Number of nodes in the hidden layer: There is no fixed rule for choosing the number of nodes in a hidden layer, but one common tip is to take the average of the number of nodes in the input layer and in the output layer. For example, 11 input nodes and 1 output node give (11 + 1) / 2 = 6 hidden nodes. Otherwise you can experiment using parameter tuning.
- Initialization of weights: Use a uniform distribution to initialize the weights to small values close to zero. (The parameter value ‘uniform’ is used.)
- Activation function: Usually the rectifier function is selected for the hidden layers and the sigmoid function is used for the output layer.
- Number of nodes in the input layer: This parameter only needs to be specified for the first layer. For all subsequent layers, it is inferred automatically.
d. Add the output layer. Use the sigmoid activation function if there are 2 output classes. If there are more than 2 classes, use the softmax function.
e. Compile the network, i.e. apply gradient descent to it. Use the compile method for this. It takes the following arguments:
- Optimizer algorithm: The algorithm used to optimise and adjust the weights. Use stochastic gradient descent; ‘adam’ is a very efficient variant of it.
- Loss function: The cost function to be minimized. If there are two output classes, the function is ‘binary_crossentropy’; for more than 2 classes it is ‘categorical_crossentropy’.
- Metrics: The criterion used to evaluate the model. Usually it is ‘accuracy’.
f. Fit the ANN to the training set. Specify the batch size, i.e. the number of samples after which the weights are updated, and the number of epochs.
g. Predict the results for the test set.
h. Build a confusion matrix to check the accuracy on the test set.
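Steps a to h can be put together roughly as follows. This is a minimal sketch, not a definitive implementation. It assumes the Keras API bundled with TensorFlow 2.x (`tensorflow.keras`) and uses random stand-in data in place of a real, already-scaled dataset. The 11 input features, the two hidden layers, the batch size of 10, the 5 epochs and the 0.5 decision threshold are all hypothetical choices made for illustration. The `random_uniform` initializer is used here for the uniform weight initialization described above.

```python
import numpy as np
from tensorflow import keras  # (a) Keras as bundled with TensorFlow 2.x

# Hypothetical stand-in data: 11 already-scaled features, binary label.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(800, 11)).astype("float32")
X_test = rng.normal(size=(200, 11)).astype("float32")
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype("int32")
y_test = (X_test[:, 0] + X_test[:, 1] > 0).astype("int32")

n_inputs, n_outputs = X_train.shape[1], 1
n_hidden = (n_inputs + n_outputs) // 2  # averaging tip: (11 + 1) / 2 = 6

# (b) Initialize the network, (c) add the hidden layers,
# (d) add a sigmoid output layer for the two-class case.
classifier = keras.Sequential([
    keras.Input(shape=(n_inputs,)),  # input size is given once, up front
    keras.layers.Dense(n_hidden, kernel_initializer="random_uniform",
                       activation="relu"),
    keras.layers.Dense(n_hidden, kernel_initializer="random_uniform",
                       activation="relu"),
    keras.layers.Dense(n_outputs, kernel_initializer="random_uniform",
                       activation="sigmoid"),
])

# (e) Compile: optimizer, loss for two classes, and evaluation metric.
classifier.compile(optimizer="adam",
                   loss="binary_crossentropy",
                   metrics=["accuracy"])

# (f) Fit: weights are updated after every batch of 10 samples.
classifier.fit(X_train, y_train, batch_size=10, epochs=5, verbose=0)

# (g) Predict probabilities, then threshold them to get class labels.
y_prob = classifier.predict(X_test, verbose=0).ravel()
y_pred = (y_prob > 0.5).astype(int)

# (h) Confusion matrix: rows are true classes, columns are predictions.
cm = np.zeros((2, 2), dtype=int)
np.add.at(cm, (y_test, y_pred), 1)
accuracy = np.trace(cm) / cm.sum()
print(cm)
print(f"test accuracy: {accuracy:.3f}")
```

For more than two classes, the output layer would have one node per class with a softmax activation, and the loss would be ‘categorical_crossentropy’ with one-hot encoded labels.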