**Neural Network Tutorials - Herong's Tutorial Examples** - 1.20, by Dr. Herong Yang

Walk-Through on Tariq's Code

This section provides a walk-through session on the Python code associated with Tariq's book 'Make Your Own Neural Network'. Explanations are provided on all main statements of the code. Graphical illustrations are provided on some key matrix operations used in the code.

In the last tutorial we learned how to install and run the Python code associated with Tariq's book "Make Your Own Neural Network". Now let's walk through Tariq's code and learn the neural network model used in the code.

1. If you open Tariq's code in a text editor, you see that the first section provides copyright information and imports 2 libraries: NumPy and SciPy. The Matplotlib library is commented out since it is not used.

# %%
# python notebook for Make Your Own Neural Network
# code for a 3-layer neural network, and code for learning the MNIST dataset
# (c) Tariq Rashid, 2016
# license is GPLv2

# %%
import numpy
# scipy.special for the sigmoid function expit()
import scipy.special
# library for plotting arrays
#hy import matplotlib.pyplot
#hy # ensure the plots are inside this notebook, not an external window
#hy %matplotlib inline

2. The next section starts to define the "neuralNetwork" class with the
standard __init__() method, which allows us to create a generic neural
network with 3 layers. Four parameters, *inputnodes*, *hiddennodes*, *outputnodes*,
and *learningrate*, are provided to control the network size and the learning rate.

# %%
# neural network class definition
class neuralNetwork:

    # initialize the neural network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):

3. The next 4 lines of code copy the parameters *inputnodes*, *hiddennodes*,
and *outputnodes* to the instance variables *self.inodes*, *self.hnodes*, and *self.onodes*,
which represent the number of nodes in each layer.

        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

4. The next few lines of code initialize two weight matrices: *self.wih* for weights
on the links from input layer nodes to hidden layer nodes,
and *self.who* for weights
on the links from hidden layer nodes to output layer nodes.

        # link weight matrices, wih and who
        # weights inside the arrays are w_i_j, where link is from node i to node j in the next layer
        # w11 w21
        # w12 w22 etc
        self.wih = numpy.random.normal(0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
        self.who = numpy.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))

Note that each weight matrix is initialized
with random numbers drawn from a normal distribution with a mean of 0.0.
The standard deviation of this distribution is set to the inverse of the square root
of the number of nodes in the inbound layer, N, or N^{-0.5}, which is coded as
*pow(self.inodes, -0.5)* for the weight matrix *self.wih*, and
*pow(self.hnodes, -0.5)* for the weight matrix *self.who*.
Random numbers meeting these requirements are generated
by the NumPy function *numpy.random.normal()*.
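For example, here is a minimal sketch of this initialization (not part of Tariq's code, using hypothetical node counts of 3 input nodes and 5 hidden nodes instead of Tariq's 784 and 200) showing the resulting matrix shape and the standard deviation used:

# Minimal sketch (not Tariq's code): weight initialization for a 3-node
# input layer feeding a 5-node hidden layer, standard deviation 3^-0.5
import numpy

inodes, hnodes = 3, 5
wih = numpy.random.normal(0.0, pow(inodes, -0.5), (hnodes, inodes))
print(wih.shape)          # (5, 3): one row per hidden node, one column per input node
print(pow(inodes, -0.5))  # 0.5773..., the standard deviation of the random weights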

5. The next few lines of code copy the parameter *learningrate*
to the instance variable *self.lr*.
Another instance variable, *self.activation_function*, is also
created to register the logistic sigmoid function, provided as
scipy.special.expit() by the SciPy library.

        # learning rate
        self.lr = learningrate

        # activation function is the sigmoid function
        self.activation_function = lambda x: scipy.special.expit(x)

        pass
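For reference, scipy.special.expit() computes the logistic sigmoid 1/(1 + e^-x). A minimal check (not part of Tariq's code):

# Minimal sketch: scipy.special.expit() is the logistic sigmoid 1/(1 + exp(-x))
import numpy
import scipy.special

x = numpy.array([-2.0, 0.0, 2.0])
print(scipy.special.expit(x))        # [0.1192... 0.5 0.8807...]
print(1.0 / (1.0 + numpy.exp(-x)))   # same values, computed directly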

That's the end of the __init__() method of the "neuralNetwork" class.

6. Tariq's code continues to define the train() method of the "neuralNetwork" class.
The train() method takes two parameters, *inputs_list*
and *targets_list*, representing input values of a single training sample
and expected output values of the same sample.

    # train the neural network
    def train(self, inputs_list, targets_list):
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T

Note that the method parameters, *inputs_list*
and *targets_list*, are converted from 1-dimensional arrays (like [N]) to
2-dimensional matrices (like [1,N]) by the numpy.array() function with ndmin=2.
The resulting 2-dimensional matrices are then transposed (to [N,1])
to be ready for the matrix operations in the next step.
The transposed matrices are stored in the local variables *inputs* and
*targets*. The transposition operation *(...).T*
on a matrix can be illustrated graphically as:

    ( [ -  -  - ] ).T   =   [ - ]
                            [ - ]
                            [ - ]
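A minimal sketch of the same conversion (not part of Tariq's code), assuming a hypothetical 3-value input list:

# Minimal sketch: converting a plain list to a 2-D column matrix
import numpy

inputs_list = [0.5, 0.2, 0.9]
inputs = numpy.array(inputs_list, ndmin=2)    # shape (1, 3): a row matrix
print(inputs.shape)                           # (1, 3)
print(inputs.T.shape)                         # (3, 1): a column matrix, ready for numpy.dot()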

7. The next 4 lines of code move the signals of the training sample
from the input layer to the
hidden layer by performing a matrix dot product of the weight matrix
*self.wih* and the input matrix *inputs* using the numpy.dot() function.
The resulting matrix, *hidden_inputs*, is then passed through the
activation function to become the signal matrix of the hidden layer,
stored as *hidden_outputs*.

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

The above code can be illustrated graphically as:

    hidden_outputs   =   activation_function (  wih  dot  inputs  )

    [hnodes x 1]     =   activation_function (  [hnodes x inodes]  dot  [inodes x 1]  )
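A minimal sketch of the shapes involved (not part of Tariq's code), assuming hypothetical sizes of 3 input nodes and 5 hidden nodes:

# Minimal sketch: shape of the hidden-layer signals
import numpy
import scipy.special

inodes, hnodes = 3, 5
wih = numpy.random.normal(0.0, pow(inodes, -0.5), (hnodes, inodes))
inputs = numpy.array([0.5, 0.2, 0.9], ndmin=2).T       # shape (3, 1)

hidden_inputs = numpy.dot(wih, inputs)                 # (5, 3) dot (3, 1) -> (5, 1)
hidden_outputs = scipy.special.expit(hidden_inputs)    # still (5, 1), values in (0, 1)
print(hidden_outputs.shape)                            # (5, 1)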

9. The next 4 lines of code move the signals stored in the hidden layer
to the output layer by performing a matrix dot product of the weight matrix
*self.who* and the hidden layer signal matrix *hidden_outputs* using the
numpy.dot() function.
The resulting matrix, *final_inputs*, is then passed through the
activation function to become the signal matrix of the output layer,
stored as *final_outputs*.

        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

The above code can also be illustrated graphically as:

    final_outputs   =   activation_function (  who  dot  hidden_outputs  )

    [onodes x 1]    =   activation_function (  [onodes x hnodes]  dot  [hnodes x 1]  )

10. The code continues to calculate error values by comparing *final_outputs*
against the given *targets*. Those error values, *output_errors*,
are then distributed back to the hidden layer
according to the transposed weight matrix.

        # output layer error is the (target - actual)
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights, recombined at hidden nodes
        hidden_errors = numpy.dot(self.who.T, output_errors)

The above code can also be illustrated graphically as:

    hidden_errors   =   who.T  dot  output_errors

    [hnodes x 1]    =   [hnodes x onodes]  dot  [onodes x 1]
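A minimal sketch of how an output error is split back across the hidden nodes by the link weights (not part of Tariq's code, assuming a hypothetical network with 2 hidden nodes and 1 output node):

# Minimal sketch: output errors are redistributed in proportion to the link weights
import numpy

who = numpy.array([[0.3, 0.7]])          # 1 output node linked to 2 hidden nodes
output_errors = numpy.array([[0.5]])     # error at the single output node

hidden_errors = numpy.dot(who.T, output_errors)
print(hidden_errors)                     # [[0.15], [0.35]]: the error scaled by each link weight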

11. Then, it's time to adjust the weight matrices using the formula "adjustment = learning rate * ( (error * output * (1 - output)) dot input.T )", where *input* here means the signal matrix of the previous layer. The code starts with the weight matrix between the hidden and output layers.

        # update the weights for the links between the hidden and output layers
        self.who += self.lr * numpy.dot((output_errors * final_outputs * \
            (1.0 - final_outputs)), numpy.transpose(hidden_outputs))

Graphically, the weight matrix adjustment can be illustrated as:

    who  +=  lr * ( ( output_errors * final_outputs * (1 - final_outputs) )  dot  hidden_outputs.T )

    [onodes x hnodes]  +=  lr * ( [onodes x 1]  dot  [1 x hnodes] )
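The factor *output * (1 - output)* in this formula is the derivative of the logistic sigmoid: if y = sigmoid(x), then dy/dx = y * (1 - y). A minimal numerical check of this identity (not part of Tariq's code):

# Minimal sketch: numerical check that d(sigmoid)/dx = sigmoid(x) * (1 - sigmoid(x))
import scipy.special

x, h = 0.7, 1e-6
numeric = (scipy.special.expit(x + h) - scipy.special.expit(x - h)) / (2 * h)
analytic = scipy.special.expit(x) * (1 - scipy.special.expit(x))
print(numeric, analytic)    # both about 0.2217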

12. The next few lines adjust the weight matrix between the input and hidden layers in the same way as the previous step.

        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * \
            (1.0 - hidden_outputs)), numpy.transpose(inputs))

        pass

That's the end of the train() method of the "neuralNetwork" class.

13. Tariq's code continues to define the query() method of the "neuralNetwork" class.
The query() method takes only one parameter, *inputs_list*.
It performs only the forward signal propagation, in the same way as the train() method.

    # query the neural network
    def query(self, inputs_list):
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        return final_outputs

That's the end of the "neuralNetwork" class, which represents a 3-layer neural network model using logistic sigmoid function as the activation function.

14. Tariq's code continues to create an instance of the above
neural network model for the MNIST database, starting with the
model's parameters.
The *input_nodes* parameter is set to 784, because the handwritten
digit samples are normalized to 784 (28x28) pixels.
So the darkness value (or grey scale) of each pixel goes into
a single node in the input layer, without any truncation or padding.

# %%
# number of input, hidden and output nodes
input_nodes = 784
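As a minimal sketch (not part of Tariq's code), the 784 input values of one sample correspond to the 28x28 pixels of the original image:

# Minimal sketch: 784 pixel values correspond to a 28x28 image
import numpy

pixels = numpy.zeros(28 * 28)    # placeholder for one sample's 784 grey-scale values
print(pixels.size)               # 784
image = pixels.reshape(28, 28)   # the original 28x28 image layout
print(image.shape)               # (28, 28)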

15. The *hidden_nodes* parameter is set to 200 for no particular reason.
It should be large enough that the neural network has enough
memory (weight matrices) to remember handwritten digit patterns,
but not so large that it consumes too much computing resources.
We will do some experiments later on *hidden_nodes* to see its impact
on the neural network model.

hidden_nodes = 200

16. The *output_nodes* parameter is set to 10, because Tariq decided
to encode the 10 expected labels (to be recognized from input samples)
directly into 10 nodes in the output layer.
Each label is encoded by turning on a single node in the output layer.
For example, label 3 is expected as [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
in the output layer.
We will do some experiments later on *output_nodes* with different
encoding schemas to see their impact on the neural network model.

output_nodes = 10

17. The *learning_rate* parameter is set to 0.1 for no particular reason.
It should be large enough that the neural network can reach
a stable state quickly,
but not so large that the neural network jumps
back and forth around the stable point.
We will do some experiments later on *learning_rate* to see its impact
on the neural network model.

# learning rate
learning_rate = 0.1

18. The following code creates an instance of "neuralNetwork" with the above
parameters and stores it in a local variable *n*.

# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)
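At this point, the untrained network can already be queried. A minimal sketch (not part of Tariq's code), feeding 784 random grey-scale values:

# Minimal sketch: querying the untrained network with a random input
import numpy

random_inputs = numpy.random.rand(784)   # 784 random grey-scale values in [0, 1)
outputs = n.query(random_inputs)
print(outputs.shape)                     # (10, 1): one output value per label node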

19. The next few lines of code read in the training dataset (with 60,000 samples) of the MNIST database
and store it in a local variable *training_data_list*.

# %%
# load the mnist training data CSV file into a list
training_data_file = open("mnist_dataset/mnist_train.csv", 'r')
training_data_list = training_data_file.readlines()
training_data_file.close()

20. Tariq's code continues to train the neural network model
by running the training dataset 5 times (or 5 epochs).
Repeating the training multiple times can improve the accuracy of the model
if the training dataset is not big enough.
We will do some experiments later on *epochs* to see its impact
on the neural network model.

# %%
# train the neural network

# epochs is the number of times the training data set is used for training
epochs = 5

for e in range(epochs):

21. The next section of code loops through each sample in the training dataset. The code extracts the input values (grey scales of 784 pixels), represented in a single line in CSV format, from the second position to the end of the line. Remember that the first position stores the label of the expected digit of the sample. The input values are then normalized to the range of 0.01 to 1.00, as checked in the sketch after this code.

    # go through all records in the training data set
    for record in training_data_list:
        # split the record by the ',' commas
        all_values = record.split(',')
        # scale and shift the inputs
        inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
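A minimal check of the scaling formula (not part of Tariq's code): a raw pixel value of 0 maps to 0.01 and 255 maps to 1.00:

# Minimal sketch: the scaling formula maps raw pixel values 0..255 into 0.01..1.00
import numpy

raw = numpy.asfarray(['0', '128', '255'])    # CSV fields arrive as strings
scaled = (raw / 255.0 * 0.99) + 0.01
print(scaled)                                # [0.01  0.5069...  1.0]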

22. The code continues to prepare the expected output values and stores them in
*targets*.
Expected output values are set to 0.01 for all nodes, except the node
(set to 0.99) that corresponds to the expected label of the sample.

        # create the target output values (all 0.01, except the desired label which is 0.99)
        targets = numpy.zeros(output_nodes) + 0.01
        # all_values[0] is the target label for this record
        targets[int(all_values[0])] = 0.99
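For example, a minimal sketch of the resulting *targets* array, assuming the label of the current sample is 3:

# Minimal sketch: target encoding for a sample labelled 3
import numpy

targets = numpy.zeros(10) + 0.01
targets[3] = 0.99
print(targets)    # [0.01 0.01 0.01 0.99 0.01 0.01 0.01 0.01 0.01 0.01]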

23. Finally, the train() method is called with the above input and target values of the sample to train the neural network once. After that, the execution continues with the next training sample and the next epoch.

        n.train(inputs, targets)
        pass
    pass

That's the end of the training phase of the neural network model.

24. Tariq's code continues to evaluate the accuracy of the neural network model using the test dataset of the MNIST database by reading the test samples first.

# %%
# load the mnist test data CSV file into a list
test_data_file = open("mnist_dataset/mnist_test.csv", 'r')
test_data_list = test_data_file.readlines()
test_data_file.close()

25. The next section of code loops through the test dataset. The input values and the expected label are extracted from each line in CSV format, in the same way as in the training phase.

# %%
# test the neural network

# scorecard for how well the network performs, initially empty
scorecard = []

# go through all the records in the test data set
for record in test_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # correct answer is first value
    correct_label = int(all_values[0])
    # scale and shift the inputs
    inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01

26. The next few lines of code query the neural network model with the input values. The output values of the query are scanned for the node with the highest value to determine the predicted label.

    # query the network
    outputs = n.query(inputs)
    # the index of the highest value corresponds to the label
    label = numpy.argmax(outputs)
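A minimal sketch of how numpy.argmax() picks the predicted label, using made-up output values (not part of Tariq's code):

# Minimal sketch: numpy.argmax() returns the index of the largest output value
import numpy

outputs = numpy.array([0.02, 0.01, 0.05, 0.91, 0.03, 0.01, 0.02, 0.04, 0.06, 0.01])
print(numpy.argmax(outputs))    # 3, so the predicted label is 3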

27. The next few lines of code compare the predicted label against
the expected label.
The result is recorded in a local variable, *scorecard*.
After that, the execution continues with the next test sample.

    # append correct or incorrect to list
    if (label == correct_label):
        # network's answer matches correct answer, add 1 to scorecard
        scorecard.append(1)
    else:
        # network's answer doesn't match correct answer, add 0 to scorecard
        scorecard.append(0)
        pass
    pass

28. Finally, the test results are summarized as a performance score (the success rate on the test dataset) of the neural network model.

# %%
# calculate the performance score, the fraction of correct answers
scorecard_array = numpy.asarray(scorecard)
print("performance = ", scorecard_array.sum() / scorecard_array.size)

# %%
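A minimal sketch of the performance calculation, using a made-up scorecard of 4 test samples (not part of Tariq's code):

# Minimal sketch: the performance score is the fraction of correct answers
import numpy

scorecard = [1, 0, 1, 1]                 # 3 correct answers out of 4 test samples
scorecard_array = numpy.asarray(scorecard)
print(scorecard_array.sum() / scorecard_array.size)    # 0.75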

Well, that's a long code walk-through session. Hope you enjoyed it!
