This section provides a quick introduction to CNN (Convolutional Neural Network), which extends the classical neural network model by adding a sequence of mixed convolutional and pooling layers after the input layer.
What Is CNN (Convolutional Neural Network)?
A CNN extends the classical neural network model by adding two types of
special layers that act as filters on the input layer:
Convolutional Layer - Convolves a feature pattern that exists in
input samples with the full set of input features.
The main purpose of a convolutional layer is to promote the given pattern
in the sample by enhancing feature structures similar to the pattern and
suppressing other feature patterns.
Pooling Layer (Subsampling Layer) - Downsizes the feature set while preserving the convolved feature patterns.
The main purpose of a pooling layer is to reduce the computation effort without
losing too much accuracy.
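The two layer types can be illustrated with a minimal sketch in plain Python. It is not taken from any library: the function names convolve_same and max_pool_2x2, the zero padding at the borders, the 2x2 maximum pooling, and the hand-picked vertical-line pattern are all assumptions made for illustration. In a real CNN the pattern values are learned during training.

```python
def convolve_same(image, kernel):
    """Slide the kernel over the image, padding the borders with zeros
    so the output has the same height and width as the input.
    As in most deep learning code, the kernel is not flipped."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    if 0 <= rr < h and 0 <= cc < w:  # outside the image counts as 0
                        total += image[rr][cc] * kernel[i][j]
            out[r][c] = total
    return out


def max_pool_2x2(feature_map):
    """Keep the largest value of every non-overlapping 2x2 block."""
    h, w = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


# A 4x4 sample containing a vertical line in column 1.
image = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

# Hypothetical feature pattern that responds to vertical lines.
vertical_line = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

convolved = convolve_same(image, vertical_line)  # still 4x4
pooled = max_pool_2x2(convolved)                 # downsized to 2x2

print(convolved)
print(pooled)
```

The convolved output has large positive values where the line is and negative values next to it, and the pooled output still shows the line on the left side while using a quarter of the values.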
Note that a large set of input features may have many small feature patterns,
so we need to apply multiple convolutional layers in parallel, each with a different pattern to enhance.
But adding an extra parallel convolutional layer will
add another set of filtered features to the total number of features.
Also note that additional convolutional layers can be used after
the feature set has been downsized by a pooling layer. Feature patterns
used in a convolutional layer after a pooling layer are considered to
enhance higher level feature structures.
So a sequence of mixed convolutional layers and pooling layers can be applied
to the input layer until you reach a good feature set size with all feature patterns
preserved. Then you add some classical neural network layers, called dense layers,
to complete a CNN model.
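How the feature set size changes along such a sequence can be traced with a small sketch. Everything in it is a hypothetical choice for illustration: the helper names, the 64x64 single-channel input, the rule of doubling the number of parallel convolutional layers in each block, and the target of 2048 features. It also assumes that a convolutional layer keeps the height and width of its input and that a pooling layer halves both.

```python
def conv(shape, filters):
    """Group of parallel convolutional layers: height and width are kept
    (assumes zero padding), one output channel per filter."""
    height, width, _ = shape
    return (height, width, filters)


def pool(shape, factor=2):
    """Pooling layer: height and width shrink by the given factor."""
    height, width, channels = shape
    return (height // factor, width // factor, channels)


def feature_count(shape):
    height, width, channels = shape
    return height * width * channels


def build_feature_extractor(input_shape, first_filters, target_features):
    """Add conv + pool blocks until the feature set is small enough."""
    shape = input_shape
    filters = first_filters
    steps = [("input", shape)]
    while feature_count(shape) > target_features and min(shape[0], shape[1]) >= 2:
        shape = conv(shape, filters)
        steps.append(("conv", shape))
        shape = pool(shape)
        steps.append(("pool", shape))
        filters *= 2  # assumption: later blocks look for more patterns
    return steps


steps = build_feature_extractor((64, 64, 1), first_filters=8, target_features=2048)
for kind, shape in steps:
    print(kind, shape, feature_count(shape))

final_shape = steps[-1][1]
```

With these assumed settings the loop stops after three blocks, and the remaining features would then be passed to the dense layers.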
Here is a good picture that illustrates a typical CNN model processing 2-dimensional
input features like images (source: easy-tensorflow.com):
The above CNN model contains the following layers:
Input Layer - 784 nodes for 784 features (28x28 grey scale intensities)
of images of handwritten digits.
Convolutional Layer Group 1 - 16 parallel convolutional layers to enhance 16 lower level
feature patterns. Total number of nodes is 28x28x16 = 12544.
Pooling Layer Group 1 - 16 parallel pooling layers to downsize the
16 convolutional layers in group 1 from 28x28 to 14x14.
Total number of nodes is 14x14x16 = 3136.
Convolutional Layer Group 2 - 32 parallel convolutional layers to enhance 32 higher level
feature patterns. Total number of nodes is 14x14x32 = 6272.
Pooling Layer Group 2 - 32 parallel pooling layers to downsize the
32 convolutional layers in group 2 from 14x14 to 7x7.
Total number of nodes is 7x7x32 = 1568.
Dense Layer - 1 classic neural network layer
connecting 1568 nodes from Pooling Layer Group 2 to 128 nodes.
Output Layer - 1 classic neural network layer
connecting 128 nodes from the previous layer to generate an output vector of 10 elements.
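The node counts in the list above can be checked with a few lines of Python. The shapes are read directly from the list. The parameter counts for the two classic layers are an added assumption: they suppose each layer is fully connected and has one bias value per output node, which the description above does not state.

```python
# Output shape of each layer as (height, width, channels), read off the list above.
layer_shapes = [
    ("Input Layer", (28, 28, 1)),
    ("Convolutional Layer Group 1", (28, 28, 16)),
    ("Pooling Layer Group 1", (14, 14, 16)),
    ("Convolutional Layer Group 2", (14, 14, 32)),
    ("Pooling Layer Group 2", (7, 7, 32)),
]


def node_count(shape):
    height, width, channels = shape
    return height * width * channels


node_counts = {name: node_count(shape) for name, shape in layer_shapes}


def dense_parameters(inputs, outputs):
    # Assumption: fully connected, one weight per input/output pair
    # plus one bias per output node.
    return inputs * outputs + outputs


dense_params = dense_parameters(node_counts["Pooling Layer Group 2"], 128)
output_params = dense_parameters(128, 10)

for name, count in node_counts.items():
    print(name, count)
print("Dense Layer parameters:", dense_params)
print("Output Layer parameters:", output_params)
```

Under that assumption, most of the trainable values in the two classic layers sit in the dense layer, which is one reason the pooling layers downsize the feature set before it.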