This module describes each of the different types of layers we employed in our convolutional neural network.
Convolutional
Convolutional layers produce output feature maps by convolving their input with each of a set of kernels, where each kernel is trained to recognize a different characteristic. Each kernel is a square filter of weights. The first convolutional layer in our network convolves the input image with 20 5x5 kernels to produce 20 feature maps, and the second convolutional layer convolves its input (the pooled feature maps from the first layer) with 40 4x4 kernels to produce higher-level feature maps. Each neuron in our convolutional layers uses the rectified linear unit (ReLU), f(x) = max(0, x), as its activation function.
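As a rough illustration of the operation described above, the following sketch slides a bank of square kernels over a single 2-D image and applies ReLU to the result. It is a minimal sketch, not our network's code: the 28x28 input size, the stride of 1, the absence of padding and of bias terms, and the names `conv_layer`, `image`, and `kernels` are all assumptions made for the example. Like most deep-learning libraries, it computes the sliding dot product without flipping the kernel.

```python
import numpy as np


def conv_layer(image, kernels):
    """Slide each square kernel over a 2-D image (stride 1, no padding), then apply ReLU.

    image:   array of shape (height, width)
    kernels: array of shape (n_kernels, k, k)
    returns: array of shape (n_kernels, height - k + 1, width - k + 1)
    """
    n_kernels, k, _ = kernels.shape
    out_h = image.shape[0] - k + 1
    out_w = image.shape[1] - k + 1
    feature_maps = np.empty((n_kernels, out_h, out_w))
    for n in range(n_kernels):
        for i in range(out_h):
            for j in range(out_w):
                # Weighted sum of the k x k patch under the kernel.
                patch = image[i:i + k, j:j + k]
                feature_maps[n, i, j] = np.sum(patch * kernels[n])
    # ReLU: negative responses are clipped to zero.
    return np.maximum(feature_maps, 0.0)


rng = np.random.default_rng(0)
image = rng.random((28, 28))            # assumed input size, for illustration only
kernels = rng.normal(size=(20, 5, 5))   # 20 kernels of size 5x5, as in the first layer
maps = conv_layer(image, kernels)
print(maps.shape)  # (20, 24, 24)
```

Under these assumptions each 5x5 kernel yields a 24x24 feature map, since a kernel of width k fits into 28 - k + 1 positions along each axis.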
The first convolutional layer detects low-level features such as edges and the total “mass” of the image, while the second convolutional layer detects higher-level features, including intersections of the features detected in the first layer. These features are not specified by hand: what each kernel detects emerges during training, as the kernel weights are updated by stochastic gradient descent (SGD).
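To make the weight update concrete, here is a minimal sketch of a single SGD step applied to one kernel. The squared-error loss, the target value, the learning rate, and the names `sgd_step`, `patch`, and `target` are hypothetical choices for illustration; the loss and hyperparameters actually used to train our network are not described in this module.

```python
import numpy as np


def sgd_step(kernel, patch, target, learning_rate=0.05):
    """One gradient step on a single example for a hypothetical squared-error loss.

    loss = 0.5 * (sum(patch * kernel) - target) ** 2
    Returns the updated kernel and the loss measured before the update.
    """
    response = np.sum(patch * kernel)
    error = response - target
    grad = error * patch            # derivative of the loss with respect to the kernel
    return kernel - learning_rate * grad, 0.5 * error ** 2


rng = np.random.default_rng(0)
patch = rng.random((5, 5))                      # one 5x5 input patch (made-up data)
kernel = rng.normal(scale=0.1, size=(5, 5))     # randomly initialised kernel weights

losses = []
for _ in range(20):
    kernel, loss = sgd_step(kernel, patch, target=1.0)
    losses.append(loss)

print(losses[0], losses[-1])  # the loss shrinks as the kernel adapts to the patch
```

In a full network the gradient would come from backpropagation through all layers, and each step would use a different training example or mini-batch rather than the same patch.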
Pooling
Pooling layers produce an output by reducing the size of their input using some function. The output of each convolutional layer in our network is used as the input to a pooling layer. Each pooling layer takes 2x2 regions of its input and passes on the maximum value of each region. In this way, the pooling layer reduces the size of the data being handled in the network while still preserving the important features detected in the convolutional layers: because the maximum value in each region is kept, significant neuron activations survive the reduction.
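The 2x2 max pooling described above can be sketched as follows. The sketch assumes the regions do not overlap (a stride of 2) and that any odd trailing row or column is dropped; neither detail is stated in this module, and the function name `max_pool_2x2` is chosen for the example.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 region of a 2-D feature map."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Drop an odd trailing row/column (an assumption), then group pixels into 2x2 blocks.
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    blocks = trimmed.reshape(h2, 2, w2, 2)
    return blocks.max(axis=(1, 3))


fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 5, 6],
               [1, 2, 7, 8]])
pooled = max_pool_2x2(fm)
print(pooled)
# [[4 2]
#  [2 8]]
```

Each output value is the largest activation in its 2x2 region, so the 4x4 map shrinks to 2x2 while the strongest responses are retained.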
Fully-connected
Neurons in fully connected layers are connected to every neuron in the previous layer and every neuron in the next layer. Each connection has an associated weight, and each neuron has an associated bias. The last two layers in our network are both fully connected layers. The first fully connected layer detects the presence of the higher-level features found in the second convolutional layer, using the ReLU activation function. The second fully connected layer is the softmax layer, which uses the softmax activation function to turn its inputs into a probability distribution over the output classes.
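The two fully connected layers can be sketched as a pair of matrix-vector products, the first followed by ReLU and the second by softmax. The layer sizes (640 inputs, 100 hidden neurons, 10 outputs), the random weights, and the names `dense`, `relu`, and `softmax` are hypothetical and chosen only for illustration; the sketch also assumes one weight per connection and one bias per neuron.

```python
import numpy as np


def relu(x):
    """Rectified linear unit: max(0, x) element-wise."""
    return np.maximum(x, 0.0)


def softmax(z):
    """Convert a vector of scores into probabilities that sum to 1."""
    shifted = z - np.max(z)      # subtracting the max avoids overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp)


def dense(x, weights, biases):
    """Fully connected layer: every output neuron sees every input value."""
    return weights @ x + biases


rng = np.random.default_rng(0)
x = rng.random(640)                                  # flattened pooled feature maps (assumed size)
w1 = rng.normal(scale=0.05, size=(100, 640))         # hypothetical hidden layer of 100 neurons
b1 = np.zeros(100)
w2 = rng.normal(scale=0.05, size=(10, 100))          # hypothetical 10 output classes
b2 = np.zeros(10)

hidden = relu(dense(x, w1, b1))
probs = softmax(dense(hidden, w2, b2))
print(probs.shape, probs.sum())
```

The index of the largest entry in `probs` would be taken as the predicted class.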