After tinkering with hyper-parameters, various cost functions, and different activation functions, we built a convolutional neural network ourselves using Python's Theano library and trained it to recognize handwritten digits from 0-9.
Overview
We built our network to detect handwritten digits between 0 and 9, comprised of 6 total layers (seen below). We trained and tested our network using the MNIST data set of handwritten digits (50,000 training images, 10,000 test images). During the training, the weights and biases of each neuron in each layer were adjusted to minimize the cost function we used in the stochastic gradient descent algorithm. The first layer in our network is a convolutional layer that takes a 28x28 image and convolves it with 20 different 5x5 filters creating 20 24x24 feature maps. The second layer – a pooling layer – samples the 24x24 feature maps down to 12x12 feature maps. The third layer is another convolutional layer that convolves 40 different 4x4 filters, tuned to detect higher- level features, into 40 8x8 feature maps. The fourth layer pools the output of the second convolutional layer into 4x4 maps. The fifth layer is a fully connected layer of 100 neurons that detects the presence of features in the 4x4 feature maps. Finally the 6th layer is a softmax layer that maps combinations of features from the 5th layer into the output, a length-10 vector where the index of the largest entry was the digit our network guessed.
We also designed an applet that allows a user to visualize the flow of data in our network. It lets the user see the effect of our network on an input image (either an image in the MNIST dataset or an image drawn on the screen) and see the output of each layer of our network in real-time. We also designed the GUI to display the trained filters of each convolutional layer and the outputs of each neuron in every layer. This allowed us to gain a visual understanding of how the filters in our network were trained, which characteristics the filters were trained to detect, and how the filters used convolution to recognize the different features of the digits. Additionally it allowed us to test and prove just how accurate our network was, as well as easily identify and fix any cases where our network did not successfully classify digits.