Conclusion and future steps
This module outlines the conclusions we drew about CNNs and the future directions this project could take.
Given these results, we explored some additional manipulations of the kernels produced by the convolutional layers. One such manipulation was to build a library of useful-looking filters generated by several separately trained networks and to inject those filters into the first convolutional layer of an untrained network before training it. We froze the initial layer, so that the injected weights and biases remained fixed, and then trained the rest of the network. The result was a substantial speedup in learning: the network reached 95.43% accuracy after a single epoch, compared with the 80.21% and 60.32% obtained by the networks that originally generated the injected kernels. Unfreezing the layer and continuing training did not yield anything useful; accuracy did not increase substantially with further training.

The primary conclusion is that initializing the weights and biases of the convolutional layers to values previously learned by a separate network can substantially reduce the time it takes a network to learn. A natural extension would be to skip the previously trained networks entirely and instead initialize the first layer with prototypical image-processing filters, such as edge detectors. If convolutional networks can be seeded with useful filters from the start, they may learn substantially faster, or perhaps reach higher accuracy.
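As a rough illustration of the kernel-injection experiment, the sketch below shows one way to copy previously learned first-layer kernels into a fresh network and freeze that layer so only the later layers are trained. It is written in PyTorch for concreteness; the framework, architecture, layer sizes, and function names are assumptions for illustration, not the exact setup used in the project.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    """Hypothetical small CNN; the layer sizes are illustrative only."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, kernel_size=5)   # first convolutional layer
        self.conv2 = nn.Conv2d(20, 40, kernel_size=5)
        self.fc = nn.Linear(40 * 4 * 4, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        return self.fc(x.flatten(1))

def inject_and_freeze(fresh_net, donor_net):
    """Copy the first-layer kernels learned by a previously trained (donor)
    network into an untrained network, then freeze them so training only
    updates the later layers."""
    with torch.no_grad():
        fresh_net.conv1.weight.copy_(donor_net.conv1.weight)
        fresh_net.conv1.bias.copy_(donor_net.conv1.bias)
    for p in fresh_net.conv1.parameters():
        p.requires_grad = False
    return fresh_net

# Usage sketch: optimize only the parameters that remain trainable.
donor = SmallCNN()   # assume this network was trained earlier and produced useful kernels
student = inject_and_freeze(SmallCNN(), donor)
optimizer = torch.optim.SGD(
    (p for p in student.parameters() if p.requires_grad), lr=0.1)

Unfreezing afterwards would amount to setting requires_grad back to True on the first layer's parameters and rebuilding the optimizer, which in our experiments did not improve accuracy further.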
Acknowledgements
We thank our mentor, Mayank Kumar, PhD student at Rice University.
Nielsen, Michael. Neural Networks and Deep Learning. http://neuralnetworksanddeeplearning.com/index.html