Results
We successfully located the license plate in about 70% of our sample pictures, and we recognized the extracted characters with about 72.7% accuracy (using 619 sets of training data).
Future improvements
Compression and cropping
Since we were targeting Texas plates, which contain only red, blue, and white, we were able to black out large parts of each image by wiping out all green regions. In the future, however, we would like to recognize plates from other states that may contain green. We should therefore find a criterion other than color for locating the plate.
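As a concrete illustration of the current color-based step, below is a minimal sketch of how green regions might be blacked out. The channel-difference test and the `margin` tolerance are illustrative assumptions, not the exact rule used in the project.

```python
import numpy as np

def black_out_green(rgb, margin=20):
    """Zero out pixels whose green channel clearly dominates red and blue.

    rgb    : H x W x 3 uint8 array (a candidate region of the photo)
    margin : hypothetical tolerance; a pixel counts as "green" when its
             G value exceeds both R and B by at least this amount.
    """
    r = rgb[..., 0].astype(np.int16)
    g = rgb[..., 1].astype(np.int16)
    b = rgb[..., 2].astype(np.int16)

    # Texas plates are red/blue/white, so strongly green pixels are
    # assumed to be background (grass, foliage) rather than plate.
    green_mask = (g - r > margin) & (g - b > margin)

    out = rgb.copy()
    out[green_mask] = 0
    return out
```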
Letter recognition
Acquiring state pattern and convention attributes
In many license plates, it is difficult to tell the difference between a zero and an O, even for a human. For the purposes of this project, therefore, zeros and Os were treated as the same character. In many states, however, it is actually possible to tell them apart because the license plate follows a set pattern (e.g., two letters, two numbers, two letters). In the future, we could identify which state the plate comes from and use this knowledge to improve the accuracy of letter recognition.
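A minimal sketch of how such a pattern could be applied as a post-processing step. The 'LLNNLL' pattern string and the 0/O substitution table are illustrative assumptions; a real system would need the actual convention for each state.

```python
# Ambiguous characters and their counterparts in the other class.
# Only the 0/O pair discussed above is listed; other confusable pairs
# could be added the same way.
TO_LETTER = {"0": "O"}
TO_DIGIT = {"O": "0"}

def apply_state_pattern(chars, pattern):
    """Coerce recognized characters to match a state's plate pattern.

    chars   : list of recognized characters, e.g. ['X', '0', '4', '7', 'A', 'B']
    pattern : string of 'L' (letter) and 'N' (number) slots, e.g. 'LLNNLL'
              for a hypothetical "2 letters, 2 numbers, 2 letters" convention.
    """
    fixed = []
    for ch, slot in zip(chars, pattern):
        if slot == "L":
            fixed.append(TO_LETTER.get(ch, ch))   # e.g. '0' in a letter slot -> 'O'
        elif slot == "N":
            fixed.append(TO_DIGIT.get(ch, ch))    # e.g. 'O' in a number slot -> '0'
        else:
            fixed.append(ch)
    return "".join(fixed)
```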
Multi-class support vector machine
In addition, one characteristic of the SVM is that it solves a two-class problem. To get around this, we used a one-against-the-rest approach: the SVM answers the question "Is this a __?" 35 times (once for each of A-Z and 1-9) for every unknown letter or digit. In the future, we would like to look into more efficient and accurate methods. One possible improvement can be found in the work of T.-K. Huang, R. C. Weng, and C.-J. Lin, "Generalized Bradley-Terry Models and Multi-class Probability Estimates," Journal of Machine Learning Research.
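A minimal sketch of the one-against-the-rest scheme described above, written with scikit-learn's SVC (which wraps LIBSVM) rather than the LIBSVM interface used in the project; the RBF kernel and the feature representation are assumptions.

```python
import numpy as np
from sklearn.svm import SVC  # scikit-learn's SVC wraps LIBSVM

# The 35 classes used in the project: A-Z plus 1-9 (0 and O merged).
CLASSES = [chr(c) for c in range(ord("A"), ord("Z") + 1)] + list("123456789")

def train_one_vs_rest(X, y):
    """Train one binary SVM per class, each answering "Is this a __?".

    X : n_samples x n_features array of character feature vectors
    y : array of ground-truth labels drawn from CLASSES
    """
    y = np.asarray(y)
    models = {}
    for cls in CLASSES:
        clf = SVC(kernel="rbf")             # kernel choice is an assumption
        clf.fit(X, (y == cls).astype(int))  # positive = this class, 0 = the rest
        models[cls] = clf
    return models

def classify(models, x):
    """Pick the class whose binary SVM is most confident about x."""
    x = np.asarray(x).reshape(1, -1)
    scores = {cls: clf.decision_function(x)[0] for cls, clf in models.items()}
    return max(scores, key=scores.get)
```

Roughly speaking, the Bradley-Terry approach cited above combines the binary classifiers' outputs into multi-class probability estimates rather than simply taking the largest decision value, as this sketch does.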
Automated training set generation and extraction efficiency
Finally, any segment that we currently feed into the SVM registers as some character; we have no way of telling whether the image is actually a letter or digit at all. In the future, we would like to train the machine to tell characters from non-characters. This would allow less rigorous (and time-consuming) computation in the image-processing stage and give our algorithm greater flexibility.
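One way such a character/non-character gate might look, again as a hedged sketch: a separate binary SVM trained on genuine character segments versus false extractions, run before the 35 one-against-the-rest classifiers. The negative training set and the kernel choice are assumptions.

```python
import numpy as np
from sklearn.svm import SVC  # again wrapping LIBSVM

def train_character_detector(char_features, nonchar_features):
    """Binary SVM answering "Is this region a character at all?".

    char_features    : n x d array of features from genuine letters/digits
    nonchar_features : m x d array of features from bolts, plate artwork,
                       smudges, and other false extractions (a hypothetical
                       negative set that would have to be collected)
    """
    X = np.vstack([char_features, nonchar_features])
    y = np.concatenate([np.ones(len(char_features)),
                        np.zeros(len(nonchar_features))])
    clf = SVC(kernel="rbf")  # kernel choice is an assumption
    clf.fit(X, y)
    return clf

# Usage: run this gate before the 35 "Is this a __?" classifiers, so that
# non-character regions are dropped instead of being forced into a class.
```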
References
Chih-Chung Chang and Chih-Jen Lin, LIBSVM: A Library for Support Vector Machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
Special thanks
Thanks to:
- Dr. Aswin Sankaranarayanan, our mentor
- Dr. Richard Baraniuk, the ELEC 301 instructor
- Dr. Fatih Porikli (MERL), for providing us with a license plate dataset
- Drew Bryant and Brian Bue for technical advising