The OpenCV (Open Source Computer Vision) library provided the implementations for many of the methods explained in this module. Detailed explanations of the functionality are available on the OpenCV website.
Why use color isolation?
Many computer vision methods exist for feature detection (such as detecting faces or characters in an image). We wanted our virtual theremin to be able to track multiple types of objects, and feature detection does not allow for this flexibility. We also needed our object recognition to be fast enough to drive real-time audio. Feature detection algorithms work, but they tend to be comparatively slow. To address these concerns, we opted to track objects by color. When the user launches our program, they must select the object they wish to track, and the program stores the color of this object. Every subsequent frame retrieved from the webcam is scanned for all pixels within range of that color, which allows the object to be isolated after a few steps.
Color spaces
RGB
Cameras operate in the RGB color space. In the RGB color space, images store three channels of information: red, green, and blue. Each channel stores 8 bits of data per pixel. RGB images use additive color mixing: zeros in all channels represent black and 255 in each channel represents white. Although this color space is easy to use, highlight and shadow affect all three RGB values. Tracking a constant color by its RGB values is therefore not a realistic option.
HSV
Our virtual theremin first converts RGB to HSV. HSV is an alternative color space that uses three different channels of information: Hue, Saturation, and Value. The HSV color space represents colors according to the figure below. Unlike in RGB, highlight and shadow mainly change the saturation and value channels and leave the hue largely unchanged. Therefore, it is possible to track an object in variable lighting conditions using HSV. The conversion to HSV from RGB is as follows:
If the above formula results in H less than 0, 360 is added to it so that the hue falls between 0 and 360 degrees.
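The conversion can be sketched in a few lines of plain Python. This is only an illustration, assuming the conversion meant here is the one documented for OpenCV's cvtColor on floating-point images (V is the largest channel, S is the spread relative to V, and H is an angle in degrees chosen by which channel is largest). The function name rgb_to_hsv and the choice to report a hue of 0 for gray pixels are assumptions made for this example.

```python
def rgb_to_hsv(r, g, b):
    """Convert 8-bit R, G, B values to (H in degrees, S in [0, 1], V in [0, 1])."""
    # Scale the 8-bit channels to the range [0, 1].
    r, g, b = r / 255.0, g / 255.0, b / 255.0

    v = max(r, g, b)              # value: the brightest channel
    delta = v - min(r, g, b)      # spread between brightest and darkest channel
    s = delta / v if v != 0 else 0.0

    if delta == 0:
        h = 0.0                   # gray pixel: hue is undefined, 0 by assumption
    elif v == r:
        h = 60.0 * (g - b) / delta
    elif v == g:
        h = 120.0 + 60.0 * (b - r) / delta
    else:
        h = 240.0 + 60.0 * (r - g) / delta

    if h < 0:
        h += 360.0                # wrap negative hues into [0, 360)
    return h, s, v


# The same orange surface, once brightly lit and once in shadow.
bright = rgb_to_hsv(200, 100, 0)
shaded = rgb_to_hsv(100, 50, 0)
```

In this example the two pixels differ in every RGB channel, yet both map to the same hue and differ only in value, which is the property the tracker relies on.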
Object detection
Gaussian blur
Before any detection is attempted, a slight blur is applied to the image using a Gaussian kernel. This blur smooths out noise in the initial image. The blurred image is then converted to the HSV color space.
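To make the blur step concrete, here is a minimal pure-Python sketch of a Gaussian blur on a single-channel image stored as a list of rows. It is not the module's implementation, which presumably relies on OpenCV's GaussianBlur function. The kernel size, the sigma value, the edge handling (edge pixels are repeated), and all function names are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a normalized 1-D Gaussian kernel with `size` taps (size must be odd)."""
    half = size // 2
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, repeating edge pixels at the borders."""
    half = len(kernel) // 2
    result = []
    for row in image:
        width = len(row)
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                # Clamp the sample position so it stays inside the row.
                xx = min(max(x + k - half, 0), width - 1)
                acc += weight * row[xx]
            new_row.append(acc)
        result.append(new_row)
    return result


def transpose(image):
    """Swap rows and columns."""
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, size=5, sigma=1.0):
    """Blur a 2-D image by filtering rows, then columns (the kernel is separable)."""
    kernel = gaussian_kernel(size, sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


# A single bright pixel of noise on a dark background.
noisy = [[0.0] * 5 for _ in range(5)]
noisy[2][2] = 100.0
smoothed = gaussian_blur(noisy, size=3, sigma=1.0)
```

Filtering rows and then columns gives the same result as a full two-dimensional Gaussian kernel while doing less work per pixel. After the blur, the isolated spike is spread over its neighbors and its peak is reduced, so it is less likely to survive the later threshold.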
Image thresholding
The user initially selects a point in the image. The HSV value at this point is used as the center color of the threshold range. Using the margins set by the sliders, the HSV image is thresholded to the specified range around the selected color, producing a binary image.
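The thresholding step can be sketched as follows. This is a simplified illustration rather than the module's code, which presumably uses OpenCV's inRange function for the same purpose. The function name threshold_hsv, the per-channel margin tuple, and the sample values are assumptions. The sketch also ignores the fact that hue is circular, so a selected color near the ends of the hue range would need extra handling.

```python
def threshold_hsv(hsv_image, center, margins):
    """Return a binary mask: 255 where a pixel lies within center +/- margins, else 0.

    hsv_image is a list of rows, each row a list of (h, s, v) tuples.
    center is the (h, s, v) value at the point the user selected.
    margins holds one allowed deviation per channel (hypothetical slider values).
    """
    lower = [c - m for c, m in zip(center, margins)]
    upper = [c + m for c, m in zip(center, margins)]

    mask = []
    for row in hsv_image:
        mask_row = []
        for pixel in row:
            # A pixel passes only if every channel is inside its range.
            inside = all(lo <= channel <= hi
                         for channel, lo, hi in zip(pixel, lower, upper))
            mask_row.append(255 if inside else 0)
        mask.append(mask_row)
    return mask


# Hypothetical 2x2 HSV image; the selected color is the top-left pixel.
image = [
    [(30, 200, 200), (90, 200, 200)],
    [(35, 120, 250), (30, 10, 200)],
]
mask = threshold_hsv(image, center=(30, 200, 200), margins=(10, 100, 100))
```

Here the top-right pixel is rejected because its hue is too far from the selected color, and the bottom-right pixel is rejected because its saturation is too low, even though its hue matches.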
Object isolation
Depending on the environment, the thresholded image may contain noise in addition to the object. The example photo was taken against a solid background of a different color, so no noise is visible in this image. However, in order to determine which portion of the thresholded image is our desired object, we assume that noise is present.
We first find all of the contours from the edges in the thresholded image. For each found contour, we compute the area enclosed. We assume that the contour enclosing the largest area is the desired object. Once this specific contour has been determined, we fit an ellipse to the contour. From this ellipse, we can determine the angle of the major axis and the position of the center. The isolated image, shown below, has this ellipse drawn in red around the contour shaded in blue.
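The idea of keeping only the largest region and reading off its center and orientation can be sketched in pure Python. Note that this example swaps in simpler techniques than the ones described above: it groups pixels into connected regions instead of tracing contours, and it estimates the center and major-axis angle from second-order image moments instead of fitting an ellipse. In OpenCV, the steps described above would presumably map to findContours, contourArea, and fitEllipse. All function names below are hypothetical.

```python
import math
from collections import deque


def connected_regions(mask):
    """Group nonzero pixels into 4-connected regions, each a list of (x, y)."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y in range(height):
        for x in range(width):
            if mask[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first flood fill starting from this unvisited pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            region = []
            while queue:
                cx, cy = queue.popleft()
                region.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] != 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(region)
    return regions


def center_and_angle(region):
    """Return ((cx, cy), angle in degrees) of a region from its central moments."""
    n = len(region)
    cx = sum(x for x, _ in region) / n
    cy = sum(y for _, y in region) / n
    mu20 = sum((x - cx) ** 2 for x, _ in region) / n
    mu02 = sum((y - cy) ** 2 for _, y in region) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in region) / n
    # Orientation of the major axis, measured from the x axis in image coordinates.
    angle = 0.5 * math.degrees(math.atan2(2.0 * mu11, mu20 - mu02))
    return (cx, cy), angle


def locate_object(mask):
    """Assume the largest region is the object; return its center and angle."""
    regions = connected_regions(mask)
    if not regions:
        return None
    largest = max(regions, key=len)
    return center_and_angle(largest)


# A horizontal bar (the object) plus one stray noise pixel in the corner.
mask = [[0] * 8 for _ in range(5)]
for x in range(1, 6):
    mask[1][x] = 255
mask[4][7] = 255
center, angle = locate_object(mask)
```

Choosing the region with the most pixels plays the same role as choosing the contour that encloses the largest area: small patches of noise that pass the threshold are ignored as long as the tracked object is the largest matching area in the frame.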