In the beginning, we made the template image and the video separately. In our experiment we used the large green breadboard as the object to be tracked. We first took a picture of the breadboard lying on the table with a high-resolution camera. Then we switched the camera to video mode, which recorded frames of only 480×640, and filmed someone moving the board around. When we ran the code, the tracking result was disappointing.
Then we realized that lots of details in the template image could not even be seen on the object in the video. For example, the screws and pinholes were not at all visible. Hence, we decided to crop out the template from a screenshot of the video so as to ensure the same level of detail. And it worked!
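The fix amounts to cutting the template directly out of a frame of the same video, so that template and frames share resolution and detail. A minimal sketch (the frame contents, crop coordinates, and variable names are all illustrative; in practice the frame would come from the recording, e.g. via OpenCV's `VideoCapture`):

```python
import numpy as np

# Stand-in for one 480x640 video frame. In practice:
#   cap = cv2.VideoCapture("dance.mov"); ok, frame = cap.read()
frame = np.zeros((480, 640, 3), dtype=np.uint8)
frame[100:300, 200:500] = (0, 180, 0)  # pretend the green breadboard sits here

# Crop the template out of the frame itself, so its level of detail
# exactly matches what the tracker will see in every other frame.
y0, y1, x0, x1 = 100, 300, 200, 500
template = frame[y0:y1, x0:x1]
print(template.shape)  # (200, 300, 3)
```

Because the template comes from the same sensor, optics, and resolution as the frames being matched, invisible details (screws, pinholes) simply never enter the template in the first place.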
Another problem with matching video frames is motion blur: when the shutter speed is relatively slow, each frame is smeared and the features of the object are no longer distinguishable. To avoid this, we had to slow down our movements when recording the video; otherwise, the tracked position would be completely off.
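The effect of motion blur on feature strength can be illustrated with a one-dimensional sketch: averaging an intensity edge over a k-pixel motion window flattens exactly the gradient that a corner or blob detector relies on. The signal and window size below are made up for illustration:

```python
import numpy as np

def box_blur(signal, k):
    """Average each sample over a k-sample window, mimicking a camera
    that moved roughly k pixels during the exposure."""
    kernel = np.ones(k) / k
    return np.convolve(signal, kernel, mode="same")

# A sharp intensity step: the kind of edge feature detectors key on.
edge = np.concatenate([np.zeros(50), np.ones(50)])

sharp_grad = np.abs(np.diff(edge)).max()                 # 1.0
blurred_grad = np.abs(np.diff(box_blur(edge, 9))).max()  # ~1/9

print(sharp_grad, round(blurred_grad, 3))
```

A nine-pixel blur cuts the peak gradient by a factor of nine, which is why slowing the motion (or raising the shutter speed) restores the features.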
However, the overall resolution and quality of the video matter, too. We converted one of our video files with an online tool because one member's MATLAB could not read .mov files, but after the conversion the quality of the tracking result dropped dramatically. We believe this was partly due to the even lower resolution of the converted file: when the resolution is too low, a sharp corner can no longer be distinguished from a round one. The compression method used by that website may also have been at fault, because we could clearly see non-uniformity in the previously uniform white board area. The compression thus introduced additional, spurious features into the video frames.
Since the resolution of our recording devices was not very good, a complex background could also be problematic, because there would be plenty of similar features in the background, but there would not be enough details to fully distinguish them. As a result, we had to use a simple background, such as a wall, and had to avoid wearing shirts full of complex patterns.
An object that doesn't have distinct SURF features is extremely problematic. We experimented with a blue pencil box. When the pencil box, lying on the table by itself, was used as the template image, OpenSurf would not output a single feature descriptor above the threshold. Not until we lowered the threshold all the way down to 0.00001 (it was set to 0.0008 throughout our project) did we find six features (normally more than a dozen for a small template and more than a hundred for a full video frame), and even those six features were degenerate!
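The thresholding step can be mimicked with a toy sketch (the response values below are invented; OpenSurf's actual scores come from the determinant of the Hessian at each candidate point):

```python
# Hypothetical Hessian responses for candidate interest points on a
# textureless object such as the pencil box: all of them are weak.
responses = [0.00007, 0.00003, 0.00005, 0.00002, 0.00006, 0.00004]

project_threshold = 0.0008    # the value used throughout our project
lowered_threshold = 0.00001   # the value needed to get any features at all

kept = [r for r in responses if r > project_threshold]
kept_low = [r for r in responses if r > lowered_threshold]

print(len(kept), len(kept_low))  # 0 6
```

At the project threshold nothing survives; only by dropping the threshold by nearly two orders of magnitude do the six weak (degenerate) responses pass, which is exactly the behaviour we observed.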
If we cropped the template so that some background was included, then only the outline or the features in the background would be detected, and looking for matching features in the test image would not make any sense. As shown below, the matching failed.
In order to avoid all of the problems that occurred during our experiments, we decided that for the demo video we would only have one dancer, so as to get a closer shot and a higher resolution. The person would wear plain clothing, act in front of a wall, and move slowly and avoid out-of-plane rotation in order to increase the accuracy of object tracking.
However, because the feature matching is not very robust, and even one or two mismatches introduce huge errors into the estimated orientation of the object, the output angular velocity is extremely noisy. That is why we finally decided to discard the angular velocity data.
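Why a single mismatch wrecks the orientation estimate can be seen with a small numerical sketch (the point coordinates and the 10° rotation are made up): three correct matches agree on the rotation angle, while one mismatch drags the mean far off; a robust statistic such as the median survives it:

```python
import numpy as np

# Matched keypoint pairs, template -> frame, relative to the object centre.
theta = np.deg2rad(10.0)                      # true in-plane rotation
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
src = np.array([[30.0, 0.0], [0.0, 40.0], [-25.0, 10.0], [15.0, -20.0]])
dst = src @ R.T                               # correctly rotated positions
dst[3] = [-50.0, 35.0]                        # one mismatched feature

# Per-pair rotation estimate: angle between corresponding vectors.
angles = np.degrees(np.arctan2(dst[:, 1], dst[:, 0])
                    - np.arctan2(src[:, 1], src[:, 0]))

# Three estimates sit at 10 degrees; the mismatch is wildly off,
# so the mean is pulled far from 10 while the median stays put.
print(np.round(angles, 1))
print(round(np.mean(angles), 1), round(np.median(angles), 1))
```

With only a dozen matches per frame, one or two such outliers per frame are enough to make a mean-based angle (and hence its time derivative, the angular velocity) useless, which matches what we saw.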