To make the best use of the Laugh Track Assassinator's algorithm, we need to run it in real time on as wide a range of source material as possible. To accomplish this lofty goal, we implemented a DirectShow filter. DirectShow is Microsoft's technology for manipulating media on the Windows platform. Nearly all media players, such as Windows Media Player, Media Player Classic, and various DVD programs, use DirectShow to render video and audio. Packaging our algorithm as a DirectShow filter lets it process nearly any type of media, be it a DVD, an encoded movie, or a live TV video stream.
All DirectShow operations are based on filters. A filter describes the translation of data from one source or type to another. DirectShow automatically finds the filters needed to play a particular media file and connects them into a filter graph. The generated graph can be visualized in Microsoft's GraphEdit program. Here is what the generated graph looks like for a source video file with the Laugh Track Assassinator filter inserted:
DirectShow has inserted an AVI Splitter to separate the file data into an audio stream and a video stream. The video stream is sent to the ffdshow Video Decoder filter, whose output is sent to the Video Renderer. The audio stream passes through the MP3 Decoder, the AC3Filter, and the Laugh Track Assassinator, and is finally rendered to the speakers through the DirectSound filter.
To create the DirectShow-compatible filter, we used Microsoft's Windows SDK and rewrote its audio transform filter example. (The Windows SDK can be downloaded from Microsoft.) We then coded the two main steps in our algorithm: a low pass filter and a threshold detection scheme.
To balance frequency resolution against speed, we chose a 1000-point finite impulse response (FIR) low pass filter. We had MATLAB generate the one thousand filter weights and then converted them into a C++ format suitable for DirectShow. Since each low pass filtered output sample depends on the 1000 most recent input samples, we created a 1000-point circular buffer that holds the last 1000 samples of the input at any given time.
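The circular-buffer arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual filter code: the class name `FirLowPass`, the `process` method, and the use of `float` samples are all assumptions made for the example, and the weights would come from MATLAB rather than being written by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (names are illustrative, not taken from the project).
// Direct-form FIR filter that keeps the most recent N input samples in a
// circular buffer, where N is the number of filter weights (1000 in the text).
class FirLowPass {
public:
    explicit FirLowPass(std::vector<float> weights)
        : weights_(std::move(weights)), history_(weights_.size(), 0.0f), head_(0) {
        assert(!weights_.empty());
    }

    // Push one input sample and return one filtered output sample.
    float process(float input) {
        history_[head_] = input;  // overwrite the oldest sample with the newest

        // y[n] = sum over k of w[k] * x[n - k], walking backwards through history.
        float acc = 0.0f;
        std::size_t idx = head_;
        for (std::size_t k = 0; k < weights_.size(); ++k) {
            acc += weights_[k] * history_[idx];
            idx = (idx == 0) ? history_.size() - 1 : idx - 1;
        }

        head_ = (head_ + 1) % history_.size();  // advance the write position
        return acc;
    }

private:
    std::vector<float> weights_;  // filter weights w[0..N-1]
    std::vector<float> history_;  // last N input samples (circular)
    std::size_t head_;            // index where the next sample is written
};
```

Each call costs one multiply-add per weight, so a 1000-point filter performs 1000 multiply-adds per audio sample. The circular buffer avoids shifting the whole history on every sample; only the write index moves.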
The final step in our removal algorithm is threshold detection in both amplitude (vertical) and time (horizontal). The time-based threshold means we have to delay the input signal by at least the width of the horizontal threshold, because a segment cannot be classified until it has lasted that long. In the end we decided on a 1 second delay, which accommodates the width threshold of 0.8 seconds and makes it easier to resynchronize the video signal with the audio afterward.
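To show why the delay must be at least as long as the width threshold, here is a rough sketch of a delay line combined with a run-length detector. It rests on several assumptions that the text does not confirm: that a detected segment is simply muted, that the detector input is the low pass filtered signal, and that lengths are given in samples (at an assumed 44.1 kHz sample rate, 1 second would be 44100 samples and 0.8 seconds would be 35280). The class name `DelayedGate` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch (not the project's code): delays audio by `delay`
// samples and mutes any stretch where the detector signal stays above
// `level` for at least `width` consecutive samples. Muting is an assumption.
class DelayedGate {
public:
    DelayedGate(std::size_t delay, std::size_t width, float level)
        : buf_(delay, 0.0f), width_(width), level_(level), pos_(0), run_(0) {
        // The run must still be inside the buffer when it is recognized.
        assert(width > 0 && delay >= width);
    }

    // Push one audio sample plus its detector value; returns the audio
    // sample from `delay` calls ago (possibly muted).
    float process(float audio, float detector) {
        float out = buf_[pos_];  // oldest sample leaves the delay line
        buf_[pos_] = audio;      // newest sample takes its place

        if (std::fabs(detector) > level_) {
            ++run_;
            if (run_ == width_) {
                // Time threshold just reached: mute the whole run so far,
                // which is still waiting in the delay line.
                std::size_t idx = pos_;
                for (std::size_t k = 0; k < width_; ++k) {
                    buf_[idx] = 0.0f;
                    idx = (idx == 0) ? buf_.size() - 1 : idx - 1;
                }
            } else if (run_ > width_) {
                buf_[pos_] = 0.0f;  // run continues: mute as samples arrive
            }
        } else {
            run_ = 0;  // amplitude dropped below threshold: run ends
        }

        pos_ = (pos_ + 1) % buf_.size();
        return out;
    }

private:
    std::vector<float> buf_;  // circular delay line
    std::size_t width_;       // time threshold, in samples
    float level_;             // amplitude threshold
    std::size_t pos_;         // current read/write position
    std::size_t run_;         // consecutive samples above `level_`
};
```

Because the decision about a segment is only made once it has lasted `width` samples, the earliest samples of that segment must not yet have been played. That is what the delay line guarantees, and it is why a delay shorter than the width threshold would let the start of each detected segment slip through.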