After the song data is imported, the signal is resampled to 8000 samples per second in order to reduce the number of columns in the spectrogram. This speeds up later computations while still leaving enough resolution in the data for accurate results.
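The resampling step can be sketched in Python with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `resample_to`, the 101-tap Hamming-windowed low-pass filter, and the use of linear interpolation are all assumptions, as is the 44.1 kHz source rate used in the usage note (the text does not state the original rate).

```python
import numpy as np

def resample_to(x, fs_in, fs_out=8000, num_taps=101):
    """Low-pass filter x below the new Nyquist frequency, then
    interpolate it onto the new sample grid (a simple sketch)."""
    x = np.asarray(x, dtype=float)
    if fs_out < fs_in:
        # Windowed-sinc low-pass FIR, cutoff at the new Nyquist frequency.
        n = np.arange(num_taps) - (num_taps - 1) / 2
        fc = 0.5 * fs_out / fs_in            # cutoff in cycles per input sample
        taps = 2 * fc * np.sinc(2 * fc * n) * np.hamming(num_taps)
        taps /= taps.sum()                   # unity gain at DC
        x = np.convolve(x, taps, mode="same")  # symmetric taps: no net delay
    n_out = int(round(len(x) * fs_out / fs_in))
    # Output sample times, measured in units of input samples.
    t_out = np.arange(n_out) * (fs_in / fs_out)
    return np.interp(t_out, np.arange(len(x)), x)
```

Under these assumptions, a 10-second clip recorded at 44.1 kHz shrinks from 441000 samples to 80000, so a spectrogram computed with the same hop size has roughly 5.5 times fewer columns.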
The data is then high-pass filtered using a 30th-order filter with a cutoff frequency around 2 kHz (half the bandwidth of the resampled signal). Filtering is used because the higher frequencies in a song are more distinctive of that individual song. The bass, however, tends to overshadow these frequencies, so the filter is used to make the fingerprint include more high-frequency points. Testing has shown that the algorithm has a much easier time distinguishing songs after they have been high-pass filtered.
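One way to build such a filter is sketched below in Python with NumPy. The text does not say what filter type was used, so the FIR design here (a Hamming-windowed sinc low-pass turned into a high-pass by spectral inversion) is an assumption, and the names `highpass_fir` and `gain_at` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def highpass_fir(order=30, cutoff_hz=2000.0, fs=8000.0):
    """Design a linear-phase high-pass FIR filter (order must be even)."""
    num_taps = order + 1                     # an order-N FIR filter has N+1 taps
    n = np.arange(num_taps) - order / 2
    fc = cutoff_hz / fs                      # cutoff in cycles per sample
    lowpass = 2 * fc * np.sinc(2 * fc * n) * np.hamming(num_taps)
    lowpass /= lowpass.sum()                 # unity gain at DC
    highpass = -lowpass
    highpass[order // 2] += 1.0              # unit impulse minus low-pass
    return highpass

def gain_at(taps, freq_hz, fs=8000.0):
    """Magnitude of the filter's frequency response at freq_hz."""
    n = np.arange(len(taps))
    return abs(np.sum(taps * np.exp(-2j * np.pi * freq_hz * n / fs)))

taps = highpass_fir()
# Apply to a signal x sampled at 8000 Hz with:
#   filtered = np.convolve(x, taps, mode="same")
```

With this design, `gain_at(taps, 200)` is close to zero (bass is removed) while `gain_at(taps, 3000)` is close to one (high frequencies pass through).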
The spectrogram of the signal is then taken in order to view the frequencies present in each time slice. The spectrogram below is from a 10-second noisy recording.
Each vertical time slice of the spectrogram is then analyzed for prominent local maxima, as described in the next section.
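A spectrogram of this kind can be sketched as a short-time Fourier transform in Python with NumPy. The frame length (256 samples), hop size (128 samples), and Hann window below are assumptions chosen for illustration, since the text does not give the values actually used.

```python
import numpy as np

def spectrogram(x, fs=8000, frame_len=256, hop=128):
    """Magnitude spectrogram: one column per time slice, one row per
    frequency bin. Assumes len(x) >= frame_len."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Cut the signal into overlapping, windowed frames (one per column).
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    mags = np.abs(np.fft.rfft(frames, axis=0))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)            # Hz per row
    times = (np.arange(n_frames) * hop + frame_len / 2) / fs  # s per column
    return freqs, times, mags
```

With these settings, a 10-second signal at 8000 samples per second yields 129 frequency bins spaced 31.25 Hz apart and 624 time slices.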
In the first time slice, the five greatest local maxima are stored as points in the fingerprint. Then a threshold is created by convolving these five maxima with a Gaussian curve, creating a different value for the threshold at each frequency. An example threshold is shown in the figure below. The threshold is used to spread out the data stored in the fingerprint, since peaks that are close in time and frequency are stored as one point.
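The first-slice step described above can be sketched in Python with NumPy. The Gaussian width (`sigma_bins`), the truncation of the kernel at four standard deviations, and the function names are assumptions made for illustration; the text does not specify them.

```python
import numpy as np

def local_maxima(slice_mag):
    """Indices of bins strictly greater than both of their neighbours."""
    s = np.asarray(slice_mag, dtype=float)
    interior = (s[1:-1] > s[:-2]) & (s[1:-1] > s[2:])
    return np.flatnonzero(interior) + 1

def initial_threshold(slice_mag, n_peaks=5, sigma_bins=4.0):
    """Pick the n_peaks greatest local maxima of the first slice and
    build a per-frequency threshold from them.
    Assumes the slice is longer than the Gaussian kernel."""
    s = np.asarray(slice_mag, dtype=float)
    peaks = local_maxima(s)
    top = peaks[np.argsort(s[peaks])[::-1][:n_peaks]]  # strongest first
    # Impulses at the chosen peaks, scaled by the peak amplitudes.
    impulses = np.zeros_like(s)
    impulses[top] = s[top]
    # Gaussian kernel with peak value 1, truncated at 4 sigma.
    half = int(np.ceil(4 * sigma_bins))
    k = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (k / sigma_bins) ** 2)
    threshold = np.convolve(impulses, kernel, mode="same")
    return np.sort(top), threshold
```

Because the kernel peaks at 1, the threshold at each chosen peak is at least that peak's amplitude and falls off smoothly at neighbouring frequencies, which is what suppresses nearby, weaker peaks in later slices.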
For each of the remaining time slices, up to five local maxima above the threshold are added to the fingerprint. If there are more than five such maxima, the five greatest in amplitude are chosen. The threshold is then updated by adding new Gaussian curves centered at the frequencies of the newly found peaks. Finally, the threshold is scaled down so that it decays exponentially over time. The following figure shows how the threshold changes over time.
The final list of the times and frequencies of the local maxima above the threshold is returned as the song’s fingerprint.
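The slice-by-slice procedure can be sketched in Python with NumPy as below. This is an illustrative reading of the description, not the authors' code: the function name `fingerprint`, the Gaussian width `sigma_bins`, and the per-slice `decay` factor are assumed values. As a simplification, the threshold starts at zero, so the first slice contributes its five greatest local maxima, and the decay is applied after every slice, including the first.

```python
import numpy as np

def fingerprint(mags, n_peaks=5, sigma_bins=4.0, decay=0.9):
    """mags: magnitude spectrogram, shape (n_freq_bins, n_slices).
    Returns a list of (slice_index, bin_index) fingerprint points."""
    mags = np.asarray(mags, dtype=float)
    n_bins, n_slices = mags.shape
    bins = np.arange(n_bins)
    threshold = np.zeros(n_bins)
    points = []
    for t in range(n_slices):
        s = mags[:, t]
        # Local maxima: bins strictly greater than both neighbours.
        is_peak = np.zeros(n_bins, dtype=bool)
        is_peak[1:-1] = (s[1:-1] > s[:-2]) & (s[1:-1] > s[2:])
        candidates = np.flatnonzero(is_peak & (s > threshold))
        # Keep at most n_peaks candidates, greatest amplitude first.
        strongest = candidates[np.argsort(s[candidates])[::-1][:n_peaks]]
        for b in sorted(strongest):
            points.append((t, int(b)))
            # Add a Gaussian bump centered on the newly found peak.
            threshold += s[b] * np.exp(-0.5 * ((bins - b) / sigma_bins) ** 2)
        threshold *= decay  # exponential decay from slice to slice
    return points
```

The returned points are index pairs; converting them to seconds and hertz depends on the hop size and frame length used to compute the spectrogram.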