The following is the fingerprint of the sample signal from the examples above.
From the graph, it is easy to see patterns and the different notes in the song. Let's see how the algorithm addresses the three issues identified in the first paragraph:
Uniqueness – The algorithm stores only the prominent peaks in the spectrogram. Different songs have different patterns of peaks in frequency and time, so each song will have a unique fingerprint.
Sparseness – The algorithm keeps at most five peaks per time slice, which limits the number of peaks in the resulting fingerprint. The threshold spreads out the positions of the peaks so that the fingerprint is more representative of the data.
Noise Resistant – Very little noise shows up in the fingerprint unless the background noise is loud enough to create peaks larger than those present in the song. Also, a ten-second segment has around 6000 data points, so a matched filter will be able to detect a match between two fingerprints, even with a reasonable amount of added noise.
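The sparseness and noise-resistance properties above can be illustrated with a small sketch. This is not the implementation used to produce the fingerprint shown; it is a minimal Python example under stated assumptions. The function name fingerprint, the spectrogram[t][f] layout, the fixed magnitude threshold, and the local-maximum test are all hypothetical choices made for illustration, and the threshold rule in particular is simpler than one that actively spreads peaks out.

```python
def fingerprint(spectrogram, max_peaks=5, threshold=0.0):
    """Return a binary map with 1 at each kept peak (illustrative sketch).

    spectrogram[t][f] is the magnitude at time slice t and frequency bin f.
    Assumptions: a fixed threshold and a simple local-maximum test.
    """
    result = []
    for time_slice in spectrogram:
        n = len(time_slice)
        # Candidate peaks: bins above the threshold that are local maxima
        # along the frequency axis.
        candidates = []
        for f in range(n):
            left = time_slice[f - 1] if f > 0 else float("-inf")
            right = time_slice[f + 1] if f < n - 1 else float("-inf")
            if time_slice[f] > threshold and left < time_slice[f] >= right:
                candidates.append(f)
        # Keep only the strongest max_peaks candidates in this slice.
        strongest = sorted(candidates, key=lambda f: time_slice[f], reverse=True)
        row = [0] * n
        for f in strongest[:max_peaks]:
            row[f] = 1
        result.append(row)
    return result


# Toy data: slice 0 has six local maxima, slice 1 has one real peak (0.3)
# plus a weak bump (0.05) standing in for background noise.
toy = [
    [0.1, 0.9, 0.2, 0.1, 0.8, 0.1, 0.7, 0.1, 0.6, 0.1, 0.5, 0.1, 0.4, 0.1],
    [0.0, 0.05, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
fp = fingerprint(toy, max_peaks=5, threshold=0.1)
kept_bins = [[f for f, v in enumerate(row) if v] for row in fp]
print(kept_bins)
```

In this toy input the weakest of the six maxima in the first slice is dropped by the five-peak limit, and the weak bump in the second slice stays below the threshold, so it never reaches the fingerprint.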
The next section will detail the process used to compare the fingerprint of the song segment to the fingerprints of the songs in the library.