
Fourier analysis of notes

Once we have the start time of each note, we also know its end time and duration. We create a Hamming window 1.25 times as long as the note of interest and use it to extract that note from the recording. The following figure illustrates this:

[Figure: a note extracted with a Hamming window]
We decided to use a Hamming window because it tapers toward its edges, which mitigates errors in the onset detection algorithm. It also produces fewer spurious oscillations in the spectrum than a rectangular window does, so changes in the spectrum are captured more accurately.
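
The extraction step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `hamming` and `extract_note` are ours, the window uses the standard 0.54/0.46 Hamming coefficients, and we assume the window is centred on the note and zero-padded at the signal boundaries, since the text does not say how the longer window is positioned.

```python
import math

def hamming(n):
    """Return an n-point symmetric Hamming window."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def extract_note(signal, onset, next_onset, stretch=1.25):
    """Window out one note; the window is `stretch` times the note length.

    Assumption: the window is centred on the note, and samples that fall
    outside the signal are treated as zero.
    """
    note_len = next_onset - onset
    win_len = int(round(stretch * note_len))
    start = onset - (win_len - note_len) // 2
    window = hamming(win_len)
    out = []
    for k in range(win_len):
        i = start + k
        sample = signal[i] if 0 <= i < len(signal) else 0.0
        out.append(sample * window[k])
    return out

# Toy example: a constant signal with a note from sample 100 to sample 180.
signal = [1.0] * 300
note = extract_note(signal, 100, 180)
print(len(note), round(note[0], 3), round(max(note), 3))
```

Because the window is small near its ends, an onset estimate that is off by a few samples changes the extracted note only slightly.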

Now that we have the signal of interest, we take its Fourier transform using the FFT algorithm. We work in the frequency domain because the pitch of a note is determined by the frequencies at which it has power, while the distribution of power among the harmonics determines the sound of the instrument. For example, 440 Hz corresponds to an "A". An A played on a piano has power at 440, 880, 1320 Hz, and so on, and the shape of this distribution distinguishes it from an A played on a flute or guitar. We detect the frequencies with the highest power in the FFT, along with their corresponding powers. The HPCP algorithm, explained in the next module, analyzes these peaks to find the probability that the note is any one of the 12 pitches.
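
To make the idea concrete, here is a small Python sketch that builds a synthetic "A" and reads the strongest frequencies off its power spectrum. Everything numeric in it is an assumption for illustration: the 8000 Hz sample rate, the 400-sample length (chosen so that 440, 880 and 1320 Hz fall exactly on 20 Hz bins), and the made-up harmonic amplitudes. It uses a naive DFT so that it needs only the standard library; an FFT computes the same values faster.

```python
import cmath
import math

def power_spectrum(x):
    """Naive DFT power for bins 0..N/2 (an FFT gives the same values faster)."""
    n = len(x)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2)
    return spec

fs = 8000   # sample rate in Hz (assumed)
n = 400     # number of samples, so bins are fs / n = 20 Hz apart
partials = {440: 1.0, 880: 0.5, 1320: 0.25}   # made-up harmonic amplitudes
x = [sum(a * math.sin(2 * math.pi * f * t / fs) for f, a in partials.items())
     for t in range(n)]

power = power_spectrum(x)
# Keep the three bins with the most power and convert bin index to Hz.
strongest = sorted(range(len(power)), key=lambda k: power[k], reverse=True)[:3]
peaks = sorted((k * fs / n, power[k]) for k in strongest)
for freq, p in peaks:
    print(f"{freq:7.1f} Hz  power {p:.1f}")
```

The three reported frequencies are the fundamental and its first two harmonics, and the relative sizes of their powers are what would differ between instruments.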

Peak detection

Peak detection refers to finding the peaks in the Fourier transform of each note. These correspond to the fundamental frequency of whatever key is being pressed and to its harmonics. We used a MATLAB function called "findpeaks" to help us. ...A difficult part of peak detection is deciding what constitutes a "peak". For our purposes, we normalized the magnitude of the Fourier transform to have a maximum of 1 and considered every local maximum above a threshold to be a peak. We chose our threshold after carefully looking at the Fourier transforms of many signals.
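
The normalize-and-threshold rule can be sketched as follows. This is a simplified Python stand-in, not MATLAB's findpeaks: the function name `find_peaks_above` is hypothetical, and the threshold of 0.1 is a placeholder, since the text does not state the value that was actually chosen.

```python
def find_peaks_above(spectrum, threshold=0.1):
    """Return (index, normalized height) for local maxima above threshold.

    `threshold` is a made-up value used only for illustration.
    """
    top = max(spectrum)
    if top <= 0:
        return []
    norm = [v / top for v in spectrum]   # scale so the largest value is 1
    peaks = []
    for i in range(1, len(norm) - 1):
        is_local_max = norm[i] > norm[i - 1] and norm[i] >= norm[i + 1]
        if is_local_max and norm[i] >= threshold:
            peaks.append((i, norm[i]))
    return peaks

# Toy magnitude spectrum with three clear peaks and one small bump.
magnitudes = [0.0, 0.2, 4.0, 0.3, 0.1, 2.0, 0.2, 0.1, 0.3, 0.1, 1.0, 0.0]
found = find_peaks_above(magnitudes, threshold=0.1)
print(found)
```

The small bump at index 8 is a local maximum but falls below the threshold, so it is discarded. Raising or lowering the threshold trades missed harmonics against spurious ones.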

Hamming window

In the extraction of notes, you may ask why we chose to use a Hamming window rather than a boxcar filter (a rectangular window). The answer is simple. Imagine the boxcar filter in the frequency domain. What do we see? We basically see a low-pass filter (what we want) with many ripples on both sides of its main lobe (what we don't want). These ripples introduce error because they let in frequencies we do not want.

[Figure: Boxcar filter]

In comparison, the frequency response of a Hamming window also has ripples on either side of its main lobe, but they are generally much smaller. While not perfect, this is highly preferable to the boxcar filter.

[Figure: Hamming filter]
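
The difference in ripple size can be checked numerically. The Python sketch below evaluates the spectrum of each window on a fine frequency grid and reports the largest ripple relative to the main-lobe peak, in dB. The 32-sample window length, the grid size, and the function names `dtft_mag` and `peak_sidelobe_db` are our own choices for illustration.

```python
import cmath
import math

def dtft_mag(w, num_points=1024):
    """Magnitude of the window's spectrum on a grid from 0 to pi rad/sample."""
    mags = []
    for m in range(num_points):
        omega = math.pi * m / num_points
        acc = sum(v * cmath.exp(-1j * omega * k) for k, v in enumerate(w))
        mags.append(abs(acc))
    return mags

def peak_sidelobe_db(w):
    """Largest ripple outside the main lobe, relative to the peak, in dB."""
    mags = dtft_mag(w)
    i = 1
    while i < len(mags) - 1 and mags[i + 1] < mags[i]:
        i += 1   # walk down the main lobe to its first minimum
    return 20 * math.log10(max(mags[i:]) / mags[0])

n = 32
boxcar = [1.0] * n
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]
box_db = peak_sidelobe_db(boxcar)
ham_db = peak_sidelobe_db(hamming)
print(f"boxcar:  {box_db:.1f} dB")
print(f"hamming: {ham_db:.1f} dB")
```

A more negative number means smaller ripples. The price of the Hamming window's smaller ripples is a wider main lobe, which is why it is "not perfect".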

Source:  OpenStax, Noise resilient piano note & Chord recognition. OpenStax CNX. Dec 24, 2013 Download for free at http://cnx.org/content/col11603/1.4