In terms of correcting pitch with minimal distortion, the PSOLA algorithm is far superior to both the standard Time Shift and the Modified Phase Vocoder systems. This is because the PSOLA algorithm takes into account the
Among the detection routines, the autocorrelator gave better results than the HPS algorithm. The autocorrelator was less sensitive to noise, whereas HPS detected notes even in portions of relative silence. Had we extended HPS with a "silence" detector, we might have seen an improvement. However, HPS also had a severe tendency to misclassify the octave of the pitch. This follows from the harmonic nature of the spectrum, the very thing that makes HPS work. To correct it, we would have had to add further layers of detection to determine whether the highest peak was also the lowest-frequency peak in the transform, since we want the fundamental frequency and not a multiple of it.
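As a rough illustration of the "silence" detector mentioned above, the sketch below gates each frame on its RMS level before any pitch detector is run. The function names, the frame length, and the threshold value of 0.01 are assumptions chosen for the example, not part of the project's code.

```python
import math

def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def is_silent(frame, threshold=0.01):
    """True if the frame's RMS level is below the threshold (assumed value)."""
    return rms(frame) < threshold

# Hypothetical test frames: a 200 Hz tone and a near-silent copy, sampled at 8 kHz.
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
quiet = [0.001 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]

for name, frame in (("tone", tone), ("quiet", quiet)):
    print(name, "silent" if is_silent(frame) else "voiced")
```

A detector such as HPS would then be run only on frames that pass the gate. A real threshold would need tuning against the recording's noise floor.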
The autocorrelator provided good results but was unreliable in sections with high-frequency content, such as an 's' sound. In these regions, the r(s) function was extremely badly behaved and had no well-defined local minima. To handle this case, we introduced a "threshold" and declared that if the minimum was above this value, the region must be noisy. This is a quick fix; a better method would be to take a transform of this function and examine its behavior, but that is left for future investigation.
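The threshold test can be sketched as follows. This section does not give the exact form of r(s), so the code assumes a mean squared-difference function, which has minima at the pitch period. The names `difference_function` and `detect_period`, the lag range, and the threshold value are all illustrative assumptions.

```python
import math
import random

def difference_function(frame, max_lag):
    """Assumed r(s): mean squared difference between the frame and a copy shifted by s."""
    n = len(frame) - max_lag
    return [
        sum((frame[i] - frame[i + s]) ** 2 for i in range(n)) / n
        for s in range(max_lag + 1)
    ]

def detect_period(frame, min_lag, max_lag, threshold):
    """Return the lag of the deepest minimum, or None if even that minimum is too high."""
    r = difference_function(frame, max_lag)
    best = min(range(min_lag, max_lag + 1), key=lambda s: r[s])
    if r[best] > threshold:
        return None  # no clean minimum: treat the region as noisy
    return best

# Hypothetical frames at 8 kHz: a 200 Hz tone (period 40 samples) and white noise.
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
rng = random.Random(0)
noise = [rng.uniform(-0.5, 0.5) for _ in range(400)]

print(detect_period(tone, 20, 60, threshold=0.01))
print(detect_period(noise, 20, 60, threshold=0.01))
```

For the periodic frame the minimum of r(s) is close to zero, while for the noise frame every lag gives a similar, large value, so the deepest minimum stays above the threshold.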
On the correction side, an improved dynamic programming algorithm in the PSOLA method would reduce the "phasiness" heard in its output. We do not believe that either of the other two methods, Time Shifting and the Modified Phase Vocoder, is likely ever to be as useful as PSOLA, since they introduce formant errors as they change the pitch.
Another area for improvement is the mapping between detected pitch and desired pitch. Currently we use a logarithmic rounder, but this is simplistic: it assumes that the singer is always closer to the desired note than to any other, which is clearly not always the case. It would be nice to implement a "note tracker" that follows the detected notes and perhaps tries to determine a melody, but that is another project entirely. We would also like to make the correction seem more "natural" by correcting by small amounts in each window, leading to a "pick-up" or "pull-down" sound in the result, as if the singer had corrected the pitch themselves. This contrasts with the robotic "Cher-Effect" results we have currently, which were generated by an instantaneous correction to exactly the desired pitch.
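Both ideas can be sketched briefly. The first function is one plausible reading of a logarithmic rounder: it snaps a frequency to the nearest equal-tempered semitone, assuming a reference of A4 = 440 Hz. The second applies only a fraction of the remaining correction in each window. The function names, the reference tuning, and the `rate` parameter are assumptions for illustration, not the project's implementation.

```python
import math

A4_HZ = 440.0  # assumed reference tuning

def nearest_semitone(freq_hz):
    """Round a frequency to the nearest equal-tempered semitone on a log scale."""
    semitones = 12 * math.log2(freq_hz / A4_HZ)
    return A4_HZ * 2 ** (round(semitones) / 12)

def gradual_correction(detected_hz, rate=0.3):
    """Per-window output pitch that closes a fraction `rate` of the remaining error."""
    applied = 0.0  # correction currently applied, in semitones
    out = []
    for f in detected_hz:
        semitones = 12 * math.log2(f / A4_HZ)
        needed = round(semitones) - semitones  # full correction for this window
        applied += rate * (needed - applied)   # move part of the way toward it
        out.append(f * 2 ** (applied / 12))
    return out

print(nearest_semitone(450.0))  # 440.0
print([round(f, 2) for f in gradual_correction([450.0] * 5)])
```

With `rate=1.0` the second function reduces to the instantaneous correction; smaller values give the slower glide toward the target note.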
Remember that Matlab indexes from 1!
We also learned that computation time is a significant factor in deciding which algorithm to use. Our PSOLA algorithm ran many times faster than the Modified Phase Vocoder, which led us to choose it because we wanted fast calculation. Also, while developing the autocorrelation algorithm, we noticed several properties of the r(s) function that we could exploit for speed. After these optimizations, the algorithm ran up to 70% faster than before.
Overall, we learned that there is no pure method for implementing algorithms. We researched the topic and gathered several different approaches, but it was up to us to make them work. When we sat down to write the code, the choice of variables and the order of operations turned out to matter, but we could only see this after the entire program had been written. The project was largely a matter of refinement: getting a base version working and then improving on it.