Through this overlap-and-add approach, the signal retains most of its correct shape. For both algorithms, the original signal was broken down into overlapping windows of a specified size and hop size (which should be consistent with the values provided to the detection algorithm). Then, for each window, the detected period (one divided by the detected fundamental frequency) and the target period were computed and used to build up the new data for that window. After each window was constructed, as described further under the two approaches, the windows were overlapped and added to create the new pitch-corrected output signal. When the detection algorithm decides that a given window is unvoiced (i.e., has no fundamental frequency), both algorithms simply copy that window as is, without any modification. A Hanning window is used to smooth out the inconsistencies created by adding together overlapping windows of the output signal, so that there are no large discontinuities between added segments.
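The overlap-and-add step can be illustrated with a minimal Python/NumPy sketch. It is not the project's original code: the function name `overlap_add`, the `process` callback (standing in for either per-window pitch-correction method), and the division by the summed window weight are assumptions made for illustration. Passing no callback mimics the unvoiced case, where every window is copied unchanged.

```python
import numpy as np

def overlap_add(signal, win_size, hop, process=None):
    """Split `signal` into overlapping frames, optionally modify each
    frame, then Hanning-window the frames and sum them back together.

    `process` is a hypothetical per-frame callback that must return
    `win_size` samples; None means "copy the frame as is" (unvoiced).
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(win_size)
    out = np.zeros(len(signal))
    weight = np.zeros(len(signal))  # summed window weight per sample

    for start in range(0, len(signal) - win_size + 1, hop):
        frame = signal[start:start + win_size]
        if process is not None:
            frame = process(frame)
        out[start:start + win_size] += window * frame
        weight[start:start + win_size] += window

    # Assumption: normalise by the summed weight so that overlapping
    # regions keep unit gain. Samples no window covers stay at zero.
    covered = weight > 1e-8
    out[covered] /= weight[covered]
    return out

# Usage: with no per-frame processing the input is reconstructed
# everywhere except the two end samples, where the window is zero.
x = np.sin(2 * np.pi * 5 * np.arange(1024) / 1024)
y = overlap_add(x, win_size=256, hop=64)
```

Because the Hanning window tapers to zero at its edges, each frame fades in and out, so any mismatch between neighbouring frames is blended rather than appearing as a jump.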
The key to PSOLA is the determination and use of pitch markers in the original signal. The idea is that these markers should be equally spaced throughout the signal (at intervals equal to the detected fundamental period), but also that each should be placed where the signal reaches a maximum value (a peak). These two constraints are often in conflict, especially since our assumption that the fundamental period is constant over the entire window is not entirely true. As a result, following the highest peak in the signal from period to period may require relaxing the requirement that the markers be exactly equally spaced. On the other hand, if we only follow the maximum peak without regard for the fundamental period, the markers no longer reflect the pitch of the window and are not useful.
In order to strike this compromise, we created a matrix in which each column contains two periods of the signal and the center row starts at 0 and increments by one period each column. Then we used a dynamic path-finding algorithm (created by Vladimir Goncharoff and Patrick Gries from the University of Illinois at Chicago) to find a path that passes through the maximum peaks as often as possible, but which does not exceed a given slope as it moves through the matrix. Since a slope of 0 (a horizontal line) means the markers are equally spaced, the slope is the factor that is adjusted to strike the compromise between following peaks and maintaining periodicity. Empirically, we found a suitable value of this slope to be around 4. In the diagram below these pitch marks are labeled as m_{i-1}, m_i and m_{i+1}.
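The matrix-and-path idea can be sketched as a small dynamic program. This is a simplified, Viterbi-style illustration under stated assumptions, not the Goncharoff and Gries algorithm itself: the function name `find_pitch_marks`, the scoring (sum of sample values along the path), and the interpretation of the slope as the maximum row change in samples per column are all assumptions.

```python
import numpy as np

def find_pitch_marks(frame, period, max_slope=4):
    """Pick one marker per period by finding the path through a
    two-period-per-column matrix that maximises the summed amplitude
    while changing row by at most `max_slope` between columns."""
    frame = np.asarray(frame, dtype=float)
    n_cols = len(frame) // period
    offsets = np.arange(-period, period)      # rows: offset from centre
    centres = period * np.arange(n_cols)      # centre row: 0, P, 2P, ...
    idx = offsets[:, None] + centres[None, :] # sample index of each cell
    valid = (idx >= 0) & (idx < len(frame))
    clipped = np.clip(idx, 0, len(frame) - 1)
    mat = np.where(valid, frame[clipped], -np.inf)

    n_rows = len(offsets)
    score = mat[:, 0].copy()                  # best path score per row
    back = np.zeros((n_rows, n_cols), dtype=int)
    for k in range(1, n_cols):
        new_score = np.full(n_rows, -np.inf)
        for r in range(n_rows):
            lo = max(0, r - max_slope)
            hi = min(n_rows, r + max_slope + 1)
            best = lo + int(np.argmax(score[lo:hi]))
            back[r, k] = best
            new_score[r] = score[best] + mat[r, k]
        score = new_score

    # Trace the best path back from the last column to the first.
    rows = np.empty(n_cols, dtype=int)
    rows[-1] = int(np.argmax(score))
    for k in range(n_cols - 1, 0, -1):
        rows[k - 1] = back[rows[k], k]
    return idx[rows, np.arange(n_cols)]       # marker sample positions

# Usage: pulses every 51 samples, but a detected period of 50.
frame = np.zeros(500)
frame[10 + 51 * np.arange(10)] = 1.0
marks = find_pitch_marks(frame, period=50, max_slope=4)
```

With `max_slope=0` the path is forced onto a single row, so the markers come out exactly one period apart regardless of where the peaks fall; raising the limit lets the path drift toward the peaks, which is the compromise described above.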