The third and final voice manipulation tool we developed changes the length of the signal without altering its pitch or clarity, and the basic strategy is extremely simple. After the signal is broken into chunks by matricizing, some of the chunks are either discarded or repeated in order to compress or extend the signal. Since nobody can perceive a voice changing within a span of 0.02 seconds or less, this repetition never creates an audibly repeated sound; it only creates an audibly lengthened or shortened one. Played back, though, the result sounds incredibly choppy, like the audio equivalent of strobe lights. If concatenating or removing signal windows does not by itself produce the desired result, what could the problem be?
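The chunk-dropping and chunk-repeating step can be sketched roughly as follows. This is a minimal illustration, not the project's own code: the function names, the zero-padding of the last chunk, and the rule for choosing which chunks to repeat or discard (evenly spaced, driven by a hypothetical `factor` parameter) are all assumptions. At an assumed sampling rate of 8000 Hz, a 0.02-second chunk would be 160 samples long.

```python
def split_into_chunks(signal, chunk_len):
    """Split a sequence of samples into consecutive chunks of chunk_len samples.

    Assumption: the last chunk is zero-padded if the signal does not divide evenly.
    """
    chunks = []
    for start in range(0, len(signal), chunk_len):
        chunk = list(signal[start:start + chunk_len])
        chunk += [0.0] * (chunk_len - len(chunk))  # pad a short final chunk
        chunks.append(chunk)
    return chunks


def change_length(chunks, factor):
    """Return roughly factor * len(chunks) chunks.

    factor > 1 repeats chunks (longer signal); factor < 1 drops chunks
    (shorter signal). Output chunk i is copied from input chunk int(i / factor),
    which spreads the repeats or drops evenly (an assumed selection rule).
    """
    n_out = max(1, round(len(chunks) * factor))
    out = []
    for i in range(n_out):
        src = min(int(i / factor), len(chunks) - 1)
        out.append(chunks[src])
    return out


def flatten(chunks):
    """Concatenate chunks back into one list of samples."""
    return [sample for chunk in chunks for sample in chunk]
```

Calling `flatten(change_length(split_into_chunks(signal, 160), 2))` would then give a signal about twice as long, still with the phase discontinuities at the chunk boundaries.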
Upon closer inspection, the phase of the complex sinusoids at the beginning of a chunk is often very different from the phase at the end of the same chunk or of a previous chunk. When two windows are placed back to back, this sharp phase difference becomes audible, producing our unacceptably choppy sound. To correct this, the length-changing algorithm makes another pass over each window after the new signal has been constructed, this time computing the phase at the end of the previous chunk, the old phase $\phi_{\text{old}}$, and the phase at the beginning of the next chunk, the new phase $\phi_{\text{new}}$. Next, every value of the next chunk's DFT is multiplied by $e^{j(\phi_{\text{old}} - \phi_{\text{new}})}$. As a result, the phase at the beginning of the next chunk equals the phase at the end of the previous one, and the phase transitions smoothly between all other points in time. This process is repeated for every chunk, resulting in the complete removal of the stutters.
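A rough sketch of the phase-matching step for a single pair of chunks is given below. It rests on several assumptions that the text does not spell out: the chunks are complex-valued, "phase" is taken to be the angle of the boundary sample, and the multiplier is the unit phasor exp(j(old phase − new phase)). The naive `dft`/`idft` routines and the `make_tone` helper are hypothetical and exist only to keep the example self-contained.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a list of samples."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, the counterpart of dft() above."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def align_phase(prev_chunk, next_chunk):
    """Rotate next_chunk so its first sample has the phase of prev_chunk's last sample.

    Assumption: chunks are complex-valued and "phase" means the angle of the
    boundary sample.
    """
    phi_old = cmath.phase(prev_chunk[-1])  # phase at the end of the previous chunk
    phi_new = cmath.phase(next_chunk[0])   # phase at the start of the next chunk
    rotation = cmath.exp(1j * (phi_old - phi_new))
    # Multiply every DFT value of the next chunk by the same unit phasor.
    rotated_spectrum = [rotation * value for value in dft(next_chunk)]
    return idft(rotated_spectrum)


def make_tone(freq, start_phase, length):
    """Hypothetical helper: the complex sinusoid exp(j*(freq*t + start_phase))."""
    return [cmath.exp(1j * (freq * t + start_phase)) for t in range(length)]


# Two chunks of the same tone whose phases do not line up at the boundary.
prev_chunk = make_tone(0.9, 0.0, 8)
next_chunk = make_tone(0.9, 1.3, 8)
aligned = align_phase(prev_chunk, next_chunk)
```

Because the DFT is linear, multiplying every DFT value by the same constant has the same effect as multiplying every time sample of the chunk by that constant; the round trip through the DFT is kept here only to mirror the description. In a full pass, each corrected chunk would presumably serve as the previous chunk for the next pair.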
Audio samples: unaltered voice (original); length-changed voice (slower, faster).