
Voicing decisions

Determining whether a sound was voiced or unvoiced was not always an easy task. We had also compounded any error by removing six of the reflection coefficients for an unvoiced sound. To keep the vocal tract model simple, Richard chose to have either voiced or unvoiced excitation, but not both together. This design choice made some of the voiced consonants a bit tricky when they required both voiced and unvoiced excitation at the same time. These errors were fixed by post-editing the synthesized word.

We developed an editing station that allowed us to listen to each word frame by frame. With this capability we could fix voicing decisions by changing a frame from voiced to unvoiced or vice versa. If that didn’t solve the problem, we had two additional methods. One was to find a similar word, capture the good frames from it, and copy them over to the word we were working on. If that didn’t work, the last resort was to have the professional speaker say the word again and hope it came out better. With experience we learned to have the professional speaker say each word in the vocabulary three times. Yes, that was the expedient solution to the problem.
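The two frame-level fixes described above can be sketched in a few lines. This is only an illustration, not the editing station's actual software: the `Frame` fields and the function names `flip_voicing` and `copy_frames` are hypothetical, and a real edit would also have to adjust pitch and coefficient data, which is left out here.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Frame:
    """One synthesis frame. The field layout is an assumption for illustration."""
    voiced: bool
    energy: int
    pitch: int
    coeffs: Tuple[int, ...]


def flip_voicing(word: List[Frame], index: int) -> List[Frame]:
    """Return a copy of `word` with one frame switched voiced <-> unvoiced."""
    edited = list(word)
    edited[index] = replace(word[index], voiced=not word[index].voiced)
    return edited


def copy_frames(donor: List[Frame], start: int, stop: int,
                target: List[Frame], at: int) -> List[Frame]:
    """Return a copy of `target` with frames overwritten by donor[start:stop]."""
    patch = donor[start:stop]
    return target[:at] + patch + target[at + len(patch):]


# Hypothetical three-frame words, just to exercise the two edits.
bad_word = [Frame(True, 5, 40, (1, 2)), Frame(True, 6, 41, (3, 4)),
            Frame(False, 2, 0, (5, 6))]
similar_word = [Frame(True, 5, 40, (7, 8)), Frame(False, 3, 0, (9, 9)),
                Frame(False, 2, 0, (5, 6))]

fixed = flip_voicing(bad_word, 1)                        # first fix: flip one frame
patched = copy_frames(similar_word, 1, 2, bad_word, 1)   # second fix: borrow a frame
```

Both functions return new lists and leave the original word untouched, so an editor could audition a change and discard it if the word did not sound better.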

Initially we, meaning our engineering team, did the editing. This slowed us down quite a bit, as the engineers were prone to rewrite portions of the editing software rather than edit the words. We were also doing foreign languages and needed our editors to be experts in those languages. The answer was to hire linguists to do the editing and put the engineers back to work designing the product. Surprisingly, it took a bit of effort to get the linguists comfortable with the editing process. The editing was done on a minicomputer, with the linguists sitting at a monitor and keyboard. One day I saw one of the linguists typing very carefully on the keyboard. I stopped her (it happened to be one of the women) and asked what the problem was. She said that she was afraid she would break the computer. I pointed to the engineers sitting near the editing station and said, “See those engineers over there? They are waiting for you to break the computer so they can fix it. So, it’s OK to break it.” The engineers were anxious to fix it because they had been instructed not to play with the editing station or its software unless there was a problem. That seemed to settle the issue, and editing got on its way.

Limit cycles

Limit cycles began to plague us as we started producing the vocabulary words. They were caused by some of the data word length decisions we had made for the synthesizer, and they showed up as audible tones when the word or phrase was synthesized. Once again, they were fixed in the post-editing of the processed words. Initially we used the repeat option to try to eliminate the frame that was causing the issue. Later on, one of our linguists found a magic frame that could be added at any point in the phrase to eliminate the effect of the limit cycles. As I remember, it was his secret and his advantage in the editing process.
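For readers who have not met limit cycles before, the general effect of a short data word can be shown with a toy example. The sketch below is not the synthesizer's filter: it assumes a simple first-order recursive filter, y[n] = a·y[n-1], with an assumed coefficient a = -15/16 and no input. With exact arithmetic the output dies away to zero. When every product is rounded back to an integer, the output gets stuck oscillating at a fixed amplitude, which is the kind of steady tone described above.

```python
# Toy illustration of a zero-input limit cycle caused by rounding.
# Assumptions (not from the original design): first-order filter,
# coefficient a = -15/16, integer state, round-to-nearest after each multiply.

A_NUM = -15      # coefficient numerator
FRAC_BITS = 4    # coefficient denominator is 2**4 = 16


def step_quantized(y: int) -> int:
    """One filter step with the product rounded back to an integer."""
    product = A_NUM * y
    # Add half an LSB, then shift right (floor) to round to nearest.
    return (product + (1 << (FRAC_BITS - 1))) >> FRAC_BITS


def simulate_quantized(y0: int, steps: int) -> list:
    """Run the rounded filter with zero input, starting from state y0."""
    out, y = [], y0
    for _ in range(steps):
        y = step_quantized(y)
        out.append(y)
    return out


def simulate_ideal(y0: float, steps: int) -> list:
    """Same filter with no rounding, for comparison."""
    a = A_NUM / (1 << FRAC_BITS)
    out, y = [], y0
    for _ in range(steps):
        y = a * y
        out.append(y)
    return out


print(simulate_quantized(10, 8))   # [-9, 8, -7, 7, -7, 7, -7, 7]
print(abs(simulate_ideal(10.0, 200)[-1]) < 1e-3)   # True: the ideal filter decays
```

The rounded filter never gets below an amplitude of 7, because rounding puts back at each step exactly what the coefficient takes away. Changing the stored state, here by restarting from a different `y0`, is one way to break such a cycle in this toy model.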

Source:  OpenStax, The speak n spell. OpenStax CNX. Jan 31, 2014 Download for free at http://cnx.org/content/col11501/1.5