When we speak, pitch and the vocal tract's transfer function are not static; they change according to their control signals to produce speech. Engineers typically display how the speech spectrum changes over time with what is known as a spectrogram [link]. Note how the line spectrum, which indicates how the pitch changes, is visible during the vowels, but not during the consonants (like the "ce" in "Rice").
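To make the idea of a spectrogram concrete, here is a minimal sketch in plain Python. It assumes a synthetic test signal (a 500 Hz tone followed by a 1500 Hz tone) rather than recorded speech, and the function name `spectrogram`, the frame length, and the sampling rate are illustrative choices, not anything prescribed by the text. The signal is cut into short frames, each frame is windowed, and the magnitude of its discrete Fourier transform is computed; stacking these spectra over time gives the spectrogram.

```python
import cmath
import math

def spectrogram(x, frame_len, hop):
    """Return one magnitude spectrum (bins 0..frame_len/2) per frame of x."""
    # Hann window to reduce leakage between frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        seg = [x[start + n] * window[n] for n in range(frame_len)]
        mags = [
            abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(mags)
    return frames

# Assumed test signal: 500 Hz for the first half, 1500 Hz for the second.
fs = 8000          # sampling rate in Hz (assumption)
frame_len = 128    # samples per frame (assumption)
x = [math.sin(2 * math.pi * (500 if t < 512 else 1500) * t / fs)
     for t in range(1024)]

frames = spectrogram(x, frame_len, hop=frame_len)

# Strongest frequency in each frame: the spectrum changes over time.
peak_hz = [max(range(len(m)), key=m.__getitem__) * fs / frame_len
           for m in frames]
print(peak_hz)
```

Each entry of `peak_hz` is the dominant frequency of one time frame; for real speech, the same procedure would reveal the pitch harmonics during vowels and broadband energy during consonants.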
The fundamental model for speech indicates how engineers use the physics underlying the signal generation process and exploit its structure to produce a systems model that suppresses the physics while emphasizing how the signal is "constructed." From everyday life, we know that speech contains a wealth of information. We want to determine how to transmit and receive it. Efficient and effective speech transmission requires us to know the signal's properties and its structure (as expressed by the fundamental model of speech production). We see from [link], for example, that speech contains significant energy from zero frequency up to around 5 kHz.
Effective speech transmission systems must be able to cope with signals having this bandwidth. It is interesting that one system that does not support this 5 kHz bandwidth is the telephone: telephone systems act like a bandpass filter passing energy between about 200 Hz and 3.2 kHz. The most important consequence of this filtering is the removal of high frequency energy. In our sample utterance, the "ce" sound in "Rice" contains most of its energy above 3.2 kHz; this filtering effect is why it is extremely difficult to distinguish the sounds "s" and "f" over the telephone. Try this yourself: call a friend and determine if they can distinguish between the words "six" and "fix". If you say these words in isolation so that no context provides a hint about which word you are saying, your friend will not be able to tell them apart. Radio does support this bandwidth (see more about AM and FM radio systems).
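The effect of the telephone's passband can be sketched numerically. The example below assumes an idealized "brick-wall" bandpass filter from 200 Hz to 3.2 kHz (real telephone channels roll off gradually) and uses two pure tones as stand-ins: 1 kHz for vowel-like energy and 4.5 kHz for the high-frequency energy of a sound like "s". The function names and the 16 kHz sampling rate are illustrative assumptions.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def telephone_filter(x, fs, low_hz=200.0, high_hz=3200.0):
    """Idealized bandpass: zero every DFT bin outside [low_hz, high_hz]."""
    n = len(x)
    X = dft(x)
    for k in range(n):
        f = min(k, n - k) * fs / n   # frequency represented by bin k
        if not (low_hz <= f <= high_hz):
            X[k] = 0
    return idft(X)

def tone_amplitude(x, fs, freq_hz):
    """Amplitude of the sinusoidal component of x at freq_hz."""
    n = len(x)
    c = sum(x[t] * cmath.exp(-2j * math.pi * freq_hz * t / fs)
            for t in range(n))
    return 2 * abs(c) / n

fs = 16000   # sampling rate in Hz (assumption)
n = 320      # 20 ms of signal; both tones fall exactly on DFT bins
x = [math.sin(2 * math.pi * 1000 * t / fs) +
     math.sin(2 * math.pi * 4500 * t / fs) for t in range(n)]

y = telephone_filter(x, fs)

amp_low_after = tone_amplitude(y, fs, 1000)    # inside the passband
amp_high_after = tone_amplitude(y, fs, 4500)   # above 3.2 kHz
print(round(amp_low_after, 6), round(amp_high_after, 6))
```

The 1 kHz component passes essentially unchanged while the 4.5 kHz component is removed, which is the mechanism behind the "six"/"fix" confusion described above.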
Efficient speech transmission systems exploit the speech signal's special structure: what makes speech speech? You can conjure many signals that span the same frequencies as speech (car engine sounds, violin music, dog barks) but don't sound at all like speech. We shall learn later that transmitting any 5 kHz bandwidth signal digitally requires about 80 kbps (thousands of bits per second). Speech signals can be transmitted using less than 1 kbps because of their special structure. Reducing the "digital bandwidth" so drastically took engineers many years of developing signal processing and coding methods that could capture the special characteristics of speech without destroying how it sounds. If you used a speech transmission system to send a violin sound, it would arrive horribly distorted; speech transmitted the same way would sound fine.
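The text defers the derivation of the 80 kbps figure. One way to arrive at it, sketched below, is to sample at twice the signal bandwidth (the Nyquist rate) and spend 8 bits on each sample; the 8-bit word length is an assumption made here for illustration and is not stated in this section. The function name `pcm_bit_rate` is likewise hypothetical.

```python
def pcm_bit_rate(bandwidth_hz, bits_per_sample):
    """Bit rate for straightforward sample-by-sample digital transmission."""
    sampling_rate_hz = 2 * bandwidth_hz      # Nyquist rate
    return sampling_rate_hz * bits_per_sample

generic_rate_bps = pcm_bit_rate(5_000, 8)    # 8 bits/sample is an assumption
speech_rate_bps = 1_000                      # "less than 1 kbps" from the text

print(generic_rate_bps / 1000, "kbps")
print("reduction factor of at least", generic_rate_bps // speech_rate_bps)
```

Under these assumptions, exploiting the structure of speech cuts the required bit rate by a factor of 80 or more relative to treating speech as an arbitrary 5 kHz signal.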
Exploiting the special structure of speech requires going beyond the capabilities of analog signal processing systems. Many speech transmission systems work by finding the speaker's pitch and the formant frequencies. Fundamentally, we need to do more than filtering to determine the speech signal's structure; we need to manipulate signals in more ways than are possible with analog systems. Such flexibility is achievable (but not without some loss) with programmable digital systems.
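As an illustration of the kind of processing that is easy in a programmable digital system, the sketch below estimates pitch by autocorrelation: it looks for the time shift at which a frame of the signal best matches itself. This is one common textbook approach, not necessarily the method used by any particular speech transmission system. The function name `estimate_pitch_hz`, the 80 to 400 Hz search range, the 8 kHz sampling rate, and the synthetic three-harmonic "voiced" signal are all assumptions for the example.

```python
import math

def estimate_pitch_hz(x, fs, min_hz=80.0, max_hz=400.0, window=256):
    """Return the pitch whose period maximizes the autocorrelation of x."""
    min_lag = int(fs / max_hz)   # shortest period considered, in samples
    max_lag = int(fs / min_hz)   # longest period considered, in samples
    if len(x) < window + max_lag:
        raise ValueError("signal too short for the requested pitch range")
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate a fixed-length window with a copy shifted by `lag`.
        value = sum(x[t] * x[t + lag] for t in range(window))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag

# Synthetic voiced-like signal: 125 Hz fundamental plus two harmonics.
fs = 8000
f0 = 125.0
x = [sum(a * math.sin(2 * math.pi * k * f0 * t / fs)
         for k, a in ((1, 1.0), (2, 0.5), (3, 0.25)))
     for t in range(400)]

pitch_hz = estimate_pitch_hz(x, fs)
print(pitch_hz)  # prints 125.0
```

The search over candidate periods is a decision-making step, not a filtering operation, which is the sort of manipulation that programmable digital systems make practical.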