Overview of the Modern Algorithm:
The speech recognition system consists of three stages: speech segmentation, feature extraction, and optimal feature-matching against a trained library of stored features.
Speech Segmentation:
Speech segmentation is fairly uniform across systems: a string of spoken words is split into its individual components. This can be accomplished by segmenting at the points where the power of the sampled signal goes to zero (in practice, where it falls below a small threshold, since recorded silence is rarely exactly zero).
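The power-based segmentation described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular system: the frame length, the threshold value, and the function names `frame_power` and `segment_by_power` are all assumptions made for the example.

```python
def frame_power(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    powers = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        powers.append(sum(x * x for x in frame) / frame_len)
    return powers


def segment_by_power(samples, frame_len=4, threshold=1e-4):
    """Return (start, end) sample indices of runs whose frame power
    stays above the threshold. frame_len and threshold are illustrative."""
    segments = []
    seg_start = None
    powers = frame_power(samples, frame_len)
    for i, p in enumerate(powers):
        if p > threshold and seg_start is None:
            seg_start = i * frame_len          # power rose: a segment begins
        elif p <= threshold and seg_start is not None:
            segments.append((seg_start, i * frame_len))  # power fell: it ends
            seg_start = None
    if seg_start is not None:                  # signal ended mid-segment
        segments.append((seg_start, len(powers) * frame_len))
    return segments


# Two toy "words" separated by silence.
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8 + [0.3, -0.3] * 6 + [0.0] * 4
segments = segment_by_power(signal)
print(segments)  # [(8, 16), (24, 36)]
```

A real recording would need a longer frame (tens of milliseconds) and a threshold tuned to the background noise level.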
Feature Extraction:
Feature extraction may be done in a variety of ways, depending on the features one chooses to extract. The industry standard is to extract the coefficients that collectively represent the short-term power spectrum of the recorded sound, known as mel-frequency cepstral coefficients (MFCCs). MFCCs are derived by:
MFCC feature-extraction is typically used in conjunction with Hidden Markov Model feature-matching.
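One commonly described way to compute MFCCs for a single frame is: window the frame, take its power spectrum, pool the spectrum through triangular filters spaced on the mel scale, take the log of each filter energy, and apply a discrete cosine transform. The sketch below follows that recipe under several assumptions that do not come from the text above: a Hamming window, the mel formula with constants 2595 and 700, a naive DFT instead of an FFT, and small illustrative filter and coefficient counts. All function names are hypothetical.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power of the DFT of a Hamming-windowed frame, bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(max_mel * i / (num_filters + 1))
                for i in range(num_filters + 2)]
    bins = [hz * n_fft / sample_rate for hz in edges_hz]  # fractional bin positions
    filters = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising edge
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling edge
            else:
                row.append(0.0)
        filters.append(row)
    return filters


def mfcc(frame, sample_rate, num_filters=12, num_coeffs=6):
    """MFCCs of one frame; filter and coefficient counts are illustrative."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for filters that capture no energy.
    log_energies = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                    for row in bank]
    # DCT-II of the log filter energies.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]


# A 64-sample frame of a 1 kHz tone sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(64)]
coeffs = mfcc(frame, 8000)
```

In a full system this would be applied to each short, overlapping frame of a segmented word, producing one coefficient vector per frame.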
Prior to MFCCs, speech recognition systems used linear predictive coding (LPC). LPC treats sibilants and plosive sounds as occasional anomalies and models the rest of the signal as a source shaped by resonances (the formants). Once the coefficients have been extracted and the formants inverse-filtered out, the values of the signal can be predicted on a local timescale as a linear combination of the preceding samples.
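To make the idea of linear prediction concrete, the sketch below estimates predictor coefficients for one frame using the autocorrelation method with the Levinson-Durbin recursion. This is one standard way to obtain LPC coefficients; the text above does not say which method the earlier systems used, so the choice, the function names, and the test signal are assumptions.

```python
def autocorrelation(frame, max_lag):
    """r[lag] = sum of frame[i] * frame[i + lag] over the frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc_coefficients(frame, order):
    """Levinson-Durbin recursion. Returns a[1..order] such that
    x[n] ~= a[1]*x[n-1] + ... + a[order]*x[n-order]."""
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)
    error = r[0]                       # prediction error of the order-0 model
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error                # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)         # error shrinks as the order grows
    return a[1:]


# A decaying exponential obeys x[n] = 0.9 * x[n-1] exactly,
# so the first coefficient should come out close to 0.9.
samples = [0.9 ** n for n in range(200)]
lpc = lpc_coefficients(samples, 2)
```

For this toy signal the second coefficient is close to zero, since one previous sample is already enough to predict the next.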
Feature-matching:
Feature-matching was traditionally implemented via dynamic time warping (DTW), which allows sampled words to be matched against stored templates even when differences in speed and timing stretch or compress them. This technique has fallen out of favor, displaced by the current industry gold standard of speech recognition: the Hidden Markov Model (HMM).
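A minimal sketch of DTW template matching follows. It uses the textbook dynamic-programming recurrence on sequences of single numbers; real systems compare sequences of feature vectors, so the scalar features, the absolute-difference cost, and the template names here are simplifying assumptions.

```python
def dtw_distance(a, b):
    """Cost of the cheapest alignment between sequences a and b, where each
    element may be matched to one or more elements of the other sequence."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])            # local mismatch
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical stored templates and a slowed-down utterance of the first one.
templates = {"rise_fall": [1, 2, 3, 2, 1], "zigzag": [3, 1, 3, 1, 3]}
spoken = [1, 1, 2, 2, 3, 3, 2, 2, 1, 1]

best = min(templates, key=lambda name: dtw_distance(spoken, templates[name]))
print(best)  # rise_fall
```

The stretched utterance aligns with its template at zero cost even though it is twice as long, which is exactly the tolerance to timing differences that DTW provides.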
Because speech signals are approximately stationary over short time intervals, modeling them as HMMs is feasible. HMMs also offer great advantages over DTW: they can be trained extensively, which makes for a far more robust recognition system. The HMM-based approach is complex, but at the highest level involves the following:
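One piece of HMM-based matching that can be shown compactly is scoring: given one trained model per word, compute how likely each model is to have produced the observed feature sequence, and pick the most likely word. The sketch below does this with the forward algorithm. It is an illustration only: the observations are discrete symbols rather than MFCC vectors, and the two tiny models, their probabilities, and their names are made up for the example rather than trained.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) via the forward algorithm for a discrete-output HMM.
    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s emits symbol o"""
    states = range(len(start))
    alpha = [start[s] * emit[s][obs[0]] for s in states]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in states) * emit[s][o]
                 for s in states]
    return sum(alpha)


# Two hypothetical left-to-right word models sharing the same topology.
start = [1.0, 0.0]
trans = [[0.6, 0.4],
         [0.0, 1.0]]
models = {
    "word_a": [[0.9, 0.1], [0.2, 0.8]],  # mostly symbol 0, then symbol 1
    "word_b": [[0.1, 0.9], [0.8, 0.2]],  # mostly symbol 1, then symbol 0
}

observed = [0, 0, 1, 1]
scores = {name: forward_likelihood(observed, start, trans, emit)
          for name, emit in models.items()}
recognized = max(scores, key=scores.get)
print(recognized)  # word_a
```

Longer sequences would normally be scored in the log domain or with scaling, since the raw probabilities shrink quickly toward zero.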