A bio-inspired feature extraction for robust speech recognition.

Zouhir Y, Ouni K - Springerplus (2014)

Bottom Line: The proposed method is motivated by a biologically inspired auditory model which simulates the outer/middle ear filtering by a low-pass filter and the spectral behaviour of the cochlea by the Gammachirp auditory filterbank (GcFB). The evaluation results show that the proposed method gives better recognition rates than classic techniques such as Perceptual Linear Prediction (PLP), Linear Predictive Coding (LPC), Linear Prediction Cepstral Coefficients (LPCC) and Mel Frequency Cepstral Coefficients (MFCC). The recognition system used is based on Hidden Markov Models with continuous Gaussian Mixture densities (HMM-GM).


Affiliation: Research Unit: Signals and Mechatronic Systems, SMS, Higher School of Technology and Computer Science (ESTI), University of Carthage, Carthage, Tunisia.

ABSTRACT
In this paper, a feature extraction method for robust speech recognition in noisy environments is proposed. The proposed method is motivated by a biologically inspired auditory model which simulates the outer/middle ear filtering by a low-pass filter and the spectral behaviour of the cochlea by the Gammachirp auditory filterbank (GcFB). The speech recognition performance of our method is tested on speech signals corrupted by real-world noises. The evaluation results show that the proposed method gives better recognition rates than classic techniques such as Perceptual Linear Prediction (PLP), Linear Predictive Coding (LPC), Linear Prediction Cepstral Coefficients (LPCC) and Mel Frequency Cepstral Coefficients (MFCC). The recognition system used is based on Hidden Markov Models with continuous Gaussian Mixture densities (HMM-GM).

No MeSH data available.


Fig5: Block diagram of the proposed Perceptual Linear Predictive Auditory Gammachirp (PLPaGc) method.

Mentions: Our feature extraction method for the recognition of noisy speech signals is based on auditory filter modelling. The proposed method, illustrated by the block diagram in Figure 5, consists of seven steps. In the first step, the power spectrum is calculated by squaring the magnitude of the Discrete Fourier Transform of the windowed speech segment. The second step is outer and middle ear filtering, performed by a second-order low-pass filter with a resonance frequency of 4 kHz (Martens and Van Immerseel 1990; Van Immerseel and Martens 1992). In the third step, the result is processed by a gammachirp auditory filterbank composed of 34 Gammachirp filters (Zouhir and Ouni 2013), whose centre frequencies are equally spaced on the ERB-rate scale between 50 Hz and 8000 Hz (Glasberg and Moore 1990; Moore 2012). In the fourth step, the output is pre-emphasized by a simulated equal-loudness curve, which approximates the non-uniform sensitivity of the human auditory system at different frequencies (Hermansky 1990). The fifth step is intensity-loudness conversion, which simulates the nonlinear relationship between the intensity of a speech signal and its perceived loudness by applying cubic-root amplitude compression. In the sixth step, the autoregressive all-pole model is computed using the inverse DFT and the Levinson-Durbin recursion (Hermansky 1990). The last step applies a cepstral transformation to obtain the proposed Perceptual Linear Predictive Auditory Gammachirp coefficients (PLPaGc).
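The seven steps above can be sketched in code. The sketch below is not the authors' implementation: the gammachirp magnitude response is approximated by a Gaussian of ERB width, the outer/middle ear filter uses an assumed Q of 1, and the FFT size, model order and number of cepstral coefficients are illustrative choices. The ERB-rate scale and equal-loudness curve follow the standard formulas of Glasberg and Moore (1990) and Hermansky (1990).

```python
import numpy as np

def erb_rate(f):
    # ERB-rate scale (Glasberg and Moore 1990), f in Hz.
    return 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)

def inverse_erb_rate(e):
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def erb_bandwidth(f):
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def levinson_durbin(r, order):
    # Solve the Toeplitz normal equations for the all-pole model A(z).
    a = np.zeros(order + 1)
    a[0] = 1.0
    e = r[0]
    for i in range(1, order + 1):
        acc = sum(a[j] * r[i - j] for j in range(i))
        k = -acc / e
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        e *= 1.0 - k * k
    return a

def plpagc_features(frame, fs, n_bands=34, order=12, n_ceps=12):
    # Step 1: power spectrum of the Hamming-windowed frame.
    n_fft = 512
    spec = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)

    # Step 2: outer/middle ear as a 2nd-order resonant low-pass,
    # resonance 4 kHz (Q = 1 is an assumed value).
    fr, q = 4000.0, 1.0
    spec = spec / ((1.0 - (freqs / fr) ** 2) ** 2 + (freqs / (q * fr)) ** 2)

    # Step 3: 34 channels, centres equally spaced on the ERB-rate scale
    # from 50 Hz to 8000 Hz; gammachirp response approximated by a
    # Gaussian of ERB width (an assumption, not the paper's filter).
    centres = inverse_erb_rate(
        np.linspace(erb_rate(50.0), erb_rate(min(8000.0, fs / 2.0)), n_bands))
    bands = np.empty(n_bands)
    for i, fc in enumerate(centres):
        w = np.exp(-0.5 * ((freqs - fc) / erb_bandwidth(fc)) ** 2)
        bands[i] = np.sum(w * spec)

    # Step 4: equal-loudness pre-emphasis (Hermansky 1990).
    w2 = (2.0 * np.pi * centres) ** 2
    bands *= ((w2 + 56.8e6) * w2 ** 2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))

    # Step 5: intensity-to-loudness conversion (cubic-root compression).
    bands = np.cbrt(bands) + 1e-12

    # Step 6: inverse DFT of the symmetric auditory spectrum gives an
    # autocorrelation sequence; Levinson-Durbin gives the all-pole model.
    sym = np.concatenate([bands, bands[-2:0:-1]])
    r = np.real(np.fft.ifft(sym))[:order + 1]
    a = levinson_durbin(r, order)

    # Step 7: cepstral transformation of the LPC coefficients (PLPaGc).
    c = np.zeros(n_ceps)
    for n in range(1, n_ceps + 1):
        acc = -a[n] if n <= order else 0.0
        for k in range(1, n):
            acc -= (k / n) * c[k - 1] * (a[n - k] if n - k <= order else 0.0)
        c[n - 1] = acc
    return c

# Usage on a synthetic 25 ms frame at 16 kHz (tone plus noise).
rng = np.random.default_rng(0)
t = np.arange(400) / 16000.0
frame = np.sin(2.0 * np.pi * 440.0 * t) + 0.1 * rng.standard_normal(400)
feats = plpagc_features(frame, 16000)
```

In a full front end this per-frame computation would run over overlapping windows of the utterance, and the resulting coefficient vectors would feed the HMM-GM recognizer described in the paper.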
