Malaysian English large vocabulary continuous speech recognizer: an improvement using acoustic model adaptation
Abstract
This research project aims to develop a Malaysian English continuous speech recognition system by adapting a US English acoustic model with a Malaysian English speech corpus, using Maximum a Posteriori (MAP) estimation and Maximum Likelihood Linear Regression (MLLR). Mel-Frequency Cepstral Coefficients (MFCC) were used for feature extraction, and the Hidden Markov Model was used as the back-end pattern comparison technique. The system was implemented with the CMU Sphinx toolkit, which includes Pocketsphinx and Sphinxtrain as well as an acoustic model. Malaysian English speech samples were recorded from a number of speakers and transcribed to produce the training database required for acoustic model adaptation. The outcome of this research could increase the use of speech recognition in Malaysia, where the local accent is a problem for recognizers trained on US English. Of the systems evaluated, those that had gone through MAP adaptation performed best, achieving an average word error rate of 32.84%, an average word recognition rate of 72.52%, and an average sentence error rate of 78.89%.
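The abstract does not state how the error rates were scored. As a minimal sketch, assuming the conventional definitions (word error rate as word-level edit distance divided by the number of reference words, and sentence error rate as the fraction of sentences containing at least one error), the two metrics could be computed as follows. The function names and the sample sentences are hypothetical and are not taken from the paper.

```python
def word_errors(reference, hypothesis):
    """Return (edit_distance, reference_length) for one sentence pair.

    The edit distance counts word substitutions, deletions, and
    insertions, computed with the standard Levenshtein recurrence.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def corpus_rates(pairs):
    """Return (word_error_rate, sentence_error_rate) over (ref, hyp) pairs."""
    total_errors = total_words = wrong_sentences = 0
    for reference, hypothesis in pairs:
        errors, length = word_errors(reference, hypothesis)
        total_errors += errors
        total_words += length
        wrong_sentences += 1 if errors > 0 else 0
    return total_errors / total_words, wrong_sentences / len(pairs)


# Hypothetical transcripts, for illustration only.
pairs = [
    ("open the door", "open the door"),
    ("turn on the light", "turn of light"),
]
wer, ser = corpus_rates(pairs)
```

Under these definitions a single wrong word makes the whole sentence count as an error, which is why a sentence error rate can be much higher than the word error rate on the same test set.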
Collections
- IEM Journal [310]