Feature space reduction in ethnically diverse Malaysian English accents classification
Yusnita, Mohd Ali
Pandiyan, Paulraj Murugesa, Prof. Dr.
Sazali, Yaacob, Prof. Dr.
Shahriman, Abu Bakar, Dr.
In this paper we propose a reduced-dimensional space of statistical descriptors of mel-bands spectral energy (MBSE) vectors for classifying the accents of Malaysian English (MalE) speakers that arise from ethnic diversity. Principal component analysis (PCA) with an eigenvector decomposition approach was used to project this high-dimensional dataset into an uncorrelated space by exploiting the covariance structure of the variables. Once the significant coordinate system is revealed, this limits the size of the feature vector needed for good classification. The objectives of this paper are threefold: first, to generate a reduced-size feature vector in order to decrease the memory requirement and the computational time; second, to improve the classification accuracy; and third, to replace the state-of-the-art mel-frequency cepstral coefficients (MFCC) method, which is more susceptible to noisy environments. The system was designed using the K-nearest neighbors algorithm and evaluated on an independent test dataset comprising 20% of the data. On the MalE database, the proposed PCA-transformed mel-bands spectral energy (PCA-MBSE) proved to be more space-efficient and more robust than the MFCC and MBSE baselines. PCA-MBSE achieved the same accuracy as the original MBSE with the feature vector reduced by 66.67%, and it was the most robust under various noisy conditions, with only a 10.48% drop in performance compared to 16.81% and 48.01% for MBSE and MFCC respectively.
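The pipeline described in the abstract (PCA by eigenvector decomposition of the covariance matrix, followed by K-nearest neighbors on a 20% held-out set) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic Gaussian clusters standing in for MBSE descriptors, and the feature count (30), the number of retained components (10, chosen only to mirror a 66.67% reduction), the three classes, `k=3`, and the function names `pca_fit`, `pca_transform`, and `knn_predict` are all assumptions made for illustration.

```python
import numpy as np

def pca_fit(X, n_components):
    """PCA via eigendecomposition of the sample covariance matrix."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]   # keep the largest ones
    return mean, eigvecs[:, order], eigvals[order]

def pca_transform(X, mean, components):
    """Project centered data onto the retained eigenvectors."""
    return (X - mean) @ components

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic stand-in for MBSE descriptors (assumption, not the MalE database).
rng = np.random.default_rng(0)
n_per_class, n_features = 100, 30
centers = rng.normal(scale=3.0, size=(3, n_features))
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat(np.arange(3), n_per_class)

# 80/20 split: the PCA basis is learned on the training portion only.
perm = rng.permutation(len(X))
split = int(0.8 * len(X))
train, test = perm[:split], perm[split:]

mean, components, eigvals = pca_fit(X[train], n_components=10)  # 30 -> 10 features
Z_train = pca_transform(X[train], mean, components)
Z_test = pca_transform(X[test], mean, components)

accuracy = (knn_predict(Z_train, y[train], Z_test, k=3) == y[test]).mean()
print(f"reduced from {n_features} to {Z_train.shape[1]} features, accuracy = {accuracy:.2f}")
```

Fitting the projection on the training portion and reusing it on the test portion keeps the held-out data independent, which is one plausible reading of the evaluation protocol stated above.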