Time-frequency analysis based methods for classification of newborn cry signals
Abstract
Infant cry classification involves non-invasive, objective methods for distinguishing
different patterns of infant cry utterances by means of artificial intelligence and
digital signal processing techniques. Research in this area began decades ago to
overcome the limitations of subjective methods, in particular auditory perception and
human spectrographic analysis, which rely on the clinical rater's experience and expertise.
This thesis addresses the development of an objective method for classification of
newborn cries primarily using time-frequency (t-f) methods. Towards this aim, a novel
investigation using two different t-f based signal processing approaches was performed:
(a) Quadratic time-frequency distributions (QTFDs): Spectrogram (SPEC), Wigner-Ville
distribution (WVD), Smoothed Wigner-Ville distribution (SWVD), Choi-Williams
distribution (CWD) and Modified B-distribution (MBD), and (b) Wavelet packet
transform (WPT) based method: wavelet packet spectrum (Wpspectrum). The
effectiveness of the proposed t-f methods was analyzed using normal and different
pathological cry signals. The cry signals were obtained from three databases of
different origin: Mexico, Hungary and Malaysia (a self-developed database). To
evaluate the proposed t-f methods, eight different cry classification experiments
were designed, covering binary and multiclass problems. In the binary experiments,
cry signals from different origins and the severity level of pathological cry
signals were investigated. The framework of this work
was designed in two phases in order to compare the performance of the proposed
t-f methods with state-of-the-art features in the infant cry classification area,
namely Mel-frequency cepstral coefficients (MFCCs) and linear prediction coefficients
(LPCs). In the first phase, the performance of each individual t-f method, the MFCCs
and the LPCs was evaluated on the different cry datasets. In this case, a
set of t-f based statistical features was extracted from the proposed t-f methods.
Classification performance was evaluated using two different supervised neural
networks, namely the Probabilistic Neural Network (PNN) and the General Regression
Neural Network (GRNN). Subsequently, the best-performing distribution among the
QTFDs was selected on the basis of classification performance. In the second phase, a
feature set combining the MFCCs, the LPCs and the statistical features extracted from
the best QTFD and the Wpspectrum was formed. Different feature selection techniques,
such as Plus-l-minus-r selection (LRS) and Information Gain (IGS), were applied to this
feature set to obtain a parsimonious subset of features. The discrimination
capability of the selected feature vector, in terms of classification accuracy, was
evaluated using the PNN and the GRNN.
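
As an illustrative aside, the decision rule of a PNN can be sketched in a few lines of
Python. This is a minimal sketch and not the implementation used in this work: the
function name `pnn_predict`, the smoothing parameter `sigma`, the assumption of equal
class priors and the two-dimensional toy feature vectors are all hypothetical choices
made for illustration. Each class score is the average of Gaussian kernels centred on
that class's training vectors, and the class with the largest score is returned.

```python
import math


def pnn_predict(train_x, train_y, x, sigma=0.5):
    """Classify x with a basic PNN (Gaussian Parzen windows, equal priors).

    train_x: list of feature vectors (lists of floats), hypothetical toy data here
    train_y: list of class labels, one per training vector
    sigma:   kernel smoothing parameter (an assumed value, normally tuned)
    """
    kernel_sums = {}
    counts = {}
    for xi, yi in zip(train_x, train_y):
        # Pattern layer: Gaussian kernel on the squared Euclidean distance.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        kernel = math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation layer: accumulate kernel responses per class.
        kernel_sums[yi] = kernel_sums.get(yi, 0.0) + kernel
        counts[yi] = counts.get(yi, 0) + 1
    # Average per class so that classes of different size are comparable.
    scores = {label: kernel_sums[label] / counts[label] for label in kernel_sums}
    # Decision layer: pick the class with the highest estimated density.
    return max(scores, key=scores.get)


# Toy feature vectors standing in for extracted cry features (illustrative only).
train_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
train_y = ["normal", "normal", "pathological", "pathological"]

print(pnn_predict(train_x, train_y, [0.1, 0.1]))
print(pnn_predict(train_x, train_y, [1.0, 1.0]))
```

In such a sketch, `sigma` controls how far the influence of each training vector
extends: small values make the classifier behave like a nearest-neighbour rule, while
large values smooth the class densities.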