Electrocardiogram signal based sudden cardiac arrest prediction using machine learning approaches
Abstract
This thesis focuses on predicting the occurrence of imminent sudden cardiac arrest (SCA) using heart rate variability (HRV) and electrocardiogram (ECG) signals. Sudden cardiac death (SCD) is a devastating cardiovascular disease that is responsible for millions of
deaths per year. SCD occurs when SCA goes untreated for more than 10 minutes.
Hence, predicting imminent SCA before it occurs, or identifying patients at high risk
of SCD, can save millions of lives. Two international databases, namely the
MIT/BIH Sudden Cardiac Death database (20 subjects) and the MIT/BIH Normal Sinus
Rhythm database (18 subjects), were used in this work. Both databases contain two-lead
ECG recordings of patients in the supine position. In addition, HRV signals are provided in
these databases. Two segments of the HRV signals were used in this work. The first segment is
five minutes long and was taken two minutes before the onset of ventricular
fibrillation (VF). The second segment is one minute long and was
taken five minutes before the onset of VF. For normal subjects, the
segments were taken at random intervals. These segments were chosen
to achieve two- and five-minute prediction of imminent SCA, respectively. Both HRV
signal segments were pre-processed to remove and interpolate ectopic beats. Then, time
and non-linear domain features were extracted. Next, HRV signals were detrended and
frequency domain features were extracted. The feature selection method differs for each
time segment. For the features of the five-minute HRV signal, sequential forward selection
(SFS) was used to select optimal features, while in the one-minute HRV analysis, feature
selection using principal component analysis (PCA) and correlation-based feature
selection (CFS) was tried in addition to SFS. The optimal features selected by
each method were analyzed for statistical significance using the analysis of variance
(ANOVA) test. Based on the literature, four machine learning classifiers (support vector
machine (SVM), probabilistic neural network (PNN), K-nearest neighbour (KNN) and
classification tree (CTree)) were used for prediction in both analyses. For the ECG analysis, a
one-minute ECG segment, taken five minutes before the onset of VF, was extracted from the
database. It was then pre-processed to eliminate power line interference and
high-frequency noise. A novel S-Transform (ST) based noise removal method was used to
remove zero energy noise. Then, the segment from the R wave to the end of the T wave (RT
end) was extracted from each ECG trace. Two groups of features (G1 and G2) were
extracted from this novel ECG segment. G1 consists of four non-linear features (Hurst
exponent, largest Lyapunov exponent, approximate entropy and sample entropy) while
G2 consists of four higher-order statistical features (mean, variance, skewness and
kurtosis) and the proposed angle of elevation/depression (AED) feature. The proposed AED
feature is statistically significant (ANOVA) with p < 0.05. In this analysis, three
classifiers (SVM, subtractive fuzzy clustering (SFC) and neuro-fuzzy classifier (NFC))
were used for SCA prediction. Through these analyses, a maximum prediction accuracy
of 97.37% was achieved in both two- and five-minute SCA prediction using HRV
signals. In addition, 100% prediction accuracy was achieved in the one-minute ECG
analysis. The proposed AED feature produced 86.84% prediction accuracy.
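
As an illustration of the four statistical features in group G2 (mean, variance, skewness and kurtosis), the following is a minimal sketch of how they might be computed for a single signal segment. The function name statistical_features and the use of population (biased) moments are assumptions made for this example; the thesis may use a different normalisation or toolbox, and the proposed AED feature is not reproduced here.

```python
from typing import Dict, Sequence


def statistical_features(segment: Sequence[float]) -> Dict[str, float]:
    """Return mean, variance, skewness and kurtosis of one segment.

    Assumption: population (biased) central moments are used, and
    kurtosis is the plain fourth standardised moment (not excess).
    """
    n = len(segment)
    if n < 2:
        raise ValueError("segment needs at least two samples")

    mean = sum(segment) / n

    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in segment) / n
    m3 = sum((x - mean) ** 3 for x in segment) / n
    m4 = sum((x - mean) ** 4 for x in segment) / n

    if m2 == 0.0:
        raise ValueError("constant segment: skewness and kurtosis are undefined")

    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


if __name__ == "__main__":
    # Hypothetical samples standing in for one RT-end segment.
    example_segment = [0.12, 0.85, 0.40, 0.22, 0.18, 0.25, 0.31, 0.20]
    for name, value in statistical_features(example_segment).items():
        print(f"{name}: {value:.4f}")
```

In a pipeline like the one described above, each extracted segment would yield one such feature vector, which would then be passed to a classifier.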