extraction, in terms of spectral, wavelet, entropy
features in specific channels, or network analysis and
connectivity features among channels, combined with
machine learning for classification. Newer studies
incorporate deep learning approaches.
In research on the early stages of AD, some
researchers used Deep Neural Networks (DNN) for
classification with Relative Power (RP) features,
letting the network recombine the features through
its own learning, which improved diagnostic results
compared to another NN that used RP features as
domain knowledge (Kim & Kim, 2018). In newer studies,
though, Multiple Signal Classification and Empirical
Wavelet Transform (MUSIC-EWT) was used to
reconstruct signals into the proper EEG frequency
bands, analyze them with non-linear indices to
discriminate AD from MCI patients, evaluate features
with ANOVA for feature selection, and classify with
an Enhanced Probabilistic Neural Network (EPNN)
(Amezquita-Sanchez et al., 2019). Usually,
preprocessing filters
are applied to EEG signals, while Independent
Component Analysis (ICA) or Blind Source
Separation (BSS) are considered for signal
improvement, Fast Fourier Transform (FFT) or
Wavelet Transform (WT) for feature extraction and
Linear Discriminant Analysis (LDA) or Support
Vector Machine (SVM) for the classification. In
addition to FFT for feature extraction, Continuous
Wavelet Transform (CWT) can also be applied, and
for data classification, K-Nearest Neighbor (KNN)
has been used successfully (Durongbhan et al., 2019).
The current study aims to use the information
hidden in all EEG channels without selecting the most
informative ones. It explores whether open-eyes or
closed-eyes recordings are more informative. Also,
to identify the most informative frequency bands,
high-pass and low-pass filtered versions of the signal
are used. The study evaluates a classification method
based on Kernel PCA and a Random Forest classifier
for distinguishing Healthy, MCI and AD subjects from
the preprocessed EEG data under the above-mentioned
schemes. Classification proceeds in two steps: EEG
segments are classified first, and patients are then
classified via majority voting over their segments.
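The second step, patient-level majority voting, can be illustrated with a minimal sketch. The function name, the patient identifiers and the segment predictions are hypothetical, and the tie-breaking rule (first label seen wins) is an assumption, since the text does not specify one.

```python
from collections import Counter, defaultdict


def vote_per_patient(patient_ids, segment_labels):
    """Give each patient the label predicted most often among their segments."""
    by_patient = defaultdict(list)
    for pid, label in zip(patient_ids, segment_labels):
        by_patient[pid].append(label)
    # most_common(1) returns the most frequent label; on a tie it returns the
    # label encountered first, which is an arbitrary choice in this sketch.
    return {pid: Counter(labels).most_common(1)[0][0]
            for pid, labels in by_patient.items()}


# Hypothetical segment-level predictions for three patients.
patient_ids = ["p1", "p1", "p1", "p2", "p2", "p3", "p3", "p3"]
segment_labels = ["AD", "MCI", "AD", "Healthy", "Healthy", "MCI", "MCI", "AD"]
patient_labels = vote_per_patient(patient_ids, segment_labels)
```

Here each patient receives the label that wins most of their segments, so a few misclassified segments do not change the patient-level decision.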
2 METHODS
As a starting point, the EEG data stored in European
Data Format (EDF), which included both open-eyes and
closed-eyes parts, were serialized via Python object
serialization (pickle) so that the open-eyes and
closed-eyes segments could be handled separately and
more efficiently. During preprocessing, the data were
segmented into multiple parts for every patient and
for every condition (open eyes, closed eyes). After
this process, major artifacts were rejected via
standard deviation thresholding, and two types of
filters were used (low-pass and high-pass, retaining
the delta-theta and alpha-beta bands, respectively).
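As a rough illustration of segmentation followed by standard deviation thresholding, the sketch below uses NumPy on synthetic data. The segment length (1 s), the threshold factor `k` and the rule itself (reject a segment if any channel's standard deviation exceeds `k` times that channel's median) are assumptions made for the example; the text does not give the actual values or criterion.

```python
import numpy as np


def segment(signal, seg_len):
    """Split a (channels, samples) array into non-overlapping segments."""
    n_seg = signal.shape[1] // seg_len
    trimmed = signal[:, :n_seg * seg_len]          # drop the incomplete tail
    # Result shape: (n_seg, channels, seg_len)
    return trimmed.reshape(signal.shape[0], n_seg, seg_len).transpose(1, 0, 2)


def reject_artifacts(segments, k=3.0):
    """Keep segments whose per-channel std is at most k times the channel median."""
    stds = segments.std(axis=2)                    # (n_seg, channels)
    limit = k * np.median(stds, axis=0)            # one limit per channel
    keep = (stds <= limit).all(axis=1)
    return segments[keep], keep


fs = 500                                           # sampling frequency in Hz
rng = np.random.default_rng(0)
eeg = rng.normal(0.0, 1.0, size=(19, 10 * fs))     # synthetic 10 s, 19 channels
eeg[:, 2 * fs:3 * fs] *= 20                        # inject a large artifact
segs = segment(eeg, seg_len=fs)                    # hypothetical 1 s segments
clean, keep = reject_artifacts(segs)
```

Using the median rather than the mean as the reference keeps the threshold itself from being inflated by the artifacts it is meant to catch.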
ML algorithms were used to study the accuracy of
different classifiers when classifying patients as
MCI, AD, or Healthy under different schemes, e.g.,
eyes closed and low-pass filtered. The algorithms
were based on a Random Forest (RF) classifier that
classifies patient segments as a first step, followed
by a majority voting scheme as a second step.
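A segment-level classifier of this kind could be sketched with scikit-learn as follows. This is not the study's implementation: the kernel, the number of components, the number of trees and the synthetic feature matrix are all assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
classes = ["Healthy", "MCI", "AD"]
n_per_class = 40
n_features = 19 * 50      # hypothetical: 19 channels x 50 samples, flattened

# Synthetic stand-in for EEG segments: each class has a shifted mean.
X = np.vstack([rng.normal(loc=i, scale=1.0, size=(n_per_class, n_features))
               for i in range(len(classes))])
y = np.repeat(classes, n_per_class)

# Kernel PCA reduces each segment to a few non-linear components,
# which the Random Forest then classifies.
model = make_pipeline(
    KernelPCA(n_components=10, kernel="rbf"),       # assumed settings
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(X, y)
segment_pred = model.predict(X)
```

In a real evaluation the training and test segments would need to come from different patients; otherwise segments of the same recording appear on both sides of the split and the accuracy is overestimated.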
These methodological steps are described in more
detail in the following subsections.
2.1 EEG Data
In this paper, EEG data were collected through a set
of 21 electrodes placed according to the international
10-20 system (Fp1, Fp2, F7, F3, Fz, F4, F8, T3,
C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1 and O2) at
500 Hz.
For EEG signal collection, a Nihon Kohden
Neurofax J921A system was used. Input impedance
was set to Z < 10 kΩ, and the signals were digitized
with the Neurofax EEG-12200 Ver. 01-93 at a
sampling frequency of 500 Hz. The data acquisition
protocol covers a resting state that lasts for 10
minutes: the patient’s eyes are closed for 5 minutes
and open for the other 5, while the patient is seated
in an upright position.
For the experiment, we used data from 27 AD, 22
Healthy and 24 MCI subjects. The data were provided
by the Greek Association of Alzheimer’s Disease and
Related Disorders with ethical approval for use and,
in line with patient data privacy legislation, were
anonymized.
The collected EEG data are saved as raw EDF
files. Every EDF file consists of 19 EEG signal
channels. Each file contains annotations about signal
phases such as open eyes, calibration, closed eyes,
and A1+A2 electrode ON. These annotations were used
to separate the segments into open-eyes and
closed-eyes parts and to remove irrelevant ones. The
data were then processed and stored in pickle format
for storage capacity reasons.
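The annotation-based split and the pickle storage can be sketched as below. The annotation tuples, their onset times and the function name are hypothetical, and the sketch assumes that each annotation marks the start of a phase lasting until the next annotation; reading the EDF file itself is omitted.

```python
import pickle

FS = 500  # sampling frequency in Hz, as stated above


def phase_spans(annotations, total_seconds, keep=("open eyes", "closed eyes")):
    """Map each kept phase to a list of (start_sample, end_sample) spans.

    Assumption: an annotation marks the start of a phase that lasts until
    the next annotation (or the end of the recording).
    """
    ordered = sorted(annotations)
    spans = {label: [] for label in keep}
    for i, (onset, label) in enumerate(ordered):
        end = ordered[i + 1][0] if i + 1 < len(ordered) else total_seconds
        if label in spans:                 # irrelevant phases are dropped
            spans[label].append((int(onset * FS), int(end * FS)))
    return spans


# Hypothetical annotations as (onset in seconds, description).
annotations = [
    (0.0, "calibration"),
    (10.0, "closed eyes"),
    (310.0, "A1+A2 electrode ON"),
    (320.0, "open eyes"),
]
spans = phase_spans(annotations, total_seconds=620.0)

# Serialize with pickle and read back, as a stand-in for writing to disk.
blob = pickle.dumps(spans)
restored = pickle.loads(blob)
```

The resulting sample spans can then be used to slice the channel data into open-eyes and closed-eyes arrays before pickling them per patient.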
Next, a preprocessing pipeline is applied,
including segmentation, filtering and transformation.
Exploring Classification in Open and Closed Eyes EEG Data for People with Cognitive Disorders