User Identification and Verification from a Pair of Simultaneous EEG Channels Using Transform Based Features

This study discusses combining features from two simultaneous electroencephalogram (EEG) channels, recorded while a user performs a certain mental task, to increase the degree of discrimination among subject classes; the viability of using feature sets extracted from a single channel was investigated in previously published articles. The feature sets considered in those studies are used to build a combined feature set extracted from two channels. The first feature set is the energy density of the power spectra of the Discrete Fourier Transform (DFT) or the Discrete Cosine Transform (DCT); the second is the set of statistical moments of the Discrete Wavelet Transform (DWT). The Euclidean distance metric is used for feature set matching. The combinations of features from two EEG channels showed high accuracy for the identification system.

and k-nearest neighbor classifier to recognize the individuals. Two kinds of tasks were compared: motor movement and motor imagery. Their study indicated that imagery tasks perform better than motor movement tasks. The system was tested on 18 subjects from the Motor Movement/Imagery dataset; the best recognition rate, achieved with the SVM classifier, was 97.4%.
Daria La Rocca et al. [7] proposed an approach based on the fusion of spectral coherence-based connectivity; they fused features from two channels for a single task. They used Power Spectral Density (PSD) and spectral coherence connectivity as features, and a Mahalanobis distance-based classifier to classify 108 subjects from the Motor Movement/Imagery dataset; the best achieved accuracy was 100%.
Kumari and Vaish [8] discussed the fusion, using canonical correlation analysis, of features extracted from two mental tasks and 6 channels. They proposed Empirical Mode Decomposition (EMD), Information Theoretic Measure (ITM), and statistical measurements to extract features. They classified 7 subjects of the CSU dataset using a Learning Vector Quantization Neural Network (LVQNN) and its extension (LVQ2), achieving an accuracy of 96.05%. In this study, by contrast, one type of feature from two channels and a single task is fused to generate the feature vector.
Kumari and Vaish [9] focused on comparing the motor movement task with the imagery task. They proposed different Daubechies wavelet transform methods and different energy methods as features, and then used an Artificial Neural Network (ANN) to classify 5 subjects from the Motor Movement/Imagery dataset, achieving a True Accepted Rate (TAR) of 95%. In this paper, the Daubechies (db4) wavelet transform is considered, with some statistical moments as features from two channels belonging to a single task; the statistical moments are applied to each sub-band, and a statistical distance measure is adopted for the matching stage.
Yang et al. [10] discussed the sensitivity of EEG-based recognition systems to the type of mental task. They proposed Daubechies (db4) packet decomposition and calculated the standard deviation of each sub-band as features. Features from different tasks and 9 electrodes were fused to generate the final feature vector, which was then fed to a Linear Discriminant Analysis (LDA) classifier to classify 108 subjects from the MMI dataset. The best achieved CRR was 99% in identification mode, whereas the best verification result was an Equal Error Rate (EER) of 4.5.
However, to make an EEG-based user identification and verification system applicable, fast, and accurate, the system must go through few and uncomplicated stages. The acquisition process should also be easy and simple so as not to disturb the user. Therefore, the fewest possible electrodes (or channels) should be attached to the user's scalp, and the user should be asked to perform a minimum number of mental tasks. This study addresses these problems by proposing several simple and fast methods that use only two EEG channels while the user performs one mental task, in order to reduce system complexity while maintaining high accuracy.
Previous work [11], [12], [13] discussed extracting features from a single EEG channel while the user performs a certain mental task, to keep the complexity of the recognition system as low as possible; competitive results were achieved. This work discusses extracting features from two simultaneous EEG channels while the user performs one task, to increase the discrimination between classes and enhance the performance of the recognition system, using the same feature types proposed in those previous works. This paper is organized as follows: Section II describes the datasets used and the proposed methods, Section III discusses the experimental results, Section IV compares the findings with related works, and Section V presents conclusions.

II. Methodology
In this study, the EEG-based identification and verification system combines features from two simultaneous EEG channels through the following main stages. (i) The feature extraction stage comprises three steps: the first transforms the input EEG signal to either the frequency domain or the scale-shift domain; the second extracts the main features from the transformed signal; the third, feature analysis and combination, selects and combines the most relevant and discriminative features from two EEG channels belonging to the same task to prepare the final feature vector. (ii) The matching stage takes this feature vector as input and makes the final decision.
The main problem facing the automatic EEG identification and verification system is the suitable selection of discriminative features from the EEG signal. Extraction of EEG features is conducted in different domains such as the time domain or the frequency domain. The most used feature extraction methods for EEG biometric systems are AR modeling, Power Spectral Density (PSD), the energy of EEG channels and wavelet packet decomposition (WPD) [3].

A. Datasets
Two free public datasets, which were used and described in [11] and [13], are also tested in this study. The first is the Colorado State University (CSU) dataset, a small dataset consisting of recordings of 7 healthy volunteers collected by Keirn and Aunon [14]. The second is the Motor Movement/Imagery (MMI) dataset, a relatively large dataset consisting of EEG recordings of 109 healthy volunteers, described in [15]. The number of samples in each class of the CSU dataset is shown in Table I (note: class 4 has 9 samples for the letter-composing task because of an error in the dataset mentioned in [8], [14]), whereas in the MMI dataset each class consists of 3 samples.

B. Proposed System
The methods proposed in [11], [12], and [13], which worked under the first approach (a single channel and a single task), are tested in this study under the second approach (combining features from two channels belonging to the same task).

1) Proposed Features
Two separate sets of features were used for the identification and verification system in previous studies; these are the energy distribution features and statistical moments features:

a) Energy Distribution Features
This set of features uses one of two transforms: the DFT, defined by (1), or the DCT, defined by (2). Either one transforms the input signal to the frequency domain, and its output is used to calculate the energy distribution. Equation (3) is used to calculate the energy distribution of the sliced power spectra (i.e., the AC components) [3], [16], [17].
(1)
(2)
where C(u) and F(u) are the u-th AC coefficients of the DCT and DFT, respectively, and s() is the input EEG signal.
(3)
where T(i) represents the F(u) or C(u) coefficient array; en(j) is the average energy of the j-th band; L is the number of coefficients belonging to each band; and j = 0, …, (N-1)/L, where (N-1)/L is the total number of bands. The array en() is considered the feature vector.
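As an illustration of the band-energy idea described above, the following is a minimal Python sketch, not the authors' implementation (which was written in C#). It assumes a naive DFT with 1/N normalization and assumes that en(j) is the mean of the squared coefficient magnitudes within each band; the function names dft_magnitudes and band_energies are hypothetical.

```python
import cmath


def dft_magnitudes(signal):
    """Naive O(N^2) DFT; returns |F(u)| for u = 0..N-1.

    The 1/N normalization is an assumption, not taken from the paper.
    """
    n = len(signal)
    return [
        abs(sum(s * cmath.exp(-2j * cmath.pi * u * x / n)
                for x, s in enumerate(signal)) / n)
        for u in range(n)
    ]


def band_energies(coeffs, band_len):
    """Average energy of consecutive bands of `band_len` AC coefficients.

    Assumes en(j) is the mean of the squared coefficients in band j.
    """
    ac = coeffs[1:]  # drop the DC term; only AC components are sliced
    n_bands = len(ac) // band_len  # incomplete trailing band is ignored
    return [
        sum(c * c for c in ac[j * band_len:(j + 1) * band_len]) / band_len
        for j in range(n_bands)
    ]


# Usage: a pure tone at frequency index 2 of an 8-sample window.
mags = dft_magnitudes([1, 0, -1, 0, 1, 0, -1, 0])
en = band_energies(mags, 2)
```

The same band_energies function could be applied to DCT coefficients, since it only needs an array of transform coefficients.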
The 1st statistical moments set is described by (14), whereas the 2nd set is described by (16):
(14)
where S(i) is the i-th coefficient of the sub-band, k is the sub-band length, and the mean of the sub-band is determined by (15). Here ΔS(i) = S(i) - S(i+1) for i = 0, …, p-2, the average of ΔS(i) is computed as described by (15), and the power n takes the values 0.5, 0.75, 1, 2, and 3.
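Because the moment equations did not survive in this text, the following Python sketch only illustrates one plausible reading of them. It assumes that each moment is the average of the absolute deviations from the mean raised to the power n, and that the 2nd set applies the same computation to the differences ΔS(i). Both assumptions, and the names moments, first_set, and second_set, are hypothetical.

```python
POWERS = (0.5, 0.75, 1, 2, 3)  # powers n listed in the text


def moments(values, powers=POWERS):
    """Mean of |v - mean|^n for each power n.

    Assumption: absolute deviations are used so that fractional
    powers stay real-valued.
    """
    k = len(values)
    mean = sum(values) / k
    return [sum(abs(v - mean) ** n for v in values) / k for n in powers]


def first_set(subband):
    """Moments computed directly on the sub-band coefficients S(i)."""
    return moments(subband)


def second_set(subband):
    """Moments computed on the differences dS(i) = S(i) - S(i+1)."""
    diffs = [subband[i] - subband[i + 1] for i in range(len(subband) - 1)]
    return moments(diffs)


# Usage on a toy sub-band of four coefficients.
m1 = first_set([1.0, 3.0, 1.0, 3.0])
m2 = second_set([1.0, 3.0, 1.0, 3.0])
```

In a full pipeline these functions would be applied to every wavelet sub-band, and the resulting values concatenated into the feature pool.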

2) Two Simultaneous EEG Channels Feature Analysis and Combination
In this stage, the features from two simultaneous EEG channels, recorded while the user performs a certain task, are combined into one feature pool. The pool size is then reduced through feature analysis and combination: the most relevant and discriminative features, those with the lowest within-class distance and the highest between-class variation, are selected to form the final feature vector that leads to the best recognition and verification accuracy [17], [21].
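The selection criterion is stated only qualitatively, so the sketch below is one possible way to realize it, not the procedure of [17] or [21]. It assumes the two channel vectors are simply concatenated, and that each feature is scored by the ratio of between-class variance to average within-class variance; the names combine, discrimination_scores, and select_top are hypothetical.

```python
from statistics import mean, pvariance


def combine(channel_a, channel_b):
    """Concatenate the feature vectors of two channels into one pool."""
    return list(channel_a) + list(channel_b)


def discrimination_scores(samples_by_class):
    """Score each feature as between-class / within-class variance.

    samples_by_class maps a class label to a list of equal-length
    feature vectors. The ratio criterion is an assumption.
    """
    groups = list(samples_by_class.values())
    n_features = len(groups[0][0])
    scores = []
    for f in range(n_features):
        class_means = [mean(v[f] for v in vecs) for vecs in groups]
        within = mean(pvariance([v[f] for v in vecs]) for vecs in groups)
        between = pvariance(class_means)
        scores.append(between / (within + 1e-12))  # guard against zero
    return scores


def select_top(scores, m):
    """Indices of the m highest-scoring features, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:m])


# Usage: two users, two samples each, two features from channel A
# and one from channel B.
pool = {
    "u1": [combine([1.0, 5.0], [0.0]), combine([1.2, 1.0], [0.1])],
    "u2": [combine([3.0, 1.0], [0.0]), combine([3.2, 5.0], [0.1])],
}
best = select_top(discrimination_scores(pool), 1)
```

Only the first pooled feature separates the two users in this toy data, so it is the one retained.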

3) Matching Stage
In this stage, the normalized Euclidean distance measure (nMSD) is used to calculate the distance between the input pattern and the stored template(s) to make the final decision: either to determine the user's identity in identification mode, or to verify the claimed identity against a similarity distance threshold in verification mode. nMSD is given by (17) [22]:
(17)
where S_i denotes the samples belonging to the i-th class, T_j is the template feature vector of the j-th class, and σ_j is the standard deviation vector of the j-th template.
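Since equation (17) is not reproduced in this text, the following Python sketch assumes a common form of the measure: the mean of squared differences between sample and template, with each difference divided by the template's standard deviation. This form, and the names nmsd, identify, and verify, are assumptions made for illustration.

```python
def nmsd(sample, template, sigma, eps=1e-12):
    """Mean squared difference, normalized per feature by sigma.

    Assumed form; eps avoids division by zero for constant features.
    """
    total = sum(((s - t) / (sd + eps)) ** 2
                for s, t, sd in zip(sample, template, sigma))
    return total / len(sample)


def identify(sample, templates):
    """Identification mode: label of the nearest stored template.

    templates maps a label to a (template_vector, sigma_vector) pair.
    """
    return min(templates, key=lambda label: nmsd(sample, *templates[label]))


def verify(sample, template, sigma, threshold):
    """Verification mode: accept if the distance is within threshold."""
    return nmsd(sample, template, sigma) <= threshold


# Usage with two enrolled users and unit standard deviations.
templates = {
    "a": ([0.0, 0.0], [1.0, 1.0]),
    "b": ([4.0, 4.0], [1.0, 1.0]),
}
who = identify([1.0, 1.0], templates)
accepted = verify([1.0, 1.0], [0.0, 0.0], [1.0, 1.0], threshold=1.5)
```

In verification mode the threshold would be tuned on the distance distributions of genuine and imposter patterns.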

III. Experimental Results
The experimental study of the second approach (combined features from two EEG channels) was conducted on both considered datasets, and the accuracy of verification and identification system with all proposed features was tested. The second adopted approach enhanced the performance of all suggested features for the identification mode, whereas, for the verification mode, the performance of some methods with the first approach (using one EEG channel) is better than with the second approach.

A. Identification and Verification Experimental Results
The Correct Recognition Rate (CRR), given by (18), is used to check the accuracy of the identification system [17]. For the CSU dataset, the system was trained with 67.66% of the total samples of each class. For the MMI dataset, each class has three samples, so two samples are used for training, each considered as a template, and one sample is used to test the system.
(18)
The Receiver Operating Characteristic (ROC) curve is the most used statistical tool for describing the behavior of a verification system. It plots the False Accepted Rate (FAR), which is given by (19) and measures the average of accepted imposter patterns, against the False Rejected Rate (FRR), which is given by (20) and measures the average of rejected genuine patterns, at various threshold settings. The intersection point between FRR and FAR is then obtained so that the Half Total Error Rate (HTER) can be calculated using (21) to evaluate the performance of the verification system [23], [24]. The accuracy of the verification system can be calculated using (22):
(22)
where P refers to the genuine patterns and N refers to the imposter patterns [24].
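The evaluation measures named above can be sketched as follows. The equations themselves are not reproduced in this text, so the sketch assumes the usual definitions: CRR as the percentage of correctly identified test samples, FAR and FRR as percentages of accepted imposter and rejected genuine patterns, HTER as their average, and accuracy as correctly decided patterns over P + N. The function names are hypothetical.

```python
def crr(predicted, actual):
    """Correct Recognition Rate in percent."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def far_frr(genuine_dists, imposter_dists, threshold):
    """FAR and FRR in percent; a pattern is accepted if dist <= threshold."""
    far = 100.0 * sum(d <= threshold for d in imposter_dists) / len(imposter_dists)
    frr = 100.0 * sum(d > threshold for d in genuine_dists) / len(genuine_dists)
    return far, frr


def hter(far, frr):
    """Half Total Error Rate: the average of FAR and FRR."""
    return (far + frr) / 2.0


def accuracy(genuine_dists, imposter_dists, threshold):
    """Percent of correct decisions over all P genuine and N imposter patterns."""
    accepted_genuine = sum(d <= threshold for d in genuine_dists)
    rejected_imposter = sum(d > threshold for d in imposter_dists)
    total = len(genuine_dists) + len(imposter_dists)
    return 100.0 * (accepted_genuine + rejected_imposter) / total


# Usage with toy distance scores.
genuine = [0.2, 0.4, 0.9, 0.3]
imposter = [0.8, 1.5, 2.0, 0.5]
far, frr = far_frr(genuine, imposter, threshold=0.6)
```

Sweeping the threshold over a range of values and recording each (FAR, FRR) pair gives the points needed to locate the intersection described in the text.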

1) Experimental Results of Energy Features of DFT Bands
This section shows the results of combining the DFT energy distribution features extracted from two EEG channels. Tables II and III show some of the identification results when the method is applied to the CSU and MMI datasets, respectively. Tables IV and V show some of the verification results for both datasets.

2) Experimental Results of Energy Features of DCT Bands
Tables VI and VII show some of the identification results achieved with the DCT band energy distribution features when the proposed system is applied to both datasets, whereas Tables VIII and IX show some of the verification results.

3) Experimental Results of Statistical Moments Features of Haar Wavelet Transform
Tables X and XI present the results of some of the tests conducted on the identification system using the Haar wavelet transform with the 2nd set of statistical moments, applied to the CSU and MMI datasets, respectively. Tables XII and XIII show some of the verification results on both datasets.

4) Experimental Results of Statistical Moments Features of Daubechies (db4) Wavelet Transform
Tables XIV and XV show some of the test results of the identification system based on the Daubechies (db4) wavelet transform, with the 2nd set of statistical moments on the CSU dataset and the 1st set of statistical moments on the MMI dataset. Tables XVI and XVII show some of the verification results on both datasets.

5) Experimental Results of Statistical Moments Features of Tap 9/7 Wavelet Transform
The best identification results of the system based on the statistical moments of the Tap 9/7 sub-bands are given in Tables XVIII and XIX, and the best verification results in Tables XX and XXI, for both datasets.

B. Execution Time
The laptop computer used in the conducted tests has an Intel® Core™ i5-2450M CPU with 4 GB of RAM; the operating system is Windows 7 (64-bit), and the development programming language is Microsoft Visual C#. Table XXII shows the average elapsed time (in milliseconds) of the proposed methods on both datasets for one signal only. Note that the recording time for each sample of the CSU dataset is 10 s at a sampling rate of 250 Hz, and the recording time for the MMI dataset is 1 minute at a sampling rate of 160 Hz.

IV. Comparison With Related Works
This section compares the two adopted approaches (using two EEG channels and using one EEG channel) and compares the results with other recent related works. Tables XXIII and XXIV compare the results of the approach adopted in this paper (the 2nd approach) with those of the approach adopted in previously proposed work (the 1st approach). The second approach combines features from two channels belonging to the same task, so it also keeps the complexity of the system low, because the user is asked to perform only one mental task in the acquisition stage.
The comparison of the findings of this work and other related works for identification and verification modes are shown in Tables XXV and XXVI.

V. Conclusions And Future Work
In this paper, an extended approach to extracting features from the user's EEG signal is adopted. The features proposed in previously conducted studies are tested in this study to check their degree of discrimination when they are combined from two simultaneous EEG channels to generate one feature pool. This approach improved the performance of the identification system, but for the verification system the performance of the first approach is better than that of the second approach for most types of features.
This approach also keeps the computational complexity low, and the user performs a single task for the acquisition of the EEG features. When the proposed methods were tested on the available datasets, the findings showed that one or two EEG channels are enough to extract discriminative features and recognize individuals.
Wider Daubechies wavelets (such as db8 and db10) and a new type of statistical moments are recommended as new features for EEG-based user identification and verification systems.