Introduction

The purpose of this example is to demonstrate how to design a classifier system in ANNHUB to classify human brain waveforms. This involves designing a feature extraction system to obtain features from human brain waveforms, also known as Electroencephalography (EEG) signals. Those features are then used as inputs to the Neural Network, designed in ANNHUB, to classify EEG mental tasks.



The EEG data-set was recorded by Zak Keirn at Purdue University for his Master of Science thesis. The data-set is in binary Matlab format. EEG data were recorded from 7 subjects using 7 channels, located at C3, C4, P3, P4, O1, O2 and EOG as defined by the 10-20 system. Subjects were required to perform 5 mental tasks in 10 trials: a baseline, mental multiplication, mental figure rotation, mental letter composing, and visual counting. In each trial, EEG signals were recorded at 250 Hz for 10 seconds (2500 samples). Figure 4.28 shows the EEG signals for the 5 mental tasks in 1 trial (10 s).

   

Figure 4.28: The 10-second EEG signals of the five mental tasks

                                       

The aim of this example is to design a classifier system that can detect whether a subject is performing the baseline task or the multiplication task based on his/her EEG signals. This mental task classifier consists of a feature extraction stage and a Neural Network classifier. The block diagram of the system is shown below.






Figure 4.29: Block diagram of the mental task classifier

Data preparation and feature extraction


The first step in designing a Neural Network classifier is to determine its training data-set. This involves identifying the inputs and targets for the Neural Network. In this example, since we only need to classify two mental tasks, the number of output nodes of the Neural Network classifier will be two, and these outputs can be encoded as [1 0] for the baseline task and [0 1] for the multiplication task (the encoding used in the feature extraction code below). As a result, determining the targets of the Neural Network is simple and straightforward.


Determining the inputs of the Neural Network, however, is a more difficult task. Raw EEG signals are non-stationary time-domain signals, so how do we decide what the inputs are and how many samples each input should contain? We know that a subject performs a mental task in 10 trials and that each trial was recorded at 250 Hz for 10 seconds. Therefore, the duration for 1 channel would be 100 seconds. If we decided to use raw EEG signals as inputs, we would need to decide how many seconds of raw EEG data form one sample. For example, if we used 1 second of data as a sample, we would have 100 samples per task, and the length of a sample for 1 channel would be 250. If we also used all 6 EEG channels, the length of a sample would be 250 x 6 = 1500. Since we have two mental tasks, the total number of samples would be 200. As a result, the Neural Network would have 1500 input nodes and 2 output nodes, and we would have 200 samples in the data-set (70% for the training set and 30% for the test set). Determining which inputs and targets to use to form a training set is important, as it heavily affects the performance of the Neural Network classifier. The process of determining the inputs and outputs used to form a training data-set is also called feature extraction.
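
As a rough sketch (illustrative only, not part of the feature extraction program; the variable names are chosen purely for readability), the raw-signal approach would size the data-set as follows:

% Sketch: data-set size if raw 1-second EEG segments were used directly as inputs
fs             = 250;                 % sampling rate (Hz)
secondsPerTask = 10 * 10;             % 10 trials x 10 seconds = 100 seconds per task
samplesPerTask = secondsPerTask / 1;  % 100 one-second samples per task
inputLength    = fs * 1 * 6;          % 250 samples x 6 EEG channels = 1500 input nodes
totalSamples   = 2 * samplesPerTask;  % 200 samples for the two tasks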



Using raw EEG signals directly as inputs could lead to poor classification results, so this example instead explores the use of an autoregressive (AR) model to extract features from the raw EEG signals.

Figure 4.20: Feature extraction method using the autoregressive (AR) technique to form the data-set for the two-mental-task classification system.



The figure above shows the details of the feature extraction procedure that uses AR models to form the training set and test set for a Neural Network classifier. First, the 100 seconds of raw EEG data for both tasks (baseline and multiplication) are segmented into 0.5-second segments. AR models are then fitted to these raw EEG segments, and the resulting AR coefficients are used as the data-set inputs. By using AR models to extract features from the raw EEG data, the number of input nodes of the Neural Network classifier is reduced from 1500 to 36. A quick check on these dimensions is sketched below, followed by the Matlab code for the feature extraction procedure.
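
This dimensional check is illustrative only (it is not part of the feature extraction program that follows); the variable names are chosen for readability:

% Sketch: data-set dimensions when 0.5-second segments and 6th-order AR features are used
fs           = 250;                          % sampling rate (Hz)
windowLen    = 0.5 * fs;                     % 125 samples per segment
segPerTask   = (10 * 10 * fs) / windowLen;   % 100 s per task / 0.5 s = 200 segments per task
totalSamples = 2 * segPerTask;               % 400 samples for baseline + multiplication
inputNodes   = 6 * 6;                        % 6 AR coefficients x 6 EEG channels = 36 input nodes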



1. Main Matlab program for feature extraction

%% 1. Get EEG mental task data

clear all;

load eegdata;

Fs =125;            % Window size of 0.5 second (125 samples at the 250 Hz sampling rate)

baseline_task =[];

multiplication_task=[];

letter_composing_task=[];

rotation_task =[];

counting_task =[];


%% 2. Extract EEG data from the subject 1

for i=1:length(data)

  subjectID = data{i}{1};   % data{i}{1} = subject name, data{i}{2} = task name, data{i}{4} = 7 x 2500 EEG matrix

 if strcmp(subjectID,'subject 1')   % compare strings with strcmp ('==' requires equal-length char arrays)

     taskname = data{i}{2};

     switch taskname

         case 'baseline',

             baseline_task=[baseline_task,data{i}{4}];

         case 'multiplication',

             multiplication_task=[multiplication_task,data{i}{4}];

          case 'letter-composing',

             letter_composing_task=[letter_composing_task,data{i}{4}];  

          case 'rotation',

             rotation_task=[rotation_task,data{i}{4}];    

          case 'counting',

             counting_task=[counting_task,data{i}{4}];

         otherwise,

     end 

 end 

end

 

%% 3. Calculate Autoregressive model coefficients using burg method with order of 6

[N,M] = size(multiplication_task);

L = round(M/Fs);

for i=1:N

     temp= reshape(baseline_task(i,:),Fs,L)';   % L rows of Fs samples: one 0.5-second segment per row

     baselineTask{i}= ARCoefs(temp,Fs);

     

     temp= reshape(multiplication_task(i,:),Fs,L)' ;

     multiplicationTask{i}= ARCoefs(temp,Fs);

     

     temp= reshape(letter_composing_task(i,:),Fs,L)' ;

     lettercomposingTask{i}= ARCoefs(temp,Fs);

     

     temp= reshape(rotation_task(i,:),Fs,L)' ;

     rotationTask{i}= ARCoefs(temp,Fs);

     

     temp= reshape(counting_task(i,:),Fs,L)' ;

     countingTask{i}= ARCoefs(temp,Fs);

end

 

%% 4. Combine coefficients from 6 EEG channels into a single source

baselineBands           = CombineChannels(baselineTask);

multiplicationBands     = CombineChannels(multiplicationTask);

lettercomposingBands    = CombineChannels(lettercomposingTask);

rotationBands           = CombineChannels(rotationTask);

countingBands           = CombineChannels(countingTask);

 

%% 5. Construct the mental task data-set for baseline and multiplication tasks

% Map targets to mental tasks

multiplicationTargets          = [0 1];    % target code for the multiplication task

baselineTargets                = [1 0];    % target code for the baseline task

% Adding targets

multiplicationData = AddTargets(multiplicationBands,multiplicationTargets);

baselineData = AddTargets(baselineBands,baselineTargets);

% Forming data-set

MentalTaskData =[multiplicationData;baselineData];


%% 6. Separate data-set into training set and testing set, and save them into files

[~,inputLength] =size(baselineBands);

newDataset = shuffleRow(MentalTaskData);

[N, ~] = size(newDataset);

M = floor(0.7*N);             % 70% of the samples for the training set

trainingset = newDataset(1:M,:);

testset = newDataset(M+1:end,:);

 

input = trainingset(:,1:inputLength);

target = trainingset(:,inputLength+1:end);

edata =[input target];

[N,ip] = size(input);

[N,op] = size(target);

textHeader= getTextHeader(ip,op);

%write header to file

fid = fopen('MentalTaskTraining.csv','w');

fprintf(fid,'%s\n',textHeader);

fclose(fid);

dlmwrite('MentalTaskTraining.csv',edata,'-append');

 

input = testset(:,1:inputLength);

target = testset(:,inputLength+1:end);

 

edata =[input target];

[N,ip] = size(input);

[N,op] = size(target);

textHeader= getTextHeader(ip,op);

%write header to file

fid = fopen('MentalTaskTesting.csv','w');

fprintf(fid,'%s\n',textHeader);

fclose(fid);

dlmwrite('MentalTaskTesting.csv',edata,'-append');



2. Utility functions used in main script

function Coefs = ARCoefs(taskData,Fs)

    % Estimate AR coefficients for each row (segment) of taskData; one output row per segment
    coefResult =[];

    [N,M] = size(taskData);

    for i =1:N

        temp = AREstimation(taskData(i,:),Fs);

        coefResult =[coefResult;temp];

    end

    Coefs = coefResult;

end



function ARcoefs = AREstimation(x,Fs)

    % Fit a 6th-order AR model to one EEG segment using the Burg method
    % (requires the System Identification Toolbox). Fs is not used in the estimation.

    y = double(x');

    y = iddata(y);

    mb = ar(y,6,'burg');

    temp = mb.A;              % AR polynomial [1 a1 a2 ... a6]

    ARcoefs = temp(2:end);    % drop the leading 1, keep the 6 AR coefficients

end


function ARData = CombineChannels(mentalTask)

    Result =[];

    [~,N] = size(mentalTask);

    for i =1:N-1     % use only the first N-1 channels; the last channel (EOG) is excluded

        [M,~] =size(mentalTask{i});

        temp =[];

        for j=1:M

            temp = [temp;mentalTask{i}(j,:)];

        end 

         Result = [Result,temp];

    end 

    ARData = Result;

end 


function taskDataset = AddTargets(freqBands,Targets)

    [N,~] =size(freqBands);

    for i = 1:N

        taskDataset(i,:) = [freqBands(i,:),Targets];

    end 

end


function ret = shuffleRow(mat)

[r, ~] = size(mat);              % number of rows

shuffledRow = randperm(r);       % random permutation of row indices

ret = mat(shuffledRow, :);

end


function tHeader = getTextHeader(inputs,outputs)

textHeader ='';

for i=1:inputs

    AddStr1 = ['Input ', num2str(i)];    % square brackets keep the trailing space (strcat would drop it)

    AddStr2 =strcat(AddStr1,',');

    textHeader =strcat(textHeader, AddStr2);

end 

   

for j=1:outputs-1

    AddStr1 = ['Output ', num2str(j)];

    AddStr2 =strcat(AddStr1,',');

    textHeader =strcat(textHeader, AddStr2);

end 

    AddStr1 = ['Output ', num2str(outputs)];

    textHeader =strcat(textHeader, AddStr1);

    tHeader=textHeader;

end

Load data


The data-set obtained from the feature extraction procedure is saved in CSV files: "MentalTaskTraining.csv" for the training set and "MentalTaskTesting.csv" for the test set. Both files are located in the EEGARMentalTasks example folder.


The training set is loaded into ANNHUB to construct the Neural Network classifier structure.

   

Figure 4.21: Loading two mental task training set into ANNHUB
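
As an optional sanity check, the exported CSV can also be inspected in Matlab before loading it into ANNHUB. The snippet below simply confirms the 38-column layout (36 inputs + 2 targets) written by the feature extraction script; dlmread is a standard Matlab function.

% Optional check: the training CSV should contain 38 columns (36 inputs + 2 targets)
M = dlmread('MentalTaskTraining.csv', ',', 1, 0);   % skip the header row
fprintf('Training set: %d samples, %d columns\n', size(M,1), size(M,2));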


Configure Neural Network classifier


Since the training data contains a small number of samples (70% of 400 = 280), the Bayesian Regularization training algorithm is the best option, as it does not require a validation set to prevent over-fitting. Tansig and Logsig are used as the activation functions for the hidden layer and the output layer, respectively. The hidden layer contains 19 nodes, and the cost function is Mean Squared Error. Both pre-processing and post-processing methods are used. The training data ratio is 75%.




Figure 4.22: Configure Neural Network structure for two mental task classification
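
ANNHUB applies these settings through its configuration screen, so no code is required. For readers who prefer scripting, a roughly equivalent configuration using Matlab's Deep Learning Toolbox is sketched below; patternnet and trainbr are standard toolbox functions (not part of ANNHUB), and the ratio values are assumptions based on the settings described above.

% Sketch: an approximately equivalent network configuration in Matlab (not ANNHUB itself)
net = patternnet(19, 'trainbr');          % 19 hidden nodes, Bayesian Regularization training
net.layers{1}.transferFcn = 'tansig';     % hidden-layer activation
net.layers{2}.transferFcn = 'logsig';     % output-layer activation
net.performFcn = 'mse';                   % Mean Squared Error cost function
net.divideParam.trainRatio = 0.75;        % 75% of the loaded data used for training
net.divideParam.valRatio   = 0;           % no validation set is needed with trainbr
net.divideParam.testRatio  = 0.25;        % remaining data held out (assumed split)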

Train Neural Network classifier

The stopping criteria settings are shown in Figure 4.23. The best performance is achieved at the 17th epoch of the first training attempt.



Figure 4.23: Train Neural Network classifier
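
Continuing the Matlab sketch above, training on the exported CSV would look roughly like this. The column indices assume the 36-input, 2-target layout produced by the feature extraction script, and the epoch limit is an illustrative value rather than the exact ANNHUB setting.

% Sketch: train the network on the training set written by the feature extraction script
trainData = dlmread('MentalTaskTraining.csv', ',', 1, 0);   % skip the header row
inputs    = trainData(:, 1:36)';       % 36 AR features per sample
targets   = trainData(:, 37:38)';      % [1 0] = baseline, [0 1] = multiplication
net.trainParam.epochs = 100;           % upper bound on training epochs (assumed value)
[net, tr] = train(net, inputs, targets);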


Evaluate the trained Neural Network classifier

After the training procedure, the confusion matrix and ROC curve are used to evaluate the trained Neural Network. As shown in Figure 4.24, the trained classifier is able to correctly classify the two mental tasks with an accuracy of over 90%.


Figure 4.24: Evaluate the trained Neural Network classifier using confusion matrix, accuracy, specificity and sensitivity information.

Figure 4.25: Evaluate the trained Neural Network classifier using ROC curves.
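
The confusion matrix and ROC curves are produced inside ANNHUB. An equivalent check in the Matlab sketch could use the standard toolbox functions plotconfusion, plotroc and confusion, for example:

% Sketch: evaluate the trained network on the training inputs
outputs = net(inputs);                    % network predictions in [0, 1]
plotconfusion(targets, outputs);          % confusion matrix
plotroc(targets, outputs);                % ROC curves for both classes
[c, ~] = confusion(targets, outputs);     % c is the fraction of misclassified samples
fprintf('Accuracy: %.1f%%\n', 100*(1 - c));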


Test the trained Neural Network classifier with new data


The new data in "MentalTaskTesting.csv", which contains 120 samples, is used to test the Neural Network classifier. Figure 4.26 shows that the stability of the Neural Network classifier is maintained, with a success rate of 85% on the new data-set.


Figure 4.26: Test the trained Neural Network with new data-set
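
In the Matlab sketch, the same check against the held-out file would look like this, again assuming the 36-input, 2-target column layout written by the feature extraction script:

% Sketch: test the trained network on the unseen test set
testData    = dlmread('MentalTaskTesting.csv', ',', 1, 0);
testInputs  = testData(:, 1:36)';
testTargets = testData(:, 37:38)';
[c, ~] = confusion(testTargets, net(testInputs));
fprintf('Test accuracy: %.1f%%\n', 100*(1 - c));    % about 85% in this example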

Conclusion

This section has presented a real classification application in the biomedical engineering field: human brain mental task classification. First, a feature extraction procedure is applied to extract features from the raw biomedical data. These features are then used as inputs to train a Neural Network classifier. Although the number of samples in the training data-set is small, a Bayesian Neural Network trained with the Bayesian Regularization algorithm is able to classify the two mental tasks with an accuracy of 85% on new data. The stability of the classifier is expected to improve as the number of training samples increases.



The principles presented in this example can be applied to other classification applications. Depending on the nature of the classification problem, an appropriate feature extraction method should be chosen to achieve the best result.