Project 2012: Unspoken Speech Detection Using a Brain-Computer Interface

Introduction and Inspiration:

Brain-Computer Interfaces (BCIs) are devices that allow humans to interact with the world using brainwaves. A BCI comprises:

  1. An input device to detect the brainwaves (an electroencephalograph, or EEG).
  2. An analogue-to-digital converter (converts the analogue brainwaves to digital signals).
  3. A controller module that controls the operation of the EEG input device.
  4. A computer programmed with software to analyze the digital signals and provide an interface so that the user can see the result of their brain activity.

 

BCIs are exciting new developments that can be used for such things as interfacing with video games and providing stealth communications in military applications.  Most importantly, however, BCIs can help locked-in patients (such as Dr. Stephen Hawking) communicate with the outside world without any physical movement.

This project uses a non-invasive (non-implanted) BCI.

 

Background:

1924 – Dr. Berger is the first to record EEG in humans, using silver wire placed under the scalp.

1988 – Drs. Farwell and Donchin develop the P-300 Speller technique, attaining a recognition rate of 2.3 characters per minute.

2006 – Dr. Santhanam and others at Stanford University use electrode arrays implanted in monkeys to prove that data rates of up to 6.5 bits per second (or 15 characters per minute) are possible.  The team noted that performance with these implanted arrays degraded over time.

2008 – Dr. Hoffman and others report that improvements in signal processing have increased P-300 recognition rates to 4.6 characters per minute using a non-invasive EEG technique.

2009 – Dr. D’Zmura and others report that the syllables “bu” and “ku” can be individually recognized in EEG when imagined by the subject in a specific cadence/pattern.

2010 – Dr. Leuthardt and others, working with researchers in Dr. D’Zmura’s group, report that they can uniquely identify patterns of “bu” and “ku” cadences imagined by test subjects and use these to identify individual subjects with 98% accuracy.

Hypothesis:

It is hypothesized that the letters of the alphabet can be accurately classified from non-invasive EEG by a combination of signal enhancement, feature extraction and post-processing algorithms, together with a novel classification method.  Letters within words, imagined by the user as a series of musical notes at a specific cadence, will be detected more accurately and more quickly than with the current non-invasive EEG P-300 speller technique.

 

Purpose:

·        To enable unspoken communication through a non-invasive EEG device.

·        To use musical patterns and cadences associated with each letter of the alphabet in order to speed up recognition relative to the traditional P-300 Speller system.

·        A quicker speller system would give locked-in patients (non-mobile people such as Dr. Stephen Hawking) the ability to communicate more fully.

 

Procedure: 

·      Electrodes are attached to the heads of subjects at locations Fz, Cz, Pz, Oz, P3, P4 using the International 10-20 System.

·      Training Phase:  Subjects listen to a sound file containing 3 notes and one space during a 2-second period.  Each of these patterns is specific to a letter (A, F, T, E, R).  For the 2 seconds immediately afterward, the subjects “think” the same pattern while their brainwaves are recorded.  Subjects are encouraged to imagine the tones in the same cadence and pattern in which they were heard.  Note that each subject is given two dry runs in which the process occurs but no readings are taken.

·      After the first session, the subjects take a break.  During this time, an LDA classifier and a Bayes algorithm are trained on their recorded brainwaves.

·      Real-time Phase:  During a second session, the subjects are presented with a series of tones, each set representing one letter in the word “AFTER.”  The correct tones are played before each letter, the subject imagines them, and the system classifies the result in real time.  If the letter is classified correctly, the tones for the next letter are presented; if not, the subject repeats the last letter.

·        The process continues until all 5 letters in the word AFTER are correctly identified in the correct order.  The time taken is recorded.

Building the Low-Noise EEG:

The EEG was built around a Texas Instruments ADS1298 analog-to-digital chip.  The ADS1298 was obtained in the form of a demonstrator board originally designed for ECG operation, and was heavily modified so that the board was powered by and controlled from an Arduino Uno R3 controller board over an SPI data interface.  The modifications are shown in the accompanying figure.


Programming:

·       The Texas Instruments ADS1298 was programmed using an Arduino Uno R3 controller board.  Control registers and SPI timings were programmed on the Arduino in the Arduino C programming language (see the first sketch after this list).

·       Control of the neural environment was handled using OpenViBE, an open-source project for brain-computer interfaces and real-time neurosciences.  This software is written in Visual C++, and the logic for the current project was programmed in its visual environment (see Data Acquisition).  The Acquisition, LDA Training, Bayes Training and Online steps were all created in this environment.

·       The ADS1298 was connected through the Arduino to the PC by means of a standard serial-over-USB link; however, a Windows driver for the purpose-built EEG used here had to be created for the project in C++ (see the second sketch after this list).

·       As OpenViBE does not support a native Bayes classifier, one was written in C for this project (see the third sketch after this list).

·       The logic controlling the real-time synchronization between the audible sounds heard by participants and the recording of their subsequent brainwaves was handled by a Lua stimulator.  Lua, a scripting language commonly used in real-time gaming applications, is supported by OpenViBE, but the controlling scripts in the acquisition steps were written in Lua specifically for this project.
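For illustration, here is a minimal Arduino-style sketch of the kind of register programming described above; it is a sketch, not the project's actual code.  The pin assignments, SPI clock speed and CONFIG1 value are assumptions, while the SPI opcodes (SDATAC, WREG, START, RDATAC) come from the TI ADS1298 datasheet.

// Minimal Arduino-style sketch: configuring ADS1298 registers over SPI.
// Pin numbers and the CONFIG1 value are illustrative assumptions.
#include <SPI.h>

const int PIN_CS    = 10;  // chip select (assumed wiring)
const int PIN_RESET = 9;   // ADS1298 reset line (assumed wiring)

// ADS1298 SPI opcodes (TI datasheet)
const byte CMD_SDATAC = 0x11;  // stop continuous data mode before register access
const byte CMD_RDATAC = 0x10;  // resume continuous data mode
const byte CMD_START  = 0x08;  // start conversions
const byte CMD_WREG   = 0x40;  // write register: 0x40 | starting address

void sendCommand(byte cmd) {
  digitalWrite(PIN_CS, LOW);
  SPI.transfer(cmd);
  digitalWrite(PIN_CS, HIGH);
}

void writeRegister(byte address, byte value) {
  digitalWrite(PIN_CS, LOW);
  SPI.transfer(CMD_WREG | address);  // opcode byte 1: WREG + register address
  SPI.transfer(0x00);                // opcode byte 2: (number of registers - 1)
  SPI.transfer(value);
  digitalWrite(PIN_CS, HIGH);
}

void setup() {
  pinMode(PIN_CS, OUTPUT);
  pinMode(PIN_RESET, OUTPUT);
  digitalWrite(PIN_CS, HIGH);

  SPI.begin();
  // The ADS1298 clocks data on the falling SCLK edge: SPI mode 1
  SPI.beginTransaction(SPISettings(4000000, MSBFIRST, SPI_MODE1));

  digitalWrite(PIN_RESET, LOW);   // hardware reset pulse
  delay(10);
  digitalWrite(PIN_RESET, HIGH);
  delay(100);

  sendCommand(CMD_SDATAC);    // registers cannot be written while streaming
  writeRegister(0x01, 0x06);  // CONFIG1: assumed 250 samples-per-second setting
  sendCommand(CMD_START);     // begin conversions
  sendCommand(CMD_RDATAC);    // stream samples continuously
}

void loop() {
  // Sample readout (DRDY polling plus SPI transfers) omitted for brevity.
}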

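The second sketch illustrates the kind of serial-over-USB reading such a driver performs, using the Win32 API.  The COM port name, baud rate and 27-byte framing (3 status bytes plus 8 channels x 3 bytes, the ADS1298 data format) are assumptions about the link, not the project's actual driver code.

// Sketch of reading EEG frames from the Arduino's virtual COM port
// with the Win32 API. Port name, baud rate and framing are assumptions.
#include <windows.h>
#include <cstdio>

int main() {
  HANDLE port = CreateFileA("\\\\.\\COM3", GENERIC_READ | GENERIC_WRITE,
                            0, nullptr, OPEN_EXISTING, 0, nullptr);
  if (port == INVALID_HANDLE_VALUE) {
    std::fprintf(stderr, "cannot open COM3\n");
    return 1;
  }

  DCB dcb = {};
  dcb.DCBlength = sizeof(dcb);
  GetCommState(port, &dcb);
  dcb.BaudRate = CBR_115200;   // assumed link speed
  dcb.ByteSize = 8;
  dcb.Parity   = NOPARITY;
  dcb.StopBits = ONESTOPBIT;
  SetCommState(port, &dcb);

  // Assumed frame: 3 status bytes + 8 channels x 3 bytes = 27 bytes
  unsigned char frame[27];
  DWORD bytesRead = 0;
  while (ReadFile(port, frame, sizeof(frame), &bytesRead, nullptr) &&
         bytesRead == sizeof(frame)) {
    // Forward the frame to the acquisition software here.
  }
  CloseHandle(port);
  return 0;
}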

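The project's Bayes classifier was written in C; the third sketch, in C++, shows the general idea of a naive Bayes classifier over the five one-versus-all LDA outputs, with Laplace smoothing.  The binary feature encoding is an assumption for illustration, and the actual implementation may differ.

// Minimal naive Bayes sketch over the five one-vs-all LDA outputs.
// Illustrates the idea only; the project's C implementation may differ.
#include <array>
#include <cmath>
#include <cstdio>

const int NUM_LETTERS = 5;  // A, F, T, E, R

struct NaiveBayes {
  // count[letter][lda][flag]: training counts of output 'flag' (0/1) from
  // one-vs-all classifier 'lda' while the subject imagined 'letter'
  double count[NUM_LETTERS][NUM_LETTERS][2] = {};
  double classCount[NUM_LETTERS] = {};

  void train(int letter, const std::array<int, NUM_LETTERS>& ldaFlags) {
    classCount[letter] += 1.0;
    for (int i = 0; i < NUM_LETTERS; ++i)
      count[letter][i][ldaFlags[i]] += 1.0;
  }

  // Return the letter with the highest log-posterior (Laplace smoothed)
  int predict(const std::array<int, NUM_LETTERS>& ldaFlags) const {
    int best = 0;
    double bestScore = -1e300;
    double total = 0.0;
    for (int c = 0; c < NUM_LETTERS; ++c) total += classCount[c];
    for (int c = 0; c < NUM_LETTERS; ++c) {
      double score = std::log((classCount[c] + 1.0) / (total + NUM_LETTERS));
      for (int i = 0; i < NUM_LETTERS; ++i) {
        double num = count[c][i][ldaFlags[i]] + 1.0;  // Laplace smoothing
        double den = classCount[c] + 2.0;
        score += std::log(num / den);
      }
      if (score > bestScore) { bestScore = score; best = c; }
    }
    return best;
  }
};

int main() {
  NaiveBayes nb;
  // Hypothetical training pairs: while imagining letter 0 ("A"),
  // only the first one-vs-all LDA fired, and so on.
  nb.train(0, {1, 0, 0, 0, 0});
  nb.train(1, {0, 1, 0, 0, 0});
  std::printf("predicted letter index: %d\n", nb.predict({1, 0, 0, 0, 0}));
  return 0;
}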

Subject Testing:

Subject testing followed the five-step procedure detailed above (see Procedure): electrode placement at Fz, Cz, Pz, Oz, P3 and P4; the training phase of heard and imagined tone patterns; LDA and Bayes classifier training during a break; the real-time classification phase; and timed spelling of the word “AFTER.”

Data Acquisition:

  1. In the Linear Discriminant Analysis (LDA) training scenario, the recorded brainwaves, along with their associated labels, were split into epochs and then transformed with the Fast Fourier Transform (FFT).
  2. The resultant spectra were separated into six frequency ranges (8-11, 12-15, 16-19, 20-23, 24-27, and 28-31 Hz) and the average amplitude within each range was calculated (a sketch of this feature extraction follows this list).
  3. The six frequency-range averages were sent to individual LDA trainers to be compared in a one-versus-all fashion (the first LDA trainer compared letter A with the other letters; the second compared letter F with the others, and so on).
  4. In the Bayes training scenario, the same recorded brainwaves were again split into the six frequency ranges and classified by the trained LDA classifiers.
  5. The resultant target and non-target class labels were coupled with a “letter label” (the letter occurring at that instant) and written to a Comma Separated Value (.csv) file, similar to an Excel spreadsheet.
  6. In the final, online scenario, incoming brainwaves were split into the six frequency ranges as before and then classified using LDA.
  7. The target and non-target class labels were sent to the Bayes classification box, which returned a prediction for the letter being thought of.
  8. If the prediction matched the target letter, the user was prompted to complete the pattern for the next letter; if not, they were prompted to complete the last letter again.
  9. Upon reaching the final letter, the word-completion time was output.
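To make steps 1 and 2 concrete, the following C++ sketch computes the magnitude spectrum of one epoch and averages it over the six frequency ranges.  A plain DFT stands in for the FFT for brevity, and the 250-samples-per-second rate and 2-second epoch length are assumptions.

// Simplified sketch of steps 1-2: magnitude spectrum of one epoch,
// then average amplitude in six 4 Hz-wide bands. A plain DFT stands in
// for the FFT; the sampling rate and epoch length are assumptions.
#include <cmath>
#include <complex>
#include <cstddef>
#include <cstdio>
#include <vector>

const double FS = 250.0;                  // assumed EEG sampling rate (Hz)
const double PI = std::acos(-1.0);

// Magnitude spectrum of one epoch via a direct DFT (an FFT in practice)
std::vector<double> magnitudeSpectrum(const std::vector<double>& epoch) {
  const std::size_t n = epoch.size();
  std::vector<double> mag(n / 2);
  for (std::size_t k = 0; k < n / 2; ++k) {
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t t = 0; t < n; ++t) {
      double angle = -2.0 * PI * double(k) * double(t) / double(n);
      sum += epoch[t] * std::exp(std::complex<double>(0.0, angle));
    }
    mag[k] = std::abs(sum) / double(n);
  }
  return mag;
}

// Average spectral amplitude over [lowHz, highHz]
double bandAverage(const std::vector<double>& mag, double binWidth,
                   double lowHz, double highHz) {
  double sum = 0.0;
  int count = 0;
  for (std::size_t k = 0; k < mag.size(); ++k) {
    double f = double(k) * binWidth;
    if (f >= lowHz && f <= highHz) { sum += mag[k]; ++count; }
  }
  return count ? sum / count : 0.0;
}

int main() {
  std::vector<double> epoch(500, 0.0);         // 2 s at 250 SPS
  for (std::size_t t = 0; t < epoch.size(); ++t)
    epoch[t] = std::sin(2.0 * PI * 10.0 * double(t) / FS);  // toy 10 Hz tone

  const double bands[6][2] = {{8,11},{12,15},{16,19},{20,23},{24,27},{28,31}};
  std::vector<double> mag = magnitudeSpectrum(epoch);
  double binWidth = FS / double(epoch.size());
  for (const auto& b : bands)
    std::printf("%2.0f-%2.0f Hz: %f\n", b[0], b[1],
                bandAverage(mag, binWidth, b[0], b[1]));
  return 0;
}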

Signal Analysis:

A flow chart of the signal-analysis methods is shown here.  In this experiment, signals were amplified with the programmable-gain amplifiers (12X) in the ADS1298 chipset, a driven-right-leg (DRL) circuit was employed to clean up the signal, and Butterworth bandpass filters were used to remove signals outside the range of interest (3-48 Hz).  FFT, LDA and Bayes classifiers were then used to analyze the resulting signals (a filter sketch follows).
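The order and implementation of the project's Butterworth filters are not given here; as an illustration of the band-limiting step only, the C++ sketch below implements a single band-pass biquad using the well-known RBJ audio-EQ-cookbook coefficients (a stand-in for the actual Butterworth design), passing roughly 3-48 Hz at an assumed 250 samples per second.

// Minimal band-pass biquad (RBJ cookbook coefficients) passing roughly
// 3-48 Hz at an assumed 250 SPS. A stand-in for the project's actual
// Butterworth filters, which may differ in order and design.
#include <cmath>
#include <cstdio>

struct Biquad {
  double b0, b1, b2, a1, a2;              // normalized coefficients (a0 = 1)
  double x1 = 0, x2 = 0, y1 = 0, y2 = 0;  // delay elements

  static Biquad bandpass(double fs, double lowHz, double highHz) {
    const double pi = std::acos(-1.0);
    double f0 = std::sqrt(lowHz * highHz);  // geometric center: 12 Hz here
    double q  = f0 / (highHz - lowHz);      // bandwidth sets Q
    double w0 = 2.0 * pi * f0 / fs;
    double alpha = std::sin(w0) / (2.0 * q);
    double a0 = 1.0 + alpha;
    Biquad f;
    f.b0 =  alpha / a0;
    f.b1 =  0.0;
    f.b2 = -alpha / a0;
    f.a1 = -2.0 * std::cos(w0) / a0;
    f.a2 = (1.0 - alpha) / a0;
    return f;
  }

  double process(double x) {  // direct form I difference equation
    double y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
    x2 = x1; x1 = x;
    y2 = y1; y1 = y;
    return y;
  }
};

int main() {
  Biquad filter = Biquad::bandpass(250.0, 3.0, 48.0);
  // Toy input: DC offset plus a 10 Hz component; the DC should be rejected.
  const double pi = std::acos(-1.0);
  for (int t = 0; t < 500; ++t) {
    double x = 1.0 + std::sin(2.0 * pi * 10.0 * t / 250.0);
    double y = filter.process(x);
    if (t >= 495) std::printf("y[%d] = %f\n", t, y);
  }
  return 0;
}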

Results:

  1. The three-second/four-note rhythm determined the intended letter considerably more quickly and accurately than the two-second/three-note rhythm.  The three-second rhythm produced significantly distinct waveforms for each letter, and for this reason it was used in the multi-subject determination of the time to detect the word “AFTER” with the 5 test subjects.
  2. The rate of spelling a five-letter word using the non-invasive ADS1298/Arduino EEG system described here averaged 39.3 seconds, with a standard deviation of 2.0 seconds, across all test results (18 data points).  Note that two subjects were each unable to finish spelling the test word in one of their trials.  This is a significant improvement over the P-300 Speller system, which has been tested many times and is known to take 105 seconds for a five-letter word (Usakli, 2009).

Conclusions:

  1. The ADS1298/Arduino EEG system was successful in generating useful signals for the selective detection of letters.
  2. A three-second/four-note rhythm was found to be significantly superior to a two-second/three-note rhythm for signal differentiation.
  3. The novel combination of a three-second/four-note rhythm imagined by the test subject, coupled with LDA classification and sorting by the Bayes classifiers designed for this experiment, successfully enabled silent communication.
  4. The time taken for a subject to communicate a five-letter word was 39.3 seconds, with a standard deviation of 2.0 seconds, in 18 of the 20 test-subject sessions.
  5. The ADS1298/Arduino EEG system, non-invasively classifying a four-note, three-second signal, was found to be significantly faster than the existing P-300 Speller system.