Research Article: A Wireless Brain-Machine Interface for Real-Time Speech Synthesis

Date Published: December 9, 2009

Publisher: Public Library of Science

Author(s): Frank H. Guenther, Jonathan S. Brumberg, E. Joseph Wright, Alfonso Nieto-Castanon, Jason A. Tourville, Mikhail Panko, Robert Law, Steven A. Siebert, Jess L. Bartels, Dinal S. Andreasen, Princewill Ehirim, Hui Mao, Philip R. Kennedy, Eshel Ben-Jacob.

Abstract: Brain-machine interfaces (BMIs) involving electrodes implanted into the human cerebral cortex have recently been developed in an attempt to restore function to profoundly paralyzed individuals. Current BMIs for restoring communication can provide important capabilities via a typing process, but unfortunately they are only capable of slow communication rates. In the current study we use a novel approach to speech restoration in which we decode continuous auditory parameters for a real-time speech synthesizer from neuronal activity in motor cortex during attempted speech.

Partial Text: Perhaps the most debilitating aspect of profound paralysis due to accident, stroke, or disease is loss of the ability to speak. The loss of speech not only makes the communication of needs to caregivers very difficult, but it also leads to profound social isolation of the affected individual. Existing augmentative communication systems can provide the ability to type using residual movement, electromyographic (EMG) signals, or, in cases of extreme paralysis such as that considered here, brain-machine interfaces (BMIs) that utilize electroencephalographic (EEG) signals [1] which can remain intact even in the complete absence of voluntary movement or EMG signals.

Our results demonstrate the ability of a profoundly paralyzed individual to relatively quickly (over the course of a 1.5 hour session involving 34 or fewer vowel production attempts) improve his performance on a vowel production task using a BMI based on formant frequency estimates decoded in real-time from the activity of neurons in his speech motor cortex. The participant’s average hit rate rose from 45% on the first block to 70% on the last block across sessions (reaching a high of 89% in the last block of the last session) and average endpoint error decreased from 435 Hz to 233 Hz from the first to the last block across sessions. These results generalize previous findings of successful motor learning for control of a computer cursor by paralyzed humans with implants in the hand/arm area of motor cortex [29], [30], [32]–[34], an ability previously demonstrated in monkey BMI studies [35]–[37], to the domain of speech production. They also support the feasibility of neural prostheses that could decode and synthesize intended words in real-time, i.e. as the (mute) speaker attempts to speak them.
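The pipeline described above, mapping neural firing rates to formant frequencies and then to an audible vowel, can be sketched in miniature. The snippet below is an illustrative simplification, not the authors' actual decoder: it assumes a plain linear readout (the study used more sophisticated real-time decoding) and a two-resonator formant synthesizer; the weight matrix, bias, and bandwidth values are invented for demonstration.

```python
import numpy as np

def decode_formants(firing_rates, weights, bias):
    """Linearly map a vector of neural firing rates to (F1, F2) formant
    frequencies in Hz. A stand-in for the real-time decoder in the study."""
    return weights @ firing_rates + bias

def formant_synth(f1, f2, duration=0.05, fs=8000):
    """Synthesize a vowel-like waveform by passing a 100 Hz glottal-style
    pulse train through two cascaded second-order resonators at F1 and F2."""
    n = int(duration * fs)
    source = np.zeros(n)
    source[:: fs // 100] = 1.0          # impulse train: the voicing source
    y = source
    for f, bw in ((f1, 80.0), (f2, 120.0)):   # (center freq, bandwidth) in Hz
        r = np.exp(-np.pi * bw / fs)          # pole radius from bandwidth
        theta = 2 * np.pi * f / fs            # pole angle from center frequency
        a1, a2 = -2 * r * np.cos(theta), r * r
        out = np.zeros(n)
        for i in range(n):                    # all-pole resonator, direct form
            y1 = out[i - 1] if i >= 1 else 0.0
            y2 = out[i - 2] if i >= 2 else 0.0
            out[i] = y[i] - a1 * y1 - a2 * y2
        y = out
    return y / (np.abs(y).max() + 1e-12)      # normalize to [-1, 1]

# Hypothetical decoder parameters for a 3-neuron recording.
W = np.array([[20.0, 5.0, 0.0],
              [0.0, 30.0, 40.0]])
b = np.array([300.0, 800.0])
f1, f2 = decode_formants(np.array([10.0, 20.0, 5.0]), W, b)
wave = formant_synth(f1, f2)
```

In the closed-loop setting, this decode-then-synthesize step would run continuously (the paper reports roughly 50 ms latency), so the participant hears the synthesized vowel shift as his decoded formants move toward the target.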


