Machine Translates Thoughts into Speech in Real Time |
| |
(PhysOrg.com) -- By implanting an
electrode into the brain of a person with locked-in
syndrome, scientists have demonstrated how to wirelessly
transmit neural signals to a speech synthesizer. The
"thought-to-speech" process takes about 50 milliseconds,
the same amount of time it takes a non-paralyzed, neurologically
intact person to speak their thoughts. The study marks the
first successful demonstration of a permanently installed,
wireless implant for real-time control of an external
device. |
| |
The study was led by Frank Guenther of the
Department of Cognitive and Neural Systems and the Sargent College
of Health and Rehabilitation Sciences at Boston University, who is
also affiliated with the Harvard-MIT Division of Health Sciences
and Technology. The
research team includes collaborators from Neural Signals,
Inc., in Duluth, Georgia; StatsANC LLC in Buenos Aires,
Argentina; the Georgia Tech Research Institute in Marietta,
Georgia; the Gwinnett Medical Center in Lawrenceville,
Georgia; and Emory University Hospital in Atlanta, Georgia.
The team published their results in a recent issue of
PLoS ONE. |
| |
“The results of our study show that a
brain-machine interface (BMI) user can control sound output
directly, rather than having to use a (relatively slow) typing
process,” Guenther told PhysOrg.com. |
| |
In their study, the researchers tested the
technology on a 26-year-old man who had suffered a brain stem stroke
at age 16. The stroke caused a lesion between the volunteer’s
motor neurons that carry out actions
and the rest of the brain; while his consciousness and
cognitive abilities are intact, he is paralyzed except for slow
vertical movement of the eyes. The rare condition is called
locked-in syndrome. |
| |
Five years ago, when the volunteer was 21 years
old, the scientists implanted an electrode near the boundary
between the speech-related premotor and primary motor cortex
(specifically, the left ventral premotor cortex). Neurites began
growing into the electrode and, in three or four months, the
neurites produced signaling patterns on the electrode wires that
have persisted ever since. |
| |
Three years after implantation, the researchers began testing
the brain-machine interface for real-time synthetic
speech production. The system is
“telemetric” - it requires no wires or connectors passing
through the skin, eliminating the risk of infection. Instead,
the electrode amplifies neural signals and converts them into
frequency-modulated (FM) radio signals. These signals are
wirelessly transmitted across the scalp to two coils, which are
attached to the volunteer’s head using a water-soluble paste.
The coils act as receiving antennas for the RF signals. The
implanted electrode is powered by an induction power supply via
a power coil, which is also attached to the
head. |
| |
The signals are then routed to an
electrophysiological recording system that digitizes and sorts
them. The sorted spikes, which contain the relevant data, are sent
to a neural decoder that runs on a desktop computer. The neural
decoder’s output becomes the input to a speech synthesizer, also
running on the computer. Finally, the speech synthesizer generates
synthetic speech (in the current study, only three vowel sounds
were tested). The entire process takes an average of 50
milliseconds. |
| |
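The decoding stage of the pipeline described above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the article does not describe the decoder's internals, so the linear mapping, the weights, the bias, the window length and the function names `firing_rates` and `decode_formants` are all hypothetical, and the synthesizer stage is left out.

```python
from typing import List, Sequence, Tuple


def firing_rates(spike_times: Sequence[Sequence[float]],
                 t_start: float, t_end: float) -> List[float]:
    """Count each unit's sorted spikes in [t_start, t_end) and
    convert the count to spikes per second."""
    window = t_end - t_start
    return [sum(t_start <= t < t_end for t in unit) / window
            for unit in spike_times]


def decode_formants(rates: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    bias: Tuple[float, float]) -> Tuple[float, float]:
    """Hypothetical linear decoder: map firing rates to two formant
    frequencies (F1, F2) in Hz. The real decoder may differ."""
    f1 = bias[0] + sum(w * r for w, r in zip(weights[0], rates))
    f2 = bias[1] + sum(w * r for w, r in zip(weights[1], rates))
    return f1, f2


if __name__ == "__main__":
    # Two made-up units with spike times in seconds (not study data).
    spikes = [[0.1, 0.2, 0.6], [0.05, 0.3, 0.4]]
    rates = firing_rates(spikes, 0.0, 0.5)            # [4.0, 6.0]
    weights = [[50.0, 25.0], [100.0, -50.0]]          # assumed values
    print(decode_formants(rates, weights, (300.0, 1500.0)))  # (650.0, 1600.0)
```

In a real-time setting, a loop would call these two functions on each new window of spikes and pass the resulting (F1, F2) pair to the synthesizer.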
As the scientists explained, there had been no previous
electrophysiological studies of neuronal firing in speech motor
areas. To develop an accurate neural coding scheme, they
had to rely on an established neurocomputational model of speech
motor control. According to this model, neurons in
the left ventral premotor cortex represent intended speech sounds
in terms of “formant frequency trajectories.” |
| |
In an intact brain, these frequency trajectories
are sent to the primary motor cortex where they are transformed
into motor commands to the speech articulators. However, in the
current study, the researchers had to interpret these frequency
trajectories in order to translate them into speech. To do this,
the scientists developed a two-dimensional formant frequency space,
in which different vowel sounds can be plotted based on two formant
frequencies (whose values are represented on the x and y
axes). |
| |
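The idea of a two-dimensional formant space can be made concrete with a short Python sketch that finds the vowel closest to a given (F1, F2) point. The target values are approximate textbook averages for adult male speakers, not figures from the study, and the article does not name the three vowels that were tested; the dictionary `VOWEL_TARGETS` and the function `nearest_vowel` are hypothetical names chosen for illustration.

```python
import math
from typing import Dict, Tuple

# Approximate textbook (F1, F2) targets in Hz for three vowels.
# These are illustrative assumptions, not values from the study.
VOWEL_TARGETS: Dict[str, Tuple[float, float]] = {
    "i": (270.0, 2290.0),   # as in "heed"
    "a": (730.0, 1090.0),   # as in "hod"
    "u": (300.0, 870.0),    # as in "who'd"
}


def nearest_vowel(f1: float, f2: float,
                  targets: Dict[str, Tuple[float, float]] = VOWEL_TARGETS) -> str:
    """Return the vowel whose target lies closest to (f1, f2),
    using plain Euclidean distance in Hz."""
    return min(
        targets,
        key=lambda v: math.hypot(f1 - targets[v][0], f2 - targets[v][1]),
    )


if __name__ == "__main__":
    print(nearest_vowel(300.0, 2200.0))  # i
    print(nearest_vowel(700.0, 1100.0))  # a
    print(nearest_vowel(320.0, 900.0))   # u
```

Euclidean distance in raw Hz is the simplest possible choice here; because F2 spans a wider range than F1, a different scaling of the axes could give different results near the boundaries.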
“The study supported our hypothesis (based on the
DIVA model, our neural network model of speech) that the premotor
cortex represents intended speech as an ‘auditory trajectory,’ that
is, as a set of key frequencies (formant frequencies) that vary
with time in the acoustic signal we hear as speech,” Guenther said.
“In other words, we could predict the intended sound directly from
neural activity in the premotor cortex, rather than try to predict
the positions of all the speech articulators individually and then
try to reconstruct the intended sound (a much more difficult
problem given the small number of neurons from which we recorded).
This result provides our first insight into how neurons in the
brain represent speech, something that has not been investigated
before since there is no animal model for
speech.” |
| |
To confirm that the neurons in the implanted area
were able to carry speech information in the form of formant
frequency trajectories, the researchers asked the volunteer to
attempt to speak in synchrony with a vowel sequence that was
presented auditorily. In later experiments, the volunteer received
real-time auditory feedback from the speech synthesizer. During 25
sessions over a five-month period, the volunteer significantly
improved the thought-to-speech accuracy. His average hit rate
increased from 45% to 70% across sessions, reaching a high of 89%
in the last session. |
| |
Although the current study focused only on
producing a small set of vowels, the researchers think that
consonant sounds could be achieved with improvements to the system.
While this study used a single three-wire electrode, the use of
additional electrodes at multiple recording sites, as well as
improved decoding techniques, could lead to rapid, accurate control
of a speech synthesizer that could generate a wide range of
sounds. |
| |
“Our immediate plans involve the implementation of
a new synthesizer that can produce consonants as well as vowels but
remains simple enough for a BMI user to control,” Guenther said.
“We are also working on hardware that will greatly increase the
number of neurons that are recorded. We expect to tap into at least
10 times as many neurons in the next implant recipient, which
should lead to a dramatic improvement in
performance.” |
| |
Overall, the work marks a milestone in the
development of a permanent neural prosthesis that requires no major
external hardware beyond a wireless receiver and laptop computer.
Previous brain-machine interfaces for
communication have been very slow,
producing only about one word per minute. The new system has
the potential to enable real-time conversation, and help
minimize the social isolation that accompanies profound
paralysis.
|
| |
More information: Guenther FH, Brumberg JS,
Wright EJ, Nieto-Castanon A, Tourville JA, et al. (2009) A Wireless
Brain-Machine Interface for Real-Time Speech Synthesis. PLoS ONE
4(12): e8218. doi:10.1371/journal.pone.0008218
|
| |
|
Copyright 2009 PhysOrg.com.
All rights reserved. This material may not be published,
broadcast, rewritten or redistributed in whole or part without the
express written permission of
PhysOrg.com.