Computer program converts brain signals into synthetic voice


A new computer program translates brain signals into synthetic speech. The technology tracks electrical commands sent to muscles in and around the mouth to decode what the brain is trying to say. More testing is needed, but the developers say it could be used to design brain implants that help people with stroke or brain disease communicate.

“We want to create technologies that can reproduce speech directly from human brain activity,” said Edward Chang, a neurosurgeon at the University of California, San Francisco, who led the research, at a press conference. “This study provides proof of principle that this is possible.” He and his colleagues describe the results in Nature today (April 24).

The technique is highly invasive and relies on electrodes placed on the surface of the brain. As such, it has so far only been tested on five people with epilepsy who had the electrodes fitted as part of their treatment. These people could speak – and did speak – during the tests, which allowed the computer to determine the associated brain signals. The scientists now need to see whether it works in people who can’t speak.

That will likely be more difficult, says Nick Ramsey, a neuroscientist at Utrecht University Medical Center in the Netherlands, who is working on brain implants to help people with locked-in syndrome communicate despite near-total muscle paralysis. “It’s still an open question whether you’ll be able to get enough brain data from people who can’t speak to build your decoder,” he says, but he calls the study “sleek and sophisticated” and the results promising. “I’ve been following their work for a few years and they really understand what they’re doing.”

Speech is one of the most complex motor actions in the human body. It requires precise neural control and coordination of the muscles of the lips, tongue, jaw and larynx. To decode this activity, the scientists used the implanted electrodes to track signals sent by the brain when volunteers read aloud a series of sentences. A computer algorithm analyzed these instructions using a preexisting model of how the vocal tract moves to produce sound. A second processing step then converted these predicted movements into spoken sentences.

This two-step approach – translating brain activity into motor movements, then motor movements into words – produces less distortion than trying to convert brain signals directly to speech, Chang explains. When the team played 101 synthesized sentences to listeners and asked them to identify the spoken words from a list of 25 choices, the listeners transcribed 43% of the sentences accurately.
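The two-step pipeline described above can be sketched in a few lines of code. This is a toy illustration only: the published system used recurrent neural networks trained on real electrode recordings, while here two plain linear maps fitted by least squares stand in for the two stages, and all data, dimensions, and variable names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: neural features, articulator positions, acoustic features.
n_samples, n_neural, n_artic, n_audio = 200, 16, 6, 8

# Synthetic training data standing in for the recorded signals. For the toy
# example the true relationships are linear, so the decoder can recover them.
neural = rng.normal(size=(n_samples, n_neural))   # brain activity
W1 = rng.normal(size=(n_neural, n_artic))
artic = neural @ W1                               # vocal-tract movements
W2 = rng.normal(size=(n_artic, n_audio))
audio = artic @ W2                                # resulting sound features

# Stage 1: learn the mapping from brain activity to articulator movements.
A1, *_ = np.linalg.lstsq(neural, artic, rcond=None)
# Stage 2: learn the mapping from articulator movements to acoustic features.
A2, *_ = np.linalg.lstsq(artic, audio, rcond=None)

def decode(neural_signals):
    """Two-step decode: brain activity -> vocal-tract movement -> sound."""
    movements = neural_signals @ A1
    return movements @ A2

pred = decode(neural)
print(np.allclose(pred, audio, atol=1e-6))  # True on this linear toy data
```

The point of the intermediate articulatory stage is that it constrains the decoder to physically plausible mouth movements, which is why the two-step route distorts less than a direct neural-to-audio mapping.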

Qinwan Rabbani, a graduate student who works on similar systems at Johns Hopkins University, listened to the synthesized sentences and says they sound good, especially since the computer had only about a dozen minutes of speech to learn from. Algorithms that decode speech typically require “days or weeks” of audio recordings, he says.

The brain signals that control speech are more complicated to decode than those used, for example, to move the arms and legs, and they are more easily influenced by emotional state and fatigue. This means that a synthetic speech system ultimately applied to paralyzed patients would likely be restricted to a limited vocabulary, Rabbani says.

G.K. Anumanchipalli et al., “Speech synthesis from neural decoding of spoken sentences,” Nature, doi:10.1038/s41586-019-1119-1, 2019.


Gordon K. Morehouse