AI algorithms can help scientists process brain waves and convert them directly into speech, according to new research.
“Our voices help connect us to our friends, family and the world around us, which is why losing the power of one’s voice due to injury or disease is so devastating,” said Nima Mesgarani, senior author of the paper published in Scientific Reports and a researcher at Columbia University. “With today’s study, we have a potential way to restore that power. We’ve shown that, with the right technology, these people’s thoughts could be decoded and understood by any listener.”
Neurons in our brain’s auditory cortex are excited whenever we listen to people speak – or even imagine people speaking. How exactly the brain makes sense of the jumble of sound waves, or constructs a facsimile of that process when we merely imagine speech, is still unknown. However, neuroscientists have shown that the brain patterns emitted during such a task can be pieced together to reconstruct the words being spoken. This finding has propelled the idea of building neuroprosthetics – devices that act as brain-computer interfaces.
The researchers set out to advance a technique known as auditory stimulus reconstruction using neural networks. First, an autoencoder was trained on 80 hours of speech recordings to convert audio signals into spectrograms, which detail how the energy in a sound is distributed across different frequencies over time.
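The paper itself relies on trained neural networks, but the spectrogram representation at its core is standard signal processing. As a rough illustration only (the frame length, hop size, and test tone below are made up, not taken from the study), a spectrogram can be computed by slicing audio into short windows and taking the Fourier transform of each:

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Short-time Fourier magnitude: each column describes how energy
    is spread across frequencies in one short window of audio."""
    window = np.hanning(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames).T  # shape: (frequency bins, time frames)

# One second of a 440 Hz tone sampled at 8 kHz: its energy should sit
# near frequency bin 440 / (8000 / 256) ≈ 14.
sr = 8000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # → (129, 61)
```

A speech-trained network then learns to map between such time-frequency pictures and raw audio, which is what lets the system turn decoded spectrograms back into sound.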
Next, the researchers placed electrodes directly onto the brains of five participants who were already undergoing brain surgery for epilepsy, in order to record their electrical activity. All five had normal hearing. They listened to short stories being read aloud for 30 minutes; the recital was paused at random points, and the participants were asked to repeat the last sentence. These recordings were used to train a vocoder, teaching it to map specific brain patterns to audible speech.
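The heart of that training step is learning a mapping from recorded brain activity to a speech representation. The study used deep networks and a vocoder; as a toy sketch only, with entirely synthetic data standing in for electrode recordings and spectrogram frames, the same idea can be shown with a ridge-regularised linear map:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 32 electrode channels recorded over 500 time
# frames, and the 64-bin spectrogram frames of the speech heard at the
# same moments. A hidden linear relation plus noise links the two.
n_frames, n_electrodes, n_freqs = 500, 32, 64
true_map = rng.normal(size=(n_electrodes, n_freqs))
neural = rng.normal(size=(n_frames, n_electrodes))
spec = neural @ true_map + 0.01 * rng.normal(size=(n_frames, n_freqs))

# Ridge regression: learn a map W so that neural @ W approximates the
# spectrogram (the real system learns a far richer, nonlinear map).
lam = 1e-3
W = np.linalg.solve(neural.T @ neural + lam * np.eye(n_electrodes),
                    neural.T @ spec)

pred = neural @ W
corr = np.corrcoef(pred.ravel(), spec.ravel())[0, 1]
print(round(corr, 3))
```

Once such a map is learned, any new stretch of brain activity can be pushed through it to produce spectrogram frames, which the vocoder then renders as audio.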
The participants then listened to 40 recitals of single digits, zero to nine. The brain signals recorded during this listening were run through the vocoder to produce audio, and those samples were fed back into the autoencoder for clean-up and analysis, allowing the system to repeat the digits it had reconstructed.
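Evaluating such a pipeline comes down to checking how often a reconstructed digit can be identified correctly. In the study, human listeners did the identifying; as a hypothetical sketch with made-up numbers (ten random template vectors standing in for the digits' spectrograms, four noisy trials each), a nearest-template check plays that role:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical templates: one 50-dimensional vector per digit 0-9,
# standing in for a flattened clean spectrogram of that digit.
templates = rng.normal(size=(10, 50))

# 40 trials: each digit reconstructed four times, with noise modelling
# imperfections in the decoded audio.
trials = np.repeat(templates, 4, axis=0)
labels = np.repeat(np.arange(10), 4)
noisy = trials + 0.3 * rng.normal(size=trials.shape)

# Identify each reconstruction as its nearest template.
dists = ((noisy[:, None, :] - templates[None, :, :]) ** 2).sum(axis=-1)
guesses = dists.argmin(axis=1)
accuracy = (guesses == labels).mean()
print(accuracy)
```

The fraction of trials identified correctly is the analogue of the intelligibility score the researchers report from their human listeners.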
You can listen to an example here. The output is a bit robotic and tinny, and only the digits zero to nine are repeated.
“We found that people could understand and repeat the sounds about 75 per cent of the time, which is well above and beyond any previous attempts. The sensitive vocoder and powerful neural networks represented the sounds the patients had originally listened to with surprising accuracy,” said Mesgarani.
Although intriguing, the experiment is still very simplistic. The system can only reconstruct signals from participants listening to speech, not from their own imagined speech. And it has so far handled only single digits, not full numbers or sentences. The researchers hope to test the system on more complicated words, and to see whether it still works when people speak aloud or merely imagine speaking.
“In this scenario, if the wearer thinks ‘I need a glass of water,’ our system could take the brain signals generated by that thought, and turn them into synthesized, verbal speech,” Mesgarani said. “This would be a game changer. It would give anyone who has lost their ability to speak, whether through injury or disease, the renewed chance to connect to the world around them.” ®