Computational model decodes speech by predicting it

June 26, 2020

The brain analyses spoken language by recognising syllables. Scientists from the University of Geneva (UNIGE) and the Evolving Language National Centre of Competence in Research (NCCR) have designed a computational model that reproduces the complex mechanism employed by the central nervous system to perform this operation. The model, which brings together two independent theoretical frameworks, uses the equivalent of the neuronal oscillations produced by brain activity to process the continuous sound flow of connected speech. It functions according to a theory known as predictive coding, whereby the brain optimises perception by constantly trying to predict the incoming sensory signals on the basis of candidate hypotheses (syllables, in this model). The resulting model, described in the journal Nature Communications, successfully recognised, on-line, thousands of syllables contained in hundreds of sentences of natural spoken language. This validates the idea that neuronal oscillations can be used to coordinate the flow of syllables we hear with the predictions made by our brain.

"Brain activity produces neuronal oscillations that can be measured using electroencephalography," begins Anne-Lise Giraud, professor in the Department of Basic Neurosciences in UNIGE's Faculty of Medicine and co-director of the Evolving Language NCCR. These are electromagnetic waves that result from the coherent electrical activity of entire networks of neurons. There are several types, defined according to their frequency. They are called alpha, beta, theta, delta or gamma waves. Taken individually or superimposed, these rhythms are linked to different cognitive functions, such as perception, memory, attention, alertness, etc.

However, neuroscientists do not yet know whether, and how, these oscillations actively contribute to such functions. In an earlier study published in 2015, Professor Giraud's team showed that theta waves (low frequency) and gamma waves (high frequency) coordinate to segment the sound flow into syllables and to analyse their content so they can be recognised.

Based on these physiological rhythms, the Geneva-based scientists developed a spiking neural network computer model whose performance in sequencing syllables on-line (in real time) was better than that of traditional automatic speech recognition systems.

The rhythm of the syllables

In their first model, theta waves (between 4 and 8 Hz) made it possible to follow the rhythm of the syllables as they were perceived by the system. Gamma waves (around 30 Hz) were used to segment the auditory signal into smaller slices and encode them. This produced a "phonemic" profile linked to each sound sequence, which could be compared, a posteriori, with a library of known syllables. One of the advantages of this type of model is that it spontaneously adapts to the speed of speech, which can vary from one individual to another.
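As an illustration of this division of labour, the toy Python sketch below tracks syllable-sized chunks with a slow, theta-like smoothed envelope and summarises each chunk as a short profile of slice energies that is matched against a library of known syllables. It is a schematic reading of the description above, not the published spiking neural network: the envelope-minima boundary detection, the fixed number of slices per chunk and the nearest-neighbour matching are all simplifying assumptions.

```python
import numpy as np

# Toy sketch of the earlier model's division of labour (not the published
# spiking network): a theta-band (~4-8 Hz) smoothed envelope tracks
# syllable-sized chunks, and each chunk is encoded as a coarse "phonemic"
# profile that is compared with a library of known syllables.

FS = 16000  # assumed audio sampling rate (Hz)

def theta_boundaries(envelope, fs=FS):
    """Syllable-like boundaries taken as minima of a slowly smoothed envelope."""
    width = fs // 6  # ~170 ms smoothing window, inside the 4-8 Hz theta range
    slow = np.convolve(envelope, np.ones(width) / width, mode="same")
    minima = [i for i in range(1, len(slow) - 1)
              if slow[i - 1] > slow[i] <= slow[i + 1]]
    return [0] + minima + [len(envelope)]

def gamma_profile(chunk, n_slices=6):
    """Encode one chunk as a fixed number of coarse slice energies."""
    parts = np.array_split(np.asarray(chunk, dtype=float), n_slices)
    return np.array([p.mean() if len(p) else 0.0 for p in parts])

def recognise(envelope, library):
    """Match each theta-delimited chunk against stored syllable profiles."""
    bounds = theta_boundaries(envelope)
    result = []
    for a, b in zip(bounds[:-1], bounds[1:]):
        profile = gamma_profile(envelope[a:b])
        best = min(library, key=lambda name: np.sum((profile - library[name]) ** 2))
        result.append(best)
    return result
```

Here `envelope` stands for the amplitude envelope of the recorded speech and `library` for a dictionary mapping syllable names to stored profiles; both are placeholders introduced for the example.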

Predictive coding

In this new article, to stay closer to biological reality, Professor Giraud and her team developed a new model in which they incorporated elements from another theoretical framework, independent of neuronal oscillations: "predictive coding". "This theory holds that the brain functions so optimally because it is constantly trying to anticipate and explain what is happening in the environment, using learned models of how outside events generate sensory signals. In the case of spoken language, it attempts to find the most likely causes of the sounds perceived by the ear as speech unfolds, on the basis of a set of mental representations that have been learned and that are permanently being updated," says Dr. Itsaso Olasagasti, a computational neuroscientist in Giraud's team who supervised the implementation of the new model.
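In its simplest textbook form, predictive coding can be written as a short loop: keep a set of beliefs about the hidden causes (here, candidate syllables), predict the sensory signal from them, and adjust the beliefs in proportion to the prediction error. The sketch below shows this generic loop under a linear generative model; the matrix G, the learning rate and the normalisation step are illustrative assumptions, not the generative model actually used in the paper.

```python
import numpy as np

# Generic predictive-coding toy: beliefs over candidate causes are updated
# to reduce the error between the predicted and the observed sensory input.
# The linear generative matrix G and the learning rate are illustrative
# assumptions, not the paper's model.

rng = np.random.default_rng(0)
n_features, n_candidates = 40, 5
G = rng.normal(size=(n_features, n_candidates))      # learned "how causes generate signals"
beliefs = np.full(n_candidates, 1.0 / n_candidates)  # prior over candidate syllables

observation = G[:, 2] + 0.1 * rng.normal(size=n_features)  # noisy token of candidate 2

for _ in range(100):
    prediction = G @ beliefs            # top-down: what the current hypothesis expects
    error = observation - prediction    # bottom-up: prediction error
    beliefs = np.clip(beliefs + 0.01 * (G.T @ error), 1e-6, None)
    beliefs /= beliefs.sum()            # keep a normalised candidate distribution

print("most likely cause:", int(np.argmax(beliefs)))  # should typically be 2
```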

"We developed a computer model that simulates this predictive coding," explains Sevada Hovsepyan, a researcher in the Department of Basic Neurosciences and the article's first author. "And we implemented it by incorporating oscillatory mechanisms."

Tested on 2,888 syllables

The sound entering the system is first modulated by a theta (slow) wave resembling the activity produced by populations of neurons. This makes it possible to signal the contours of the syllables. Trains of (fast) gamma waves then help encode the syllable as and when it is perceived. During the process, the system proposes possible syllables and corrects the choice if necessary. After going back and forth between the two levels several times, it discovers the right syllable. The system is then reset to zero at the end of each perceived syllable.
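Read as an algorithm, this loop can be sketched as follows. The code is a toy stand-in for the published model: theta-delimited segments are taken as given, each one is encoded as a short evidence vector, and a fresh set of candidate beliefs is corrected over several passes before being reset for the next syllable. The library of syllable profiles, the number of passes and the update rule are assumptions made for illustration.

```python
import numpy as np

# Toy stand-in for the combined loop described above: for each
# theta-delimited syllable, gamma-rate evidence is explained by proposing
# and correcting candidate syllables over several passes, and the beliefs
# are reset (a fresh uniform prior) at each syllable boundary.
# Hypothetical illustration, not the authors' implementation.

rng = np.random.default_rng(1)
n_slices, n_syllables = 20, 4
library = rng.normal(size=(n_syllables, n_slices))      # assumed known syllable profiles

def recognise_syllable(evidence, n_passes=50, lr=0.02):
    beliefs = np.full(n_syllables, 1.0 / n_syllables)   # reset: fresh hypotheses per syllable
    for _ in range(n_passes):                           # back and forth between the two levels
        prediction = beliefs @ library                  # top-down expectation of the evidence
        error = evidence - prediction                   # prediction error on the encoded slices
        beliefs = np.clip(beliefs + lr * (library @ error), 1e-6, None)
        beliefs /= beliefs.sum()                        # corrected candidate distribution
    return int(np.argmax(beliefs))

# simulate an utterance of three syllables (true indices 2, 0, 3) with noise
utterance = [library[i] + 0.1 * rng.normal(size=n_slices) for i in (2, 0, 3)]
print([recognise_syllable(ev) for ev in utterance])     # should typically print [2, 0, 3]
```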

The model was successfully tested on 2,888 different syllables contained in 220 sentences of naturally spoken English. "On the one hand, we succeeded in bringing together two very different theoretical frameworks in a single computer model," explains Professor Giraud. "On the other, we have shown that neuronal oscillations most likely rhythmically align the endogenous functioning of the brain with signals that come from outside via the sensory organs. If we put this back in predictive coding theory, it means that these oscillations probably allow the brain to make the right hypothesis at exactly the right moment."
-end-


Université de Genève
