Model analyzes how viruses escape the immune system

January 14, 2021

CAMBRIDGE, MA -- One reason it's so difficult to produce effective vaccines against some viruses, including influenza and HIV, is that these viruses mutate very rapidly. This allows them to evade the antibodies generated by a particular vaccine, through a process known as "viral escape."

MIT researchers have now devised a new way to computationally model viral escape, based on models that were originally developed to analyze language. The model can predict which sections of viral surface proteins are more likely to mutate in a way that enables viral escape, and it can also identify sections that are less likely to mutate, making them good targets for new vaccines.

"Viral escape is a big problem," says Bonnie Berger, the Simons Professor of Mathematics and head of the Computation and Biology group in MIT's Computer Science and Artificial Intelligence Laboratory. "Viral escape of the surface protein of influenza and the envelope surface protein of HIV are both highly responsible for the fact that we don't have a universal flu vaccine, nor do we have a vaccine for HIV, both of which cause hundreds of thousands of deaths a year."

In a study appearing today in Science, Berger and her colleagues identified possible targets for vaccines against influenza, HIV, and SARS-CoV-2. Since that paper was accepted for publication, the researchers have also applied their model to the new variants of SARS-CoV-2 that recently emerged in the United Kingdom and South Africa. That analysis, which has not yet been peer-reviewed, flagged viral genetic sequences that should be further investigated for their potential to escape the existing vaccines, the researchers say.

Berger and Bryan Bryson, an assistant professor of biological engineering at MIT and a member of the Ragon Institute of MGH, MIT, and Harvard, are the senior authors of the paper, and the lead author is MIT graduate student Brian Hie.

The language of proteins

Different types of viruses acquire genetic mutations at different rates, and HIV and influenza are among those that mutate the fastest. For these mutations to promote viral escape, they must help the virus change the shape of its surface proteins so that antibodies can no longer bind to them. However, the protein can't change in a way that makes it nonfunctional.

The MIT team decided to model these criteria using a type of computational model known as a language model, from the field of natural language processing (NLP). These models were originally designed to analyze patterns in language, specifically the frequency with which certain words occur together. The models can then predict which words could be used to complete a sentence such as "Sally ate eggs for ..." The chosen word must be both grammatically correct and have the right meaning. In this example, an NLP model might predict "breakfast" or "lunch."
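To make the word-prediction idea concrete, here is a minimal sketch of a count-based "bigram" predictor. It is a deliberately simplified stand-in, not the neural language model used in the study: it looks only at the single preceding word, and the tiny corpus and function names are invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, for illustration only.
corpus = [
    "sally ate eggs for breakfast",
    "tom ate eggs for breakfast",
    "sally ate soup for lunch",
    "tom ate toast for breakfast",
]

def train_bigrams(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev_word):
    """Return candidate next words with relative frequencies, most likely first."""
    followers = counts[prev_word]
    total = sum(followers.values())
    return [(word, n / total) for word, n in followers.most_common()]

bigrams = train_bigrams(corpus)
predictions = predict_next(bigrams, "for")
print(predictions)  # "breakfast" ranks ahead of "lunch" in this toy corpus
```

A real language model conditions on the whole sentence rather than one word, which is what lets it judge both grammar and meaning.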

The researchers' key insight was that this kind of model could also be applied to biological information such as genetic sequences. In that case, grammar is analogous to the rules that determine whether the protein encoded by a particular sequence is functional or not, and semantic meaning is analogous to whether the protein can take on a new shape that helps it evade antibodies. Therefore, a mutation that enables viral escape must maintain the grammaticality of the sequence but change the protein's structure in a useful way.
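The two requirements described above can be sketched as a ranking problem: a candidate mutation is of interest when it scores well on both "grammaticality" and "semantic change." The snippet below is one simple way to combine the two by summing ranks; it is an assumption made for illustration and not necessarily the scoring used in the paper. The mutation names, the numeric scores, and the `beta` weight are all hypothetical.

```python
# Hypothetical per-mutation scores; every number here is invented for illustration.
# grammaticality:  how plausible the model finds the mutated sequence
#                  (higher = more likely to remain a functional protein).
# semantic_change: how far the mutated sequence moves from the original
#                  in the model's internal representation.
candidates = {
    "A10T": {"grammaticality": 0.30, "semantic_change": 0.2},
    "K25E": {"grammaticality": 0.28, "semantic_change": 1.5},
    "G40W": {"grammaticality": 0.001, "semantic_change": 2.0},
    "S55N": {"grammaticality": 0.25, "semantic_change": 0.9},
}

def rank_of(values_by_name):
    """Map each name to its rank, where 1 is the smallest value."""
    ordered = sorted(values_by_name, key=values_by_name.get)
    return {name: i + 1 for i, name in enumerate(ordered)}

def escape_priority(cands, beta=1.0):
    """Order mutations by semantic-change rank plus beta times grammaticality rank."""
    gram_rank = rank_of({k: v["grammaticality"] for k, v in cands.items()})
    sem_rank = rank_of({k: v["semantic_change"] for k, v in cands.items()})
    combined = {k: sem_rank[k] + beta * gram_rank[k] for k in cands}
    return sorted(combined, key=combined.get, reverse=True)

ranked = escape_priority(candidates)
print(ranked[0])  # K25E: large semantic change while staying plausible
```

In this toy data, "G40W" changes the most but is judged very implausible, so it does not come out on top; "K25E" scores reasonably on both criteria and is ranked first.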

"If a virus wants to escape the human immune system, it doesn't want to mutate itself so that it dies or can't replicate," Hie says. "It wants to preserve fitness but disguise itself enough so that it's undetectable by the human immune system."

To model this process, the researchers trained an NLP model to analyze patterns found in genetic sequences, which allows it to predict new sequences that have new functions but still follow the biological rules of protein structure. One significant advantage of this kind of modeling is that it requires only sequence information, which is much easier to obtain than protein structures. The model can be trained on a relatively small amount of information -- in this study, the researchers used 60,000 HIV sequences, 45,000 influenza sequences, and 4,000 coronavirus sequences.
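As a rough illustration of what "patterns found in sequences" can mean, the sketch below counts which amino acids appear at a position and which pairs appear together at two positions. This explicit counting is a simplified stand-in for what a trained language model learns implicitly; the sequences are made-up toy strings, not real viral data, and the function names are hypothetical.

```python
from collections import Counter

# Made-up, pre-aligned toy sequences of equal length (not real viral data).
sequences = ["MKTAY", "MKSAY", "MRTAF", "MKTAY", "MRSAF"]

def position_counts(seqs, pos):
    """Count the amino acids observed at one position."""
    return Counter(seq[pos] for seq in seqs)

def pair_counts(seqs, pos_a, pos_b):
    """Count the amino acid pairs observed together at two positions."""
    return Counter((seq[pos_a], seq[pos_b]) for seq in seqs)

site1 = position_counts(sequences, 1)
pairs = pair_counts(sequences, 1, 4)
print(site1)
print(pairs)
```

In these toy sequences, position 1 and position 4 vary together: "K" always appears with "Y" and "R" always appears with "F." A model trained on sequence data alone can pick up this kind of dependency without any structural information.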

"Language models are very powerful because they can learn this complex distributional structure and gain some insight into function just from sequence variation," Hie says. "We have this big corpus of viral sequence data for each amino acid position, and the model learns these properties of amino acid co-occurrence and co-variation across the training data."

Blocking escape

Once the model was trained, the researchers used it to predict sequences of the coronavirus spike protein, HIV envelope protein, and influenza hemagglutinin (HA) protein that would be more or less likely to generate escape mutations.

For influenza, the model revealed that the sequences least likely to mutate and produce viral escape were in the stalk of the HA protein. This is consistent with recent studies showing that antibodies that target the HA stalk (which most people infected with the flu or vaccinated against it do not develop) can offer near-universal protection against any flu strain.

The model's analysis of coronaviruses suggested that a part of the spike protein called the S2 subunit is least likely to generate escape mutations. It is still unclear how rapidly the SARS-CoV-2 virus mutates, so it is unknown how long the vaccines now being deployed to combat the Covid-19 pandemic will remain effective. Initial evidence suggests that the virus does not mutate as rapidly as influenza or HIV. However, the researchers recently identified new mutations that have appeared in Singapore, South Africa, and Malaysia that they believe should be investigated for potential viral escape (these new data are not yet peer-reviewed).

In their studies of HIV, the researchers found that the V1-V2 hypervariable region of the protein has many possible escape mutations, which is consistent with previous findings, and they also found sequences that would have a lower probability of escape.

The researchers are now working with others to use their model to identify possible targets for cancer vaccines that stimulate the body's own immune system to destroy tumors. They say it could also be used to design small-molecule drugs that might be less likely to provoke resistance, for diseases such as tuberculosis.

"There are so many opportunities, and the beautiful thing is all we need is sequence data, which is easy to produce," Bryson says.

The research was funded by a National Defense Science and Engineering Graduate Fellowship from the Department of Defense and a National Science Foundation Graduate Research Fellowship.

Massachusetts Institute of Technology
