A Sudoku-solving algorithm holds promise for protein medicine

September 23, 2020

Computational biologists at the University of Toronto have developed an artificial intelligence algorithm that has the potential to create novel protein molecules as finely tuned therapeutics.

The team, led by Philip M. Kim, a professor of molecular genetics and computer science at the Donnelly Centre for Cellular and Biomolecular Research at U of T's Faculty of Medicine, has developed ProteinSolver, a graph neural network that can design an entirely new protein to fit a given geometric shape. The researchers took inspiration from the Japanese number puzzle Sudoku, whose constraints are conceptually similar to those of a protein molecule.

Their findings are published in the journal Cell Systems.

"The parallel with Sudoku becomes apparent when you depict a protein molecule as a network," says Kim, adding that the portrayal of proteins in graph form is standard practice in computational biology.

A newly synthesized protein is a string of amino-acids, stitched together according to the instructions in that protein's gene code. The amino-acid polymer then folds in and around itself into a three-dimensional molecular machine that can be harnessed for medicine.

A protein converted into a graph looks like a network of nodes, representing amino-acids, connected by edges, which represent the distances between them within the molecule. By applying principles from graph theory, it then becomes possible to model the molecule's geometry for a specific purpose, for example to neutralize an invading virus or shut down an overactive receptor in cancer.
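To make the graph picture concrete, here is a minimal Python sketch of how a protein might be turned into nodes and distance-labelled edges. It is an illustration only, not ProteinSolver's actual code: the function name `build_residue_graph`, the 8-angstrom cutoff, the one-point-per-residue coordinates and the toy four-residue chain are all assumptions made for this example.

```python
from itertools import combinations
from math import dist


def build_residue_graph(sequence, coords, cutoff=8.0):
    """Return (nodes, edges) for a protein given one 3D point per residue.

    nodes: list of amino-acid letters, one per residue.
    edges: dict mapping (i, j) index pairs to the distance between
           residues i and j, kept only when that distance <= cutoff.
    The cutoff value is an assumption chosen for illustration.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")

    nodes = list(sequence)
    edges = {}
    for i, j in combinations(range(len(coords)), 2):
        d = dist(coords[i], coords[j])
        if d <= cutoff:
            edges[(i, j)] = d
    return nodes, edges


# Toy example: four residues placed 3.8 units apart along a straight line.
sequence = "MKVL"
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
nodes, edges = build_residue_graph(sequence, coords)

for (i, j), d in sorted(edges.items()):
    print(f"{nodes[i]}{i} - {nodes[j]}{j}: {d:.1f}")
```

In this toy chain, every pair of residues is linked except the two ends, which sit farther apart than the assumed cutoff.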

Proteins make good drugs thanks to three-dimensional features on their surface, which let them bind cellular targets more precisely than synthetic small-molecule drugs. Small molecules tend to be broad-spectrum and can cause harmful off-target side effects.

Just over a third of all medications approved over the last couple of years were proteins, which also make up the vast majority of top ten drugs globally, Kim said. Insulin, antibodies and growth factors are only some examples of injectable cellular proteins, also known as biologics, already in use.

Designing proteins from scratch remains incredibly difficult, however, owing to the vast number of possible structures to choose from.

"The main problem in protein design is that you have a very large search space," says Kim, referring to the many ways in which the 20 naturally occurring amino-acids can be combined into protein structures.

"For a standard-length protein of 100 amino-acids, there are 20^100 possible molecular structures, that's more than the number of molecules in the universe," he says.
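The size of that number is easy to check. The short Python sketch below counts the ways of choosing one of 20 amino-acids at each of 100 positions, that is, the number of possible sequences of that length; the variable names are chosen for this example only.

```python
# Number of distinct amino-acid sequences of a given length, assuming
# each position can independently hold any of the 20 natural amino-acids.
AMINO_ACIDS = 20
LENGTH = 100

n_sequences = AMINO_ACIDS ** LENGTH   # exact integer arithmetic
n_digits = len(str(n_sequences))      # how many decimal digits it has

print(f"{n_sequences:.3e}")           # 1.268e+130
print(n_digits)                       # 131
```

The result is roughly 1.3 x 10^130, a 131-digit number. For comparison, an often-quoted rough estimate for the number of atoms in the observable universe is around 10^80.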

Kim decided to turn the problem on its head, by starting with a three-dimensional structure and working out its amino-acid composition.

"It's the protein design, or the inverse protein folding problem - you have a shape in mind and you want a sequence (of amino-acids) that will fold into that shape. Solving this is in some ways more useful than protein folding, as you can in theory generate new proteins for any purpose," says Kim.

That's when Alexey Strokach, a PhD student in Kim's lab, turned to Sudoku, after learning in a class about its relatedness to molecular geometry.

In Sudoku, the goal is to find missing values in a sparsely filled grid by observing a set of rules and the existing number values.

Individual amino-acids in a protein molecule are similarly constrained by their neighbours. Local electrostatic forces ensure that amino-acids carrying opposite electric charges pack closely together while those with the same charge are pushed apart.
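The Sudoku side of the analogy can be sketched in a few lines of Python. The function below lists the values still allowed in an empty cell once the numbers already present in its row, column and 3x3 box are ruled out. This shows the puzzle's constraint rule only; it is not how ProteinSolver itself works, and the function name `candidates` and the sample grid are assumptions made for this example (0 marks an empty cell).

```python
def candidates(grid, row, col):
    """Return the set of values allowed at (row, col) in a 9x9 Sudoku grid."""
    if grid[row][col] != 0:
        return {grid[row][col]}          # cell is already filled

    used = set(grid[row])                             # same row
    used |= {grid[r][col] for r in range(9)}          # same column
    top, left = 3 * (row // 3), 3 * (col // 3)        # corner of the 3x3 box
    used |= {grid[r][c]
             for r in range(top, top + 3)
             for c in range(left, left + 3)}          # same box
    return set(range(1, 10)) - used


grid = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]

print(sorted(candidates(grid, 0, 2)))   # [1, 2, 4]
print(sorted(candidates(grid, 4, 4)))   # [5]
```

The centre cell is pinned down to a single value by its neighbours alone, much as the surroundings of a position in a protein narrow the choice of amino-acid that can sit there.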

Strokach first built the constraints found in Sudoku into a neural network algorithm. He then trained the algorithm on a vast database of available protein structures and their amino-acid sequences from across the tree of life. The goal was to teach the algorithm, ProteinSolver, the rules, honed by evolution over millions of years, of packing amino-acids together into smaller folds. Applying these rules to the engineering process should increase the chances of having a functional protein at the end.

The researchers then tested ProteinSolver by giving it existing protein folds and asking it to generate amino-acid sequences that can build them. They then took the novel computed sequences, which do not exist in nature, and manufactured the corresponding protein variants in the lab. The variants folded into the expected structures, showing that the approach works.

In its current form, ProteinSolver is able to compute novel amino-acid sequences for any protein fold known to be geometrically stable. But the ultimate goal is to engineer novel protein structures with entirely new biological functions, as new therapeutics, for example.

"The ultimate goal is for someone to be able to draw a completely new protein by hand and compute sequences for that, and that's what we are working on now," says Strokach.

The researchers made ProteinSolver and the code behind it open source and available to the wider research community through a user-friendly website.

University of Toronto
