A machine learning approach to identify functional human phosphosites

December 11, 2019

9 December 2019, Cambridge - Researchers at EMBL's European Bioinformatics Institute (EMBL-EBI) have created the largest reference phosphoproteome to date, comprising almost 120 000 human phosphosites. To identify the sites most likely to be critical, they used a machine learning approach that ranks them by functional importance.

Proteins are the core molecular machines of the cell, and they can be regulated by protein modifications that act like molecular switches. Protein phosphorylation is one such switch: it can alter the structural conformation of a protein, activating it, deactivating it or modifying its function. Despite decades of work, the total number of these modifications, and which of them are truly critical for life, remain a mystery.

This research, published in Nature Biotechnology, creates a freely accessible resource that researchers can use to better understand which proteins are phosphorylated and which phosphosites have functional relevance. Access to these data could accelerate research into many different biological processes and diseases.

Machine learning and data sharing

"This new resource would not have been possible if scientists around the world didn't share their research data and results," says Pedro Beltrao, Group Leader at the EMBL-EBI. "It would take a single machine over 500 consecutive days to run all the mass spectrometry experiments used to create this database. By applying machine learning to this huge dataset, we created a scoring system that will hopefully help researchers to determine which lesser-known phosphosites to explore next."

The researchers at EMBL-EBI curated over 100 publicly available phospho-enriched human datasets containing over 6000 mass-spectrometry experiments from EMBL-EBI's PRoteomics IDEntifications (PRIDE) database. This large-scale project has generated the biggest open access reference phosphoproteome database to date.

Functional human phosphosites

To identify the phosphosites most critical to human cells, the researchers used machine learning to integrate diverse annotations for each site, such as the degree of conservation. The resulting phosphosite functional score has enormous potential to help other scientists uncover more about their proteins of interest. It can be used to rank known phosphosites and distinguish those that are functionally relevant for molecular processes and disease.

For example, the researchers were able to demonstrate the practicality of their functional score model by identifying two high-scoring phosphosites which play a role in regulating neuronal differentiation.
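The ranking idea described in this section can be illustrated with a minimal sketch. It is not the model from the study: this article does not name the algorithm or list the full feature set, so the example substitutes a plain logistic regression trained by gradient descent. The site identifiers, the two features (a conservation value and a scaled count of supporting experiments) and all numbers are invented for illustration.

```python
import math

# Hypothetical toy data, invented for illustration only:
# (site_id, [conservation, scaled_supporting_experiments], known_functional)
TRAINING_SITES = [
    ("P1_S10", [0.9, 0.8], 1),
    ("P1_T55", [0.8, 0.9], 1),
    ("P2_Y20", [0.7, 0.6], 1),
    ("P2_S77", [0.2, 0.1], 0),
    ("P3_S5", [0.1, 0.3], 0),
    ("P3_T90", [0.3, 0.2], 0),
]

# Hypothetical sites with no functional annotation yet.
UNCHARACTERISED = [
    ("P4_S12", [0.85, 0.7]),
    ("P4_T3", [0.15, 0.2]),
    ("P5_Y8", [0.5, 0.5]),
]


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def functional_score(features, weights, bias):
    """Combine a site's annotations into one score between 0 and 1."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_logistic(rows, epochs=2000, learning_rate=0.5):
    """Fit logistic regression weights with batch gradient descent.

    This is a stand-in model chosen for brevity, not the study's method.
    """
    n_features = len(rows[0][1])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for _, features, label in rows:
            error = functional_score(features, weights, bias) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def rank_sites(sites, weights, bias):
    """Return (site_id, score) pairs sorted from highest to lowest score."""
    scored = [(site_id, functional_score(features, weights, bias))
              for site_id, features in sites]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


weights, bias = train_logistic(TRAINING_SITES)
ranking = rank_sites(UNCHARACTERISED, weights, bias)
for site_id, score in ranking:
    print(f"{site_id}\t{score:.3f}")
```

The point of the sketch is the workflow rather than the model: annotations for sites with known function are used to learn how to weight each feature, and the learned weights then give every other site a single score that can be sorted to decide which sites to investigate first.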

"The functional score model created from this study can be used to uncover an abundance of new, functional phosphosites that may play crucial roles in disease," says David Ochoa, Project Coordinator at Open Targets. "We already know of several groups who are using the scoring model, so we would like to encourage researchers everywhere to explore the resource and make use of it."

Source articles

Ochoa, D., et al. (2019). The functional landscape of the human phosphoproteome. Nature Biotechnology. Published online 9 December 2019. DOI: 10.1038/s41587-019-0344-3

European Molecular Biology Laboratory - European Bioinformatics Institute
