Protein misprediction uncovered by new technique

August 26, 2008

A new bioinformatics tool is capable of identifying and correcting abnormal, incomplete and mispredicted protein annotations in public databases. The MisPred tool, described today in the open access journal BMC Bioinformatics, currently uses five principles to identify suspect proteins that are likely to be abnormal or mispredicted.

László Patthy led the team from the Institute of Enzymology of the Hungarian Academy of Sciences, Budapest, that developed the new approach. He explained why it is needed: "Recent studies have shown that a significant proportion of eukaryotic genes are mispredicted at the transcript level. As the MisPred routines are able to detect many of these errors, and may aid in their correction, we suggest that it may significantly improve the quality of protein sequence data based on gene predictions". The MisPred approach promises to save much of the time and effort that would otherwise be spent investigating erroneously identified genes.

The MisPred approach rates annotations according to five dogmas:
  1. Extracellular or transmembrane proteins must have appropriate secretory signals.
  2. A protein with intra- and extra-cellular parts must have a transmembrane segment.
  3. Extracellular and nuclear domains must not occur in a single protein.
  4. The number of amino acid residues in closely related members of a globular domain family must fall into a relatively narrow range.
  5. A protein must be encoded by exons located on a single chromosome.
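
The five rules above can be read as simple yes/no checks on an annotated protein. The Python sketch below is only an illustration of that idea and is not the MisPred implementation: the Annotation record, its field names and the violated_rules function are hypothetical, and it assumes that signal, domain and exon information has already been produced by other tools.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Annotation:
    """Hypothetical summary of one predicted protein (not MisPred's data model)."""
    has_secretory_signal: bool = False
    has_transmembrane_segment: bool = False
    has_extracellular_domain: bool = False
    has_intracellular_domain: bool = False
    has_nuclear_domain: bool = False
    # One (observed, shortest expected, longest expected) residue count
    # per globular domain; the expected range is an assumed input.
    domain_lengths: List[Tuple[int, int, int]] = field(default_factory=list)
    # Chromosomes on which the encoding exons were found.
    exon_chromosomes: Set[str] = field(default_factory=set)


def violated_rules(a: Annotation) -> List[int]:
    """Return the numbers (1 to 5) of the rules the annotation appears to break."""
    violations = []
    # Rule 1: extracellular or transmembrane proteins need a secretory signal.
    if (a.has_extracellular_domain or a.has_transmembrane_segment) \
            and not a.has_secretory_signal:
        violations.append(1)
    # Rule 2: intra- and extracellular parts require a transmembrane segment.
    if a.has_intracellular_domain and a.has_extracellular_domain \
            and not a.has_transmembrane_segment:
        violations.append(2)
    # Rule 3: extracellular and nuclear domains must not co-occur.
    if a.has_extracellular_domain and a.has_nuclear_domain:
        violations.append(3)
    # Rule 4: each domain's length must fall inside its family's range.
    if any(not (low <= observed <= high)
           for observed, low, high in a.domain_lengths):
        violations.append(4)
    # Rule 5: all exons must lie on a single chromosome.
    if len(a.exon_chromosomes) > 1:
        violations.append(5)
    return violations


suspect = Annotation(
    has_extracellular_domain=True,
    has_nuclear_domain=True,
    exon_chromosomes={"chr1", "chr7"},
)
print(violated_rules(suspect))
```

An empty result means only that none of the five checks fired. A non-empty result marks the entry as suspect and does not prove it is wrong, since the rules have known exceptions.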

There are some exceptions to these rules, as Patthy points out: "Some secreted proteins may truly lack secretory signal peptides since they are subject to leaderless protein secretion. Similarly, it cannot be excluded at present that transchromosomal chimeras can be formed and may have normal physiological functions. Nevertheless, the fact that MisPred analyses of protein sequences of the Swiss-Prot database identified very few such exceptions indicates that the rules of MisPred are generally valid".

The authors found that the absence of expected signal peptides and violation of domain integrity account for the majority of mispredictions. The authors note that "Interestingly, even the manually curated UniProtKB/Swiss-Prot dataset is contaminated with mispredicted or abnormal proteins, although to a much lesser extent than UniProtKB/TrEMBL or the EnsEMBL or GNOMON predicted entries".
-end-
Notes to Editors

1. Identification and correction of abnormal, incomplete and mispredicted proteins in public databases
Alinda Nagy, Hedi Hegyi, Krisztina Farkas, Hedvig Tordai, Evelin Kozma, Laszlo Banyai and Laszlo Patthy
BMC Bioinformatics

During embargo, article available here: http://www.biomedcentral.com/imedia/9308204501966618_article.pdf?random=617420

After the embargo, article available at the journal website: http://www.biomedcentral.com/bmcbioinformatics/

Please name the journal in any story you write. If you are writing for the web, please link to the article. All articles are available free of charge, according to BioMed Central's open access policy.

Article citation and URL available on request at press@biomedcentral.com on the day of publication.

2. BMC Bioinformatics is an open access journal publishing original peer-reviewed research articles in all aspects of computational methods used in the analysis and annotation of sequences and structures, as well as all other areas of computational biology. BMC Bioinformatics (ISSN 1471-2105) is indexed/tracked/covered by PubMed, MEDLINE, BIOSIS, CAS, Scopus, EMBASE, Thomson Reuters (ISI) and Google Scholar.

3. BioMed Central (http://www.biomedcentral.com/) is an independent online publishing house committed to providing immediate access without charge to the peer-reviewed biological and medical research it publishes. This commitment is based on the view that open access to research is essential to the rapid and efficient communication of science.

BioMed Central
