An AI model developed to design proteins simulates 500 million years of protein evolution in developing new fluorescent protein

01.16.25 | American Association for the Advancement of Science (AAAS)

Guided by a multimodal generative language model called ESM3, Thomas Hayes and colleagues generated and synthesized a previously unknown bright fluorescent protein. Its sequence is so different from those of known fluorescent proteins that the researchers say creating it is equivalent to ESM3 simulating 500 million years of biological evolution. The model could provide a new way to “search” the space of possible proteins, both to better understand how naturally evolved proteins work and to develop novel proteins for medicine, environmental remediation, and a host of other applications.

ESM3 can reason over protein sequence, structure, and function by representing each of these as an alphabet of discrete tokens that can be combined in a single generative language model. This strategy differs from earlier protein language models, which were scaled only on protein sequences. The training data for ESM3 consists of 771 billion unique tokens created from 3.15 billion protein sequences, 236 million protein structures, and 539 million proteins with function annotations. ESM3 has been trained at scales of up to 98 billion parameters. The model is now available in public beta via an API, enabling scientists to engineer proteins programmatically or through interactive browser-based apps. Researchers can use the EvolutionaryScale Forge API through a free academic access tier, or use the code and weights of the open model.
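
The idea of representing sequence, structure, and function as separate alphabets of discrete tokens can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the three alphabets, the `ProteinTracks` class, and the `encode` and `mask_track` helpers are hypothetical and are not ESM3's actual vocabularies or code. Masking part of one track is shown only as one common way a generative model can be asked to fill in missing tokens from the other tracks.

```python
import random
from dataclasses import dataclass

# Hypothetical toy alphabets (assumptions for illustration, not ESM3's real vocabularies).
SEQUENCE_ALPHABET = list("ACDEFGHIKLMNPQRSTVWY")       # 20 standard amino acids
STRUCTURE_ALPHABET = ["H", "E", "C"]                   # helix / strand / coil stand-ins
FUNCTION_ALPHABET = ["none", "binding", "catalytic"]   # made-up per-residue labels
MASK = "<mask>"


def build_vocab(alphabet):
    """Map each token to an integer id and append one mask token."""
    vocab = {tok: i for i, tok in enumerate(alphabet)}
    vocab[MASK] = len(vocab)
    return vocab


VOCABS = {
    "sequence": build_vocab(SEQUENCE_ALPHABET),
    "structure": build_vocab(STRUCTURE_ALPHABET),
    "function": build_vocab(FUNCTION_ALPHABET),
}


@dataclass
class ProteinTracks:
    """One token per residue in each of the three tracks."""
    sequence: list
    structure: list
    function: list

    def __post_init__(self):
        n = len(self.sequence)
        if len(self.structure) != n or len(self.function) != n:
            raise ValueError("all tracks must have one token per residue")


def encode(protein):
    """Turn each track into a list of integer token ids."""
    return {
        track: [vocab[tok] for tok in getattr(protein, track)]
        for track, vocab in VOCABS.items()
    }


def mask_track(ids, track, fraction, rng):
    """Replace a random fraction of one track's ids with that track's mask id."""
    mask_id = VOCABS[track][MASK]
    n_mask = round(len(ids) * fraction)
    positions = set(rng.sample(range(len(ids)), n_mask))
    return [mask_id if i in positions else tok for i, tok in enumerate(ids)]


# Example: a three-residue toy protein with all three tracks aligned.
toy = ProteinTracks(
    sequence=list("MKV"),
    structure=["C", "H", "H"],
    function=["none", "binding", "none"],
)
encoded = encode(toy)
masked_sequence = mask_track(encoded["sequence"], "sequence", 1 / 3, random.Random(0))
print(encoded)
print(masked_sequence)
```

In this sketch a model would receive the aligned id lists for all three tracks and be asked to predict the masked positions, which is how a prompt that fixes structure or function could steer the generated sequence.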

Journal: Science

DOI: 10.1126/science.ads0018

Article title: Simulating 500 million years of evolution with a language model

Article publication date: 16-Jan-2025

Contact Information

Science Press Package Team
American Association for the Advancement of Science/AAAS
scipak@aaas.org

How to Cite This Article

APA:
American Association for the Advancement of Science (AAAS). (2025, January 16). An AI model developed to design proteins simulates 500 million years of protein evolution in developing new fluorescent protein. Brightsurf News. https://www.brightsurf.com/news/LVD9QZ5L/an-ai-model-developed-to-design-proteins-simulates-500-million-years-of-protein-evolution-in-developing-new-fluorescent-protein.html
MLA:
"An AI model developed to design proteins simulates 500 million years of protein evolution in developing new fluorescent protein." Brightsurf News, Jan. 16 2025, https://www.brightsurf.com/news/LVD9QZ5L/an-ai-model-developed-to-design-proteins-simulates-500-million-years-of-protein-evolution-in-developing-new-fluorescent-protein.html.