Researchers use large language models to discover recipes for novel materials

Advances in artificial intelligence promise to help chemical engineers discover complex new materials. These materials could be used for reactions such as turning carbon dioxide into fuel, but technical barriers have so far limited the adoption of AI in catalysis. Researchers at the University of Rochester are now harnessing the benefits of large language models (LLMs) similar to ChatGPT, Claude, or Gemini to empower more researchers to use AI to discover new materials and accelerate experiment workflows.

In a study published in ACS Central Science, a team led by Marc Porosoff, an associate professor in the Department of Chemical and Sustainability Engineering, and Andrew White, visiting associate professor and the cofounder and chief technology officer of Edison Scientific, describes an AI-based method they developed that allows users to input natural language prompts about the materials they want to create and that suggests optimal procedures for experiments to produce them. As the users run the experiments, they input the results back into the AI model and continue iterating until they reach their goal.

“We’re able to leverage the pre-trained knowledge of large language models and well-established statistical methods for materials discovery to help us as researchers navigate large experimental design spaces more efficiently,” says Porosoff.

Porosoff likens the new AI method to describing a cup of coffee, noting that someone could describe the coffee by its taste, color, and aroma, or by the type of beans, grind size, apparatus, and water temperature used to make the brew. Both representation methods describe the same cup of coffee, but the second approach gives a recipe that others can easily follow to reproduce it.

Porosoff and his team are applying the same principle to catalysts for energy applications, using language-based representations to describe materials not just by their properties, but by the steps needed to create them.

To build on their success, the US Department of Energy Advanced Research Projects Agency-Energy (ARPA-E) announced it will provide nearly $3 million in funding to apply the URochester team’s method toward creating catalysts for the production of fuel from abundant materials, specifically methanol and ethanol from carbon dioxide and hydrogen. Porosoff will lead a multi-institution project team that includes URochester, Virginia Polytechnic Institute and State University, Stanford University, Northwestern University, A*STAR Institute of Sustainability for Chemicals, Energy and Environment (ISCE2) in Singapore, and OxEon Energy, a small business based in Salt Lake City.

Traditional AI methods for materials discovery typically use a strategy called Bayesian optimization to identify and design the best candidates. But the result is complex numerical data about a material’s structure, which requires deep expertise to use effectively. The new LLM method instead produces a set of procedures that researchers can easily understand, execute, and verify to determine if the experiment’s output matches the predicted results.

This can be extremely useful for working with complex materials such as trimetallic catalysts, which are made of three metals.

“Our method reduces the technical barrier associated with using Bayesian optimization, which is a well-established method for efficiently exploring large and complicated parameter spaces,” says Shane Michtavy, a URochester chemical engineering PhD student who helped develop the AI method, synthesize materials, and run the chemical reactions described in the paper. “Using pre-trained LLMs allows users to explore using less data than traditional models, as they are deployed in a frozen state with built-in knowledge of the physical world and catalysis.”

The paper shows how the researchers applied the method to several live experiments, including one to identify catalysts for turning carbon dioxide and hydrogen into carbon monoxide and water using trimetallic catalysts made from low-cost metals. Porosoff says that there are about 360,000 possible experiments that could have been run to find the ideal catalyst, but by using procedures produced by the AI model and providing it with the results from the experiments, they were able to find an ideal candidate in just ten experiments.
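The propose, run, and report cycle described here can be illustrated with a minimal Python sketch. It is not the team's implementation: `Campaign`, `propose`, `run_campaign`, and the toy recipe scores are hypothetical names and placeholder values invented for illustration, and `propose` simply samples at random where a real system would send the text prompt to a language model. What the sketch does show is the shape of the loop, in which the goal and every past result are kept as plain text and each new result is appended before the next suggestion is requested.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Campaign:
    """Text record of the goal plus every (procedure, result) pair so far."""
    goal: str
    history: list = field(default_factory=list)

    def prompt(self) -> str:
        # Serialize the goal and all past results as plain text, the kind of
        # context an in-context learner would read instead of feature vectors.
        lines = [f"Goal: {self.goal}"]
        for procedure, result in self.history:
            lines.append(f"Procedure: {procedure} -> Result: {result:.2f}")
        lines.append("Suggest the next procedure to try.")
        return "\n".join(lines)


def propose(prompt: str, untested: list, rng: random.Random) -> str:
    """Hypothetical stand-in for a language-model call.

    A real system would send `prompt` to a model; this placeholder ignores
    the prompt and picks an untested procedure at random.
    """
    return rng.choice(untested)


def run_campaign(goal, candidates, run_experiment, budget, seed=0):
    """Iterate: ask for a procedure, run it, record the result as text."""
    rng = random.Random(seed)
    campaign = Campaign(goal)
    for _ in range(min(budget, len(candidates))):
        tried = {p for p, _ in campaign.history}
        untested = [c for c in candidates if c not in tried]
        procedure = propose(campaign.prompt(), untested, rng)
        result = run_experiment(procedure)  # in practice, a lab measurement
        campaign.history.append((procedure, result))
    return campaign


# Arbitrary placeholder scores for illustration only, not measurements.
TOY_SCORES = {"recipe A": 0.2, "recipe B": 0.7, "recipe C": 0.4, "recipe D": 0.9}

campaign = run_campaign(
    goal="maximize yield (toy objective)",
    candidates=list(TOY_SCORES),
    run_experiment=lambda procedure: TOY_SCORES[procedure],
    budget=3,
)
print(campaign.prompt())
```

Swapping the random placeholder in `propose` for a model that actually reads the prompt is the step that would let a search concentrate on promising procedures rather than sampling blindly.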

The study was supported by funding from the National Science Foundation, the National Institutes of Health, and the US Department of Energy. Additional authors included Mayk Caldas, technical staff at Edison Scientific.

Now that they have shown the model works as a proof of concept in the lab, Porosoff aims to take the method further using the funding announced through ARPA-E’s Catalytic Application Testing for Accelerated Learning Chemistries via High-throughput Experimentation and Modeling Efficiently (CATALCHEM-E) program.

“Right now, it takes a decade or longer to go from conceptualizing a new catalyst to testing it in a lab to putting it in a real reactor,” says Porosoff. “The CATALCHEM-E program aims to cut that by an order of magnitude to a single year, and we think using AI with text-based representations will be a big factor in shortening the development cycle.”

Porosoff and his collaborators will first demonstrate their workflow on carbon dioxide-to-methanol and then extend the process to higher alcohols such as ethanol, which is a key additive for gasoline and is used in pharmaceuticals, cosmetics, and many other applications. Ultimately, they hope to commercially deploy the model for industries to create catalysts to synthesize alcohols for fuel.

The project is scheduled to begin in July and run through 2029. See a full list of CATALCHEM-E programs on the ARPA-E website.

ACS Central Science

10.1021/acscentsci.5c02418

Bayesian Optimization of Catalysis with In-Context Learning

14-Apr-2026
