ALCF helps tackle the Large Hadron Collider's big data challenge

November 03, 2015

Argonne physicists are using Mira to perform simulations of Large Hadron Collider (LHC) experiments with a leadership-class supercomputer for the first time, shedding light on a path forward for interpreting future LHC data. Researchers at the Argonne Leadership Computing Facility (ALCF) helped the team optimize their code for the supercomputer, which has enabled them to simulate billions of particle collisions faster than ever before.

At CERN's Large Hadron Collider, the world's most powerful particle accelerator, scientists initiate millions of particle collisions every second in their quest to understand the fundamental structure of matter.

With each collision producing about a megabyte of data, the facility, located on the border of France and Switzerland, generates a colossal volume of it. Even after filtering out about 99 percent of the raw output, scientists are left with around 30 petabytes (or 30 million gigabytes) each year to analyze for a wide range of physics experiments, including studies on the Higgs boson and dark matter.
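As a rough sanity check of those figures (a sketch only, and it assumes continuous year-round recording, which real run schedules do not achieve), the retained 30 petabytes correspond to roughly a thousand 1-megabyte events recorded per second:

```python
# Back-of-the-envelope check of the article's figures.
# Assumption (ours, not the article's): continuous year-round recording.
PETABYTE = 1e15
stored_per_year = 30 * PETABYTE   # ~30 PB retained annually (article figure)
seconds_per_year = 3.15e7         # one calendar year, in seconds
bytes_per_event = 1e6             # ~1 MB per collision event (article figure)

events_per_second = stored_per_year / (seconds_per_year * bytes_per_event)
print(f"~{events_per_second:.0f} recorded events per second")
```

That rate is far below the millions of collisions initiated each second, which underscores how aggressive the filtering has to be before anything is written to disk.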

To help tackle the considerable challenge of interpreting all this data, researchers from the U.S. Department of Energy's (DOE's) Argonne National Laboratory are demonstrating the potential of simulating collision events with Mira, a 10-petaflops IBM Blue Gene/Q supercomputer at the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science User Facility.

"Simulating the collisions is critical to helping us understand the response of the particle detectors," said principal investigator Tom LeCompte, an Argonne physicist and the former physics coordinator for the LHC's ATLAS experiment, one of four particle detectors at the facility. "Differences between the simulated data and the experimental data can lead us to discover signs of new physics."

This marks the first time a leadership-class supercomputer has been used to perform massively parallel simulations of LHC collision events. The effort has been a great success thus far, showing that such supercomputers can help drive future discoveries at the LHC by accelerating the pace at which simulated data can be produced. The project also demonstrates how leadership computing resources can be used to inform and facilitate other data-intensive high energy physics experiments.

Since 2002, LHC scientists have relied on the Worldwide LHC Computing Grid for all their data processing and simulation needs. Linking thousands of computers and storage systems across 41 countries, this international distributed computing infrastructure allows data to be accessed and analyzed in near real-time by an international community of more than 8,000 physicists collaborating among the four major LHC experiments.

"Grid computing has been very successful for LHC, but there are some limitations on the horizon," LeCompte said. "One is that some LHC event simulations are so complex that it would take weeks to complete them. Another is that the LHC's computing needs are set to grow by at least a factor of 10 in the next several years."

To investigate the use of supercomputers as a possible tool for the LHC, LeCompte applied for and received computing time at the ALCF through DOE's Advanced Scientific Computing Research Leadership Computing Challenge. His project is focused on simulating ATLAS events that are difficult to simulate with the computing grid.

While the LHC's big data challenge seems like a natural fit for one of the fastest supercomputers in the world, it took extensive work to adapt an existing LHC simulation method for Mira's massively parallel architecture.

With help from ALCF researchers Tom Uram, Hal Finkel, and Venkat Vishwanath, the Argonne team transformed ALPGEN, a Monte Carlo-based application that generates events in hadronic collisions, from a single-threaded simulation code into massively multithreaded code that could run efficiently on Mira. By improving the code's I/O performance and reducing its memory usage, they were able to scale ALPGEN to the full Mira system and make it run 23 times faster than it initially did. The code optimization work has enabled the team to routinely simulate millions of LHC collision events in parallel.

"By running these jobs on Mira, they completed two years' worth of ALPGEN simulations in a matter of weeks, and the LHC computing grid became correspondingly free to run other jobs," Uram said.

Throughout the course of the project, the team's simulations have equated to about 9 percent of the annual computing done by the ATLAS experiment. Ultimately, this effort is helping to accelerate the science that depends on these simulations.

"The datasets we've generated are important, and we would have made them anyway, but now we have them in our hands about a year and a half sooner," LeCompte said. "That, in turn, will help us get more results to conferences and publications at an earlier time."

As supercomputers like Mira get better integrated into the LHC's workflow, LeCompte believes a much larger fraction of simulations could eventually be shifted to high-performance computers. To help move the LHC in that direction, his team plans to increase the range of codes capable of running on Mira, with the next candidates being Sherpa, another event generation code, and Geant4, a code for simulating the passage of particles through matter.

"We also plan to help other high energy physics groups use leadership supercomputers like Mira," LeCompte said. "Our experience is that it takes a year or so to get to the minimum partition size, and another year to run at scale."
This research is supported by the DOE Office of Science's High Energy Physics program. Computing time at the ALCF was allocated through the DOE Office of Science's Advanced Scientific Computing Research program.

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation's first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America's scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy's Office of Science.

The U.S. Department of Energy's Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the Office of Science website.

DOE/Argonne National Laboratory
