Managing the data deluge for national security analysts

November 17, 2015

ALBUQUERQUE, N.M. -- After a disaster or national tragedy, investigators often find bits of information buried in vast amounts of available data that might have mitigated or even prevented the event, had they been recognized in time.

In this information age, national security analysts often find themselves searching for a needle in a haystack. The available data is growing much faster than analysts' ability to observe and process it. Sometimes they can't make key connections, and often they are overwhelmed as they struggle to use data for predictions and forensics.

Sandia National Laboratories' Pattern Analytics to Support High-Performance Exploitation and Reasoning (PANTHER) team has made a number of breakthroughs that could help solve these problems. They're developing solutions that will enable analysts to work smarter, faster and more effectively when looking at huge, complex amounts of data in real-time, stressful environments where the consequences might be life or death.

PANTHER's accomplishments include rethinking how to compare motion and trajectories; developing software that can represent remote sensor images, couple them with additional information and present them in a searchable form; and conducting fundamental research on visual cognition, said Kristina Czuchlewski, PANTHER's principal investigator and manager of Sandia's Intelligence Surveillance and Reconnaissance Systems Engineering and Decision Support.

The PANTHER team looked at raw data and ways to pre-process and analyze it to make it searchable and more meaningful. The project's fundamental research in cognitive science will inform the design of software and tools to help those viewing the data and make information of interest or trends easier to uncover.

PANTHER, which was funded by Sandia's Laboratory Directed Research & Development program, is gleaning deeper insights from complex data sets in minutes instead of months, and covering hundreds of square miles instead of dozens.

"PANTHER developed the foundation for transforming how massive, complex data sets can be quickly analyzed to provide the nation's decision-makers with new perspectives on situations and circumstances," said Anthony Medina, director of Sandia's Radio Frequency & Electronic Systems Center. "If an analyst is collecting information on a specific location over time and learns that something of interest might be occurring there, they probably don't have the tools they need to quickly gather and analyze information from all relevant data sets that might corroborate the forecast. But PANTHER is probably the nation's best bet right now to get to that point quickly."

Tracktable code automates observation of motion, trajectories

Mark Rintoul, a Sandia data scientist, developed the Tracktable code along with Sandia researcher Andy Wilson and others to automate the observation of motion and trajectories. The code could be applied to any problem that involves movement, such as airliners, ships or people.

Current approaches to getting meaningful information from trajectories focus on comparing one trajectory to another. If you have millions of trajectories to consider, that could mean trillions of comparisons, which takes a lot of time and computer power, Rintoul said.
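
The scale Rintoul describes follows from simple pair counting. The short Python sketch below is illustrative only and is not part of Tracktable; the collection size is a hypothetical figure. It counts the distinct pairs among n trajectories and contrasts that with a method that touches each trajectory once, which is an assumption about how a summary-based approach would scale rather than a description of Sandia's software.

```python
def pairwise_comparisons(n: int) -> int:
    """Distinct pairs among n trajectories (n choose 2)."""
    return n * (n - 1) // 2


def one_pass_operations(n: int) -> int:
    """Work for a method that summarizes each trajectory once (assumed model)."""
    return n


n = 2_000_000  # hypothetical number of trajectories
print(f"pairwise: {pairwise_comparisons(n):,}")  # about 2 trillion
print(f"one pass: {one_pass_operations(n):,}")   # 2 million
```

At two million trajectories, comparing every pair takes roughly a million times more operations than a single pass over the collection.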

"We've developed a way to store and represent trajectories so that computers can compare them all at once in a very fast and effective manner," he said. Instead of trillions of comparisons, the software does the same job in millions of comparisons, which is manageable.

An analyst concerned about the number of airliners stuck in holding patterns could ask Tracktable about aircraft trajectories that made a certain pattern of turns. Tracktable then calculates geometric features, such as the number of 90-degree turns an aircraft flew or the length of a straight line. By associating a type of motion with these features and assigning a number to each feature, the computer can quickly group flights that behave in similar ways and show them to the viewer for interpretation.
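
A minimal sketch of that idea, written in plain Python rather than with Tracktable's own API, might look like the following. The function names, the flat x-y coordinates, the 15-degree tolerance and the one-unit length bins are all assumptions chosen for illustration. Each trajectory is reduced to two numbers, the count of roughly 90-degree turns and the longest straight run, and flights with matching numbers land in the same group without any flight being compared directly with another.

```python
import math
from collections import defaultdict


def heading(p, q):
    """Direction of travel from p to q, in degrees."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))


def turn_angle(h1, h2):
    """Smallest signed change between two headings, in degrees."""
    return (h2 - h1 + 180.0) % 360.0 - 180.0


def features(points, turn_tol=15.0):
    """Return (number of ~90-degree turns, longest straight run)."""
    segs = list(zip(points, points[1:]))
    headings = [heading(p, q) for p, q in segs]
    lengths = [math.dist(p, q) for p, q in segs]
    right_turns = 0
    run = longest = lengths[0] if lengths else 0.0
    for i in range(1, len(segs)):
        turn = abs(turn_angle(headings[i - 1], headings[i]))
        if abs(turn - 90.0) <= turn_tol:
            right_turns += 1
        # Extend the current straight run, or start a new one after a turn.
        run = run + lengths[i] if turn <= turn_tol else lengths[i]
        longest = max(longest, run)
    return right_turns, longest


def group_by_features(trajectories, length_bin=1.0):
    """Bucket trajectories by feature values; one pass, no pairwise comparison."""
    groups = defaultdict(list)
    for name, pts in trajectories.items():
        turns, longest = features(pts)
        groups[(turns, int(longest // length_bin))].append(name)
    return dict(groups)


# Hypothetical flights: two box-shaped holding patterns and one direct route.
flights = {
    "hold_a": [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
    "hold_b": [(5, 5), (6, 5), (6, 6), (5, 6), (5, 5)],
    "direct": [(0, 0), (1, 0), (2, 0), (3, 0)],
}
groups = group_by_features(flights)
print(groups)
```

In this toy data the two holding patterns share a group and the direct route sits alone, which is the kind of coarse grouping an analyst could then inspect.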

"If you have millions and you're not interested in precise comparisons, but general groupings of them, this is very effective," Rintoul said.

PANTHER also examined the predictive capability of the information buried in data. If an analyst looks at the first half of a flight, considers historical data about similar flight paths and then looks at the second half of the flight, any deviation from the pattern might cue an analyst to take a closer look. Finding that outlier from millions of flights that have flown before takes about a second with Tracktable, Rintoul said. The analyst is alerted because PANTHER team members are using the advances in cognitive science to design visual results that will highlight the odd behavior of the single aircraft. By studying how analysts use visual data, Sandia researchers are figuring out ways to make an outlier pop out of a screen full of detail to demand an analyst's attention.
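
The first-half, second-half comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Tracktable's method: each half of a flight is reduced to a single hypothetical feature value, similar flights are found with a plain linear scan, and the 3-standard-deviation cutoff is an arbitrary choice. The one-second search over millions of flights reported above would require indexing that this sketch does not attempt.

```python
import statistics


def is_outlier(history, first_half, second_half, radius=1.0, z_cut=3.0):
    """Flag a flight whose second half departs from similar past flights.

    history is a list of (first_half_feature, second_half_feature) pairs.
    """
    # Past flights whose first half resembles the new flight's first half.
    peers = [s for f, s in history if abs(f - first_half) <= radius]
    if len(peers) < 2:
        return False  # too little history to judge
    mean = statistics.fmean(peers)
    spread = statistics.stdev(peers)
    if spread == 0:
        return second_half != mean
    return abs(second_half - mean) / spread > z_cut


# Hypothetical history: five similar flights plus one unrelated route.
history = [(10.0, 100.0 + d) for d in (-2, -1, 0, 1, 2)] + [(50.0, 300.0)]

print(is_outlier(history, 10.2, 101.0))  # second half matches the pattern
print(is_outlier(history, 10.2, 140.0))  # second half deviates sharply
```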

The team is now looking at integrating motion and trajectories into a system called GeoGraphy.

GeoGraphy helps analysts search for items of interest, shows changes over time

GeoGraphy, initially funded by the National Nuclear Security Administration, is a software system that converts remote sensing images expressed in pixels into nodes and edges in a graph to show changes over time and make the data searchable, said Randy Brost, a Sandia computer scientist who led the team that developed the software. Nodes are analogous to the beige hubs in Tinkertoys, while edges are the colored connecting rods.

GeoGraphy breaks the images into categories, such as buildings, trees or rivers. This pre-processing creates a graphic resembling a complex paint-by-number that shows the categories of everything in the image. The program uses nodes and edges to describe relationships between objects, such as distance or time, Brost and Czuchlewski said.

In addition to the imagery, the software package could include such information as phone books or county records, producing a single searchable database of all the information that shows what's changed over time.

For example, to find a high school, the analyst tells the program to search for large buildings near regions that look like parking lots, football fields and tennis courts and defines those items. The analyst then can choose from among the results the computer provides.
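
The high school search can be pictured as a query over a small graph. The Python sketch below is not GeoGraphy's code or data model. The Node fields, category labels, distances in meters and thresholds are all hypothetical. It links labeled regions that lie close together and then looks for large buildings whose neighbors include every required category.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    category: str
    x: float      # position in meters (assumed flat coordinates)
    y: float
    area: float   # footprint in square meters


def build_edges(nodes, max_dist):
    """Connect every pair of nodes no farther apart than max_dist."""
    edges = {n.name: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.hypot(a.x - b.x, a.y - b.y)
            if d <= max_dist:
                edges[a.name].append((b, d))
                edges[b.name].append((a, d))
    return edges


def find_candidates(nodes, edges, min_area, required):
    """Large buildings whose neighbors cover all required categories."""
    hits = []
    for n in nodes:
        if n.category != "building" or n.area < min_area:
            continue
        nearby = {nbr.category for nbr, _ in edges[n.name]}
        if required <= nearby:
            hits.append(n.name)
    return hits


nodes = [
    Node("b1", "building", 0, 0, 5000),
    Node("p1", "parking_lot", 50, 0, 3000),
    Node("f1", "football_field", 0, 80, 6000),
    Node("t1", "tennis_court", -60, 0, 600),
    Node("b2", "building", 1000, 1000, 8000),   # large, but only a parking lot nearby
    Node("p2", "parking_lot", 1050, 1000, 4000),
    Node("b3", "building", 20, 20, 200),        # near everything, but too small
]
edges = build_edges(nodes, max_dist=150)
wanted = {"parking_lot", "football_field", "tennis_court"}
candidates = find_candidates(nodes, edges, min_area=2000, required=wanted)
print(candidates)
```

Only the first building satisfies every condition here. As in the workflow described above, the result is a list of candidates for the analyst to review rather than a final answer.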

The system is hierarchical, so once analysts identify high schools, they can ask the program to find high schools the next time without describing them. And should they doubt that something is a high school, the software makes the raw data available so they can verify the results, Brost said.

"The purpose of these codes -- GeoGraphy and Tracktable -- is to assist humans, not to replace them or to automatically do their jobs. It's to enhance their ability to do their jobs well and to allow them to be more effective in dealing with large sets of evidence," Brost said. "In the end, basically they are suggestion systems that say, 'Hey, based on what you told me you're interested in, you ought to look here, here and here.'"

The PANTHER team also included researchers focused on enhancing the viewer experience. Researcher Laura Matzen and others are conducting cognitive science experiments to learn how analysts' expertise affects their visual cognition and to create a model of how top-down visual attention -- when a user approaches an image with a goal in mind -- works. The researchers hope to use the answers they find to such fundamental cognitive science questions to inform the design of new tools that will improve interactions between humans and computers, Matzen said.

The prototype products and ideas developed under PANTHER are ready for the next step in their development: to be tested in real-world environments, Czuchlewski said.

Sandia researchers have proposed research into new problems illuminated by PANTHER, while other agencies are solidifying the foundation PANTHER has developed. Other projects will use PANTHER's ideas to address real-world problems, the researchers said.

"We went into PANTHER thinking we were going to do one thing, we're going to improve the lives of image analysts," Czuchlewski said. "And, in the research process, we did a whole lot more."

Sandia National Laboratories is a multi-program laboratory operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corp., for the U.S. Department of Energy's National Nuclear Security Administration. With main facilities in Albuquerque, N.M., and Livermore, Calif., Sandia has major R&D responsibilities in national security, energy and environmental technologies and economic competitiveness.

DOE/Sandia National Laboratories
