To find the right network model, compare all possible histories

January 25, 2021

Two family members test positive for COVID-19 -- how do we know who infected whom? In a perfect world, network science could provide a probable answer to such questions. It could also tell archaeologists how a shard of Greek pottery came to be found in Egypt, or help evolutionary biologists understand how a long-extinct ancestor metabolized proteins.

In the real world, scientists rarely have the historical data they need to see exactly how nodes in a network became connected. But a new paper published in Physical Review Letters offers hope for reconstructing the missing information, using a new method to evaluate the rules that generate network models.

"Network models are like impressionistic pictures of the data," says physicist George Cantwell, one of the study's authors and a postdoctoral researcher at the Santa Fe Institute. "And there have been a number of debates about whether the real networks look enough like these models for the models to be good or useful."

Normally when researchers try to model a growing network -- say, a group of individuals infected with a virus -- they build up the model network from scratch, following a set of mathematical instructions to add a few nodes at a time. Each node could represent an infected individual, and each edge a connection between those individuals. When the clusters of nodes in the model resemble the data drawn from the real-world cases, the model is considered to be representative -- a problematic assumption when the same pattern can result from different sets of instructions.
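To make the idea of "a set of mathematical instructions" concrete, here is a minimal Python sketch of growing a network one node at a time. The two rules shown (uniform and degree-proportional attachment) and the function name `grow_tree` are illustrative assumptions, not rules taken from the paper.

```python
import random

def grow_tree(n_nodes, rule, seed=0):
    """Grow a tree one node at a time; each newcomer attaches to one existing node.

    rule="uniform":      choose the existing node uniformly at random.
    rule="preferential": choose it with probability proportional to its degree.
    Both rules are illustrative assumptions, not the paper's models.
    """
    rng = random.Random(seed)
    edges = [(0, 1)]             # start from a single seed edge
    degree = {0: 1, 1: 1}
    for new in range(2, n_nodes):
        nodes = list(degree)
        if rule == "uniform":
            target = rng.choice(nodes)
        elif rule == "preferential":
            weights = [degree[v] for v in nodes]
            target = rng.choices(nodes, weights=weights)[0]
        else:
            raise ValueError(f"unknown rule: {rule}")
        edges.append((target, new))  # (existing node, newcomer)
        degree[target] += 1
        degree[new] = 1
    return edges

uniform_edges = grow_tree(10, "uniform")
preferential_edges = grow_tree(10, "preferential")
```

Different rules can produce networks that look alike in a snapshot, which is why comparing summary features alone can mislead.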

Cantwell and co-authors Guillaume St-Onge (Université Laval, Quebec) and Jean-Gabriel Young (University of Vermont) wanted to bring a shot of statistical rigor to the modeling process. Instead of comparing features from a snapshot of the network model against the features from the real-world data, they developed methods to calculate the probability of each possible history for a growing network. Given competing sets of rules, which could represent real-world processes such as contact, droplet, or airborne transmission, the authors can apply their new tool to determine the probability that a specific set of rules produced the observed pattern.

"Instead of just asking 'does this picture look more like the real thing?'" Cantwell says, "we can now ask material questions like, 'did it grow by these rules?'" Once the most likely network model is found, it can be rewound to answer questions such as who was infected first.
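For a very small network, the idea of comparing all possible histories can be sketched by brute force. The Python below is not the authors' algorithm: it simply enumerates every arrival order of a four-node star, assumes the same two illustrative rules (uniform and degree-proportional attachment) with equal prior weight, and uses hypothetical node names A to D.

```python
from fractions import Fraction
from itertools import permutations

def history_probability(order, adjacency, rule):
    """Probability that a tree grew in this arrival order under `rule` (0 if impossible)."""
    degree = {order[0]: 0}
    prob = Fraction(1)
    for step, new in enumerate(order[1:], start=1):
        parents = [v for v in adjacency[new] if v in degree]
        if len(parents) != 1:    # newcomer must attach to exactly one earlier node
            return Fraction(0)
        parent = parents[0]
        if step > 1:             # the second node joins the first with certainty
            if rule == "uniform":
                prob *= Fraction(1, step)  # `step` nodes already exist
            else:                          # "preferential": proportional to degree
                prob *= Fraction(degree[parent], sum(degree.values()))
        degree[parent] += 1
        degree[new] = 1
    return prob

def compare_rules(edges, rules=("uniform", "preferential")):
    """Sum over all histories; assumes an equal prior over the candidate rules."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    weight = {r: Fraction(0) for r in rules}
    first = {r: {v: Fraction(0) for v in adjacency} for r in rules}
    for order in permutations(adjacency):
        for r in rules:
            p = history_probability(order, adjacency, r)
            weight[r] += p
            first[r][order[0]] += p
    total = sum(weight.values())
    rule_posterior = {r: weight[r] / total for r in rules}
    first_posterior = {
        r: {v: p / weight[r] for v, p in first[r].items()} for r in rules
    }
    return rule_posterior, first_posterior

# Observed network: a star with hub "A" and three leaves (hypothetical data).
star = [("A", "B"), ("A", "C"), ("A", "D")]
rule_posterior, first_posterior = compare_rules(star)
# rule_posterior: uniform 2/5, preferential 3/5
# first_posterior under either rule: A 1/2, each leaf 1/6
```

Enumeration like this scales factorially with the number of nodes, so it only illustrates the question being asked; practical use on larger networks needs the kind of dedicated methods the paper develops.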

In their current paper, the authors demonstrate their algorithm on three simple networks that correspond to previously documented datasets with known histories. They are now working to apply the tool to more complicated networks, which could find applications across any number of complex systems.

Santa Fe Institute
