Study shows widely used machine learning methods don't work as claimed

March 16, 2020

Models and algorithms for analyzing complex networks are widely used in research and affect society at large through their applications in online social networks, search engines, and recommender systems. According to a new study, however, one widely used algorithmic approach for modeling these networks is fundamentally flawed, failing to capture important properties of real-world complex networks.

"It's not that these techniques are giving you absolute garbage. They probably have some information in them, but not as much information as many people believe," said C. "Sesh" Seshadhri, associate professor of computer science and engineering in the Baskin School of Engineering at UC Santa Cruz.

Seshadhri is first author of a paper on the new findings published March 2 in Proceedings of the National Academy of Sciences. The study evaluated techniques known as "low-dimensional embeddings," which are commonly used as input to machine learning models. This is an active area of research, with new embedding methods being developed at a rapid pace. But Seshadhri and his coauthors say all these methods share the same shortcomings.

To explain why, Seshadhri used the example of a social network, a familiar type of complex network. Many companies apply machine learning to social network data to generate predictions about people's behavior, recommendations for users, and so on. Embedding techniques essentially convert a person's position in a social network into a set of coordinates for a point in a geometric space, yielding a list of numbers for each person that can be plugged into an algorithm.

"That's important because something abstract like a persons 'position in a social network' can be converted to a concrete list of numbers. Another important thing is that you want to convert this into a low-dimensional space, so that the list of numbers representing each person is relatively small," Seshadhri explained.

Once this conversion has been done, the system ignores the actual social network and makes predictions based on the relationships between points in space. For example, if a lot of people close to you in that space are buying a particular product, the system might predict that you are likely to buy the same product.

Seshadhri and his coauthors demonstrated mathematically that significant structural aspects of complex networks are lost in this embedding process. They also confirmed this result by empirically by testing various embedding techniques on different kinds of complex networks.

"We're not saying that certain specific methods fail. We're saying that any embedding method that gives you a small list of numbers is fundamentally going to fail, because a low-dimensional geometry is just not expressive enough for social networks and other complex networks," Seshadhri said.

A crucial feature of real-world social networks is the density of triangles, or connections between three people.

"Where you have lots of triangles, it means there is a lot of community structure in that part of a social network," Seshadhri said. "Moreover, these triangles are even more significant when you're looking at people who have limited social networks. In a typical social network, some people have tons of connections, but most people don't have a lot of connections."

In their analysis of embedding techniques, the researchers observed that a lot of the social triangles representing community structure are lost in the embedding process. "All of this information seems to disappear, so it's almost like the very thing you wanted to find has been lost when you construct these geometric representations," Seshadhri said.

Low-dimensional embeddings are by no means the only methods being used to generate predictions and recommendations. They are typically just one of many inputs into a very large and complex machine learning model.

"This model is a huge black box, and a lot of the positive results being reported say that if you include these low-dimensional embeddings, your performance goes up, maybe you get a slight bump. But if you used it by itself, it seems you would be missing a lot," Seshadhri said.

He also noted that new embedding methods are mostly being compared to other embedding methods. Recent empirical work by other researchers, however, shows that different techniques can give better results for specific tasks.

"Let's say you want to predict who's a Republican and who's a Democrat. There are techniques developed specifically for that task which work better than embeddings," he said. "The claim is that these embedding techniques work for many different tasks, and that's why a lot of people have adopted them. It's also very easy to plug them into an existing machine learning system. But for any particular task, it turns out there is always something better you can do."

Given the growing influence of machine learning in our society, Seshadhri said it is important to investigate whether the underlying assumptions behind the models are valid.

"We have all these complicated machines doing things that affect our lives significantly. Our message is just that we need to be more careful about evaluating these techniques," he said. "Especially in this day and age when machine learning is getting more and more complicated, it's important to have some understanding of what can and cannot be done."
In addition to Seshadhri, the coauthors of the paper include Aneesh Sharma at Google, UCSC graduate student Andrew Stolman, and Ashish Goel at Stanford University. This work was funded by the National Science Foundation and the Army Research Office.

University of California - Santa Cruz

Related Social Networks Articles from Brightsurf:

AI methods of analyzing social networks find new cell types in tissue
In situ sequencing enables gene activity inside body tissues to be depicted in microscope images.

Teen social networks linked to adult depression
Teens who have a larger number of friends may be less likely to suffer from depression later in life, especially women, a new MSU research study has found.

Drexel study: Measuring social networks of young adults with autism
While social isolation is a core challenge associated with autism, researchers from Drexel University's A.J.

Study suggests optimal social networks of no more than 150 people
New rules of engagement on the battlefield will require a deep understanding of networks and how they operate according to new Army research.

Social networks can support academic success
Social networks have been found to influence academic performance: students tend to perform better with high-performers among their friends, as some people are capable of inspiring others to try harder, according to Sofia Dokuka, Dilara Valeyeva and Maria Yudkevich of the HSE University.

Brain builds and uses maps of social networks, physical space, in the same way
Even in these social-distanced days, we keep in our heads a map of our relationships with other people: family, friends, coworkers and how they relate to each other.

Twitter fight: Birds use social networks to pick opponents wisely
In a new article published in the journal Current Opinion in Psychology, UC biologist Elizabeth Hobson says animals such as monk parakeets seem to understand where they fit in a dominance hierarchy and pick their fights accordingly.

Study questions benefits of social networks to disaster response
Faced with a common peril, people delay making decisions that might save lives, fail to alert each other to danger and spread misinformation.

'McDonaldization' based analysis of Russian social networks
The author describes his concept this way: 'the principles of the fast-food restaurant are coming to dominate more and more sectors of recent'.

Hunter-gatherers facilitated a cultural revolution through small social networks
Hunter-gatherer ancestors, from around 300,000 years ago, facilitated a cultural revolution by developing ideas in small social networks, and regularly drawing on knowledge from neighbouring camps, suggests a new study by UCL and University of Zurich.

Read More: Social Networks News and Social Networks Current Events is a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to