Research identifies key weakness in modern computer vision systems

July 30, 2018

PROVIDENCE, RI [Brown University] -- Computer vision algorithms have come a long way in the past decade. They've been shown to be as good as or better than people at tasks like categorizing dog or cat breeds, and they have the remarkable ability to identify specific faces out of a sea of millions.

But research by Brown University scientists shows that computers fail miserably at a class of tasks that even young children have no problem with: determining whether two objects in an image are the same or different. In a paper presented last week at the annual meeting of the Cognitive Science Society, the Brown team sheds light on why computers are so bad at these types of tasks and suggests avenues toward smarter computer vision systems.

"There's a lot of excitement about what computer vision has been able to achieve, and I share a lot of that," said Thomas Serre, associate professor of cognitive, linguistic and psychological sciences at Brown and the paper's senior author. "But we think that by working to understand the limitations of current computer vision systems as we've done here, we can really move toward new, much more advanced systems rather than simply tweaking the systems we already have."

For the study, Serre and his colleagues used state-of-the-art computer vision algorithms to analyze simple black-and-white images containing two or more randomly generated shapes. In some cases the objects were identical; sometimes they were the same but with one object rotated in relation to the other; sometimes the objects were completely different. The computer was asked to identify the same-or-different relationship.
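
As a rough illustration of this kind of stimulus, the following Python sketch builds a black-and-white image containing two random shapes and labels it same or different. It is not the researchers' actual generator: the function names, the 32-pixel canvas and the 8-pixel random-bitmap "shapes" are all assumptions made for the example, and the rotated condition is left out.

```python
import numpy as np

def random_shape(rng, size=8):
    """Return a random binary size x size patch (a hypothetical stand-in for the study's shapes)."""
    patch = rng.integers(0, 2, size=(size, size), dtype=np.uint8)
    patch[0, 0] = 1  # guarantee the patch is never empty
    return patch

def make_example(rng, same, canvas=32, size=8):
    """Place two patches on a blank canvas; the label is 1 for same, 0 for different."""
    first = random_shape(rng, size)
    if same:
        second = first.copy()
    else:
        second = random_shape(rng, size)
        while np.array_equal(first, second):  # redraw in the unlikely case of a match
            second = random_shape(rng, size)

    image = np.zeros((canvas, canvas), dtype=np.uint8)
    half = canvas // 2
    # One patch goes in the left half and one in the right half, so they never overlap.
    r1 = rng.integers(0, canvas - size + 1)
    c1 = rng.integers(0, half - size + 1)
    r2 = rng.integers(0, canvas - size + 1)
    c2 = rng.integers(half, canvas - size + 1)
    image[r1:r1 + size, c1:c1 + size] = first
    image[r2:r2 + size, c2:c2 + size] = second
    return image, int(same)

rng = np.random.default_rng(0)
image, label = make_example(rng, same=True)
```

A classifier trained on such pairs sees only the full image and the label, so it has to work out for itself where each shape is before it can compare them.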

The study showed that, even after hundreds of thousands of training examples, the algorithms were no better than chance at recognizing the appropriate relationship. The question, then, was why these systems are so bad at this task.

Serre and his colleagues suspected that the failure has something to do with the inability of these computer vision algorithms to individuate objects. When computers look at an image, they can't actually tell where one object in the image stops and the background, or another object, begins. They just see a collection of pixels that have similar patterns to collections of pixels they've learned to associate with certain labels. That works fine for identification or categorization problems, but falls apart when trying to compare two objects.

To show that this was indeed why the algorithms were breaking down, Serre and his team performed experiments that relieved the computer from having to individuate objects on its own. Instead of showing the computer two objects in the same image, the researchers showed the computer the objects one at a time in separate images. The experiments showed that the algorithms had no problem learning the same-or-different relationship as long as they didn't have to view the two objects in the same image.

The source of the problem in individuating objects, Serre says, is the architecture of the machine learning systems that power the algorithms. The algorithms use convolutional neural networks -- layers of connected processing units that loosely mimic networks of neurons in the brain. A key difference from the brain is that the artificial networks are exclusively "feed-forward" -- meaning information has a one-way flow through the layers of the network. That's not how the visual system in humans works, according to Serre.

"If you look at the anatomy of our own visual system, you find that there are a lot of recurring connections, where the information goes from a higher visual area to a lower visual area and back through," Serre said.
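
The contrast between one-way and recurrent information flow can be sketched in a few lines of Python. This is a toy illustration only, not the networks used in the study: it uses two small fully connected layers rather than convolutional ones, and the function names and the feedback matrix w_back are assumptions introduced for the example.

```python
import numpy as np

def relu(x):
    """Standard rectifier: negative activity is set to zero."""
    return np.maximum(x, 0.0)

def feed_forward(x, w_low, w_high):
    """One-way pass: input -> lower layer -> higher layer."""
    low = relu(w_low @ x)
    return relu(w_high @ low)

def with_feedback(x, w_low, w_high, w_back, steps=3):
    """Same two layers, plus a hypothetical feedback matrix w_back that sends
    the higher layer's activity back down to the lower layer on each step."""
    high = np.zeros(w_high.shape[0])
    for _ in range(steps):
        low = relu(w_low @ x + w_back @ high)  # bottom-up input plus top-down feedback
        high = relu(w_high @ low)
    return high

rng = np.random.default_rng(0)
x = rng.normal(size=4)            # toy input
w_low = rng.normal(size=(5, 4))   # input -> lower layer
w_high = rng.normal(size=(3, 5))  # lower layer -> higher layer
w_back = rng.normal(size=(5, 3))  # higher layer -> lower layer (feedback)

one_way = feed_forward(x, w_low, w_high)
recurrent = with_feedback(x, w_low, w_high, w_back, steps=3)
```

With a single step, or with w_back set to zero, the second function reduces to the first. The extra steps are what let activity in the higher layer influence how the lower layer processes the same input.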

While it's not clear exactly what those feedbacks do, Serre says, it's likely that they have something to do with our ability to pay attention to certain parts of our visual field and make mental representations of objects in our minds.

"Presumably people attend to one object, building a feature representation that is bound to that object in their working memory," Serre said. "Then they shift their attention to another object. When both objects are represented in working memory, your visual system is able to make comparisons like same-or-different."

Serre and his colleagues hypothesize that computers can't do anything like that because feed-forward neural networks don't allow for the kind of recurrent processing required for this individuation and mental representation of objects. It could be, Serre says, that making computer vision smarter will require neural networks that more closely approximate the recurrent nature of human visual processing.
-end-
Serre's co-authors on the paper were Junkyung Kim and Matthew Ricci. The research was supported by the National Science Foundation (IIS-1252951, 1644760) and DARPA (YFA N66001-14-1-4037).

Brown University