Computers with human-like vision could strengthen security and surveillance, UCLA researchers say

November 06, 2001

Unmanned military vehicles could distinguish cave entrances from shadows and locate other hazards if they had a sense of vision similar to that of humans, say researchers at UCLA's Henry Samueli School of Engineering and Applied Science.

A Pentagon spokesman was recently quoted concerning the difficulty of spotting caves in Afghanistan, saying, "From a cockpit perspective, a cave looks like nothing more than a shadow on the ground."

Stefano Soatto, assistant professor in UCLA's computer science department and head of the engineering school's vision lab, is studying how the human visual system works in order to pass the ability on to machines. "In practice, the human visual system is still by far the best around, but this may not be so for long," Soatto said.

Soatto's research team is examining how people use vision to interact with others and with their environment, and is designing systems that will allow computers to interact in similar ways.

"We use senses to build models of the world around us that allow us to walk through an unfamiliar environment and interact with it," Soatto said. "I want a machine to be able to do the same thing."

The projects under way at the UCLA Vision Lab all involve "dynamic vision," the ability of a computer to take in visual sensory information about its surroundings and use what it "sees" of its changing environment to perform assigned tasks, such as exploring underground bunkers or monitoring bank vaults.

As Soatto explains, "The world has certain physical properties -- shape, motion, material properties of objects and so forth. Humans have developed, over the course of evolution, a particular way of representing their environment that has been crucial for them to survive."

Machines, especially computers, can also be made to interpret the physical world and interact with it, whether that environment is inside a nuclear reactor or on the operating table.

Soatto is talking about much more than simple photography or video. "We know how to build cameras to capture images, we know how to build computers to crunch numbers, and we know how to build robots that move and perform pre-assigned tasks," Soatto said. "However, we still do not know how to put everything together and endow a machine with a sense of vision."

For a computer to perform "real-world" tasks, it must do more than simply capture and analyze a photograph. Using only that information, a computer cannot distinguish a photograph of a scene from the scene itself. To interact with a changing environment, the computer needs to gain additional information about spatial properties of the environment -- shape, motion, distances, angles -- measurable properties you can only get as images change over time. Multiple points of view are needed, where either the scene or the viewer's perspective changes. Only then can a three-dimensional representation of the world be created.
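The reason a second viewpoint helps can be sketched with the textbook two-camera (stereo) relation, in which a point's depth is inversely proportional to how far its image shifts between the two views. This is only an illustration of the general principle, not the UCLA Vision Lab's method; the function name, the camera parameters and the sample numbers are all hypothetical, and the sketch assumes two identical, parallel pinhole cameras.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Estimate depth (in meters) of a point seen from two parallel viewpoints.

    Assumes an idealized pinhole camera model: depth = focal * baseline / disparity.
    focal_px     -- focal length in pixels (hypothetical value)
    baseline_m   -- distance between the two viewpoints in meters
    disparity_px -- horizontal shift of the point between the two images, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity_px


# A point that shifts a lot between the views is close; one that barely shifts is far.
near_depth = depth_from_disparity(focal_px=800.0, baseline_m=0.5, disparity_px=40.0)
far_depth = depth_from_disparity(focal_px=800.0, baseline_m=0.5, disparity_px=4.0)
print(near_depth, far_depth)
```

With a single image there is no shift to measure, so the depth term is simply unavailable, which is why one photograph cannot yield a three-dimensional representation.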

Consider face-recognition security systems. Sometimes used at banks, airports and even public events, these systems are designed to recognize and allow passage to certain people while denying entry to strangers. But the system can be fooled in ways that the human visual system cannot, Soatto said.

"Current systems capture one image of your face, match it to a database and can recognize it as yours and let you in. However, if an intruder shows up with a photograph of your face, the system would not be able to distinguish that 2-D photograph from your 3-D face, and would therefore let the intruder in." A computer with a true sense of vision would be able to tell the difference, says Soatto.
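One way to picture the difference between a photograph and a real face is to look at how much individual facial points shift when the camera moves slightly sideways. The sketch below is a simplified illustration, not the system Soatto describes: it assumes a photograph held flat and parallel to the camera, so all of its points shift by nearly the same amount, while a real face has a nose closer to the camera than the ears and so shows a spread of shifts. The function name, the tolerance and the sample values are hypothetical.

```python
def looks_flat(disparities_px: list[float], tolerance_px: float = 0.5) -> bool:
    """Return True if all tracked points shift by nearly the same amount.

    Assumption: a flat surface parallel to the camera produces (almost) equal
    shifts for every point, whereas a 3-D surface produces unequal shifts.
    """
    if len(disparities_px) < 2:
        raise ValueError("need at least two tracked points")
    spread = max(disparities_px) - min(disparities_px)
    return spread <= tolerance_px


# Hypothetical shifts (in pixels) for nose, eyes and ears between two views.
photo_shifts = [12.0, 12.1, 11.9, 12.0, 12.0]  # flat print: nearly uniform
face_shifts = [15.0, 13.2, 13.1, 11.0, 11.2]   # real face: nose shifts most

print(looks_flat(photo_shifts))
print(looks_flat(face_shifts))
```

A tilted photograph would break the equal-shift assumption, so a practical check would need to fit a plane rather than compare raw shifts; the sketch only conveys why more than one view is required.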

In July, Soatto's team was the first to demonstrate a computer system that could track an object's movement and shape in real time -- as it is happening. By capturing and processing images in real time, unmanned planes would know the difference between a shadow on a hill and the entrance to a cave, or between an airstrip and a long trench. The plane could react immediately to what it was seeing, rather than later, after the data had been analyzed. It also means a computer could do more than just pre-assigned tasks based on data collected at a certain moment; it would constantly update what it knows about its environment and truly interact with a changing world.

Soatto's team is also investigating how to enable a computer to recognize distinct human movements -- such as a person walking -- and predict whether the figure is a man, woman or child. In some cases the computer could even identify a person by gait, or predict age, even mood, distinguishing joyful skipping from suspicious skulking.

The UCLA Vision Lab's work may relieve humans of unwanted dangers or enable them to do things that would otherwise not be possible. "Think of everything that animals and humans do with vision and all the occasions when you may want a machine to do that instead: mowing your lawn, exploring unfamiliar areas or staying up all night to check that intruders are not in your building," Soatto said.

Soatto founded the UCLA Vision Lab in 2000, and it currently has eight graduate student researchers. For more information about the lab and its ongoing projects, log on to

University of California - Los Angeles
