
Citizen science projects have a surprising new partner -- the computer

February 06, 2019

For more than a decade, citizen science projects have let researchers draw on the power of thousands of volunteers to sort through datasets that are too large for a small research team. Previously, computers generally couldn't process this data because the work required skills that only humans had.

Now, machine learning techniques that teach computers specific image recognition skills can be used in crowdsourcing projects to handle rapidly growing amounts of data--making computers a surprising new partner in citizen science projects.

The research, led by the University of Minnesota-Twin Cities, was chosen as the cover story for the most recent issue of the British Ecological Society's scientific journal Methods in Ecology and Evolution.

In this study, data scientists and citizen science experts partnered with ecologists who often study wildlife populations by deploying camera traps. These camera traps are remote, independent devices, triggered by motion and infrared sensors, that provide researchers with images of passing animals. After collection, these images have to be classified according to the study's goals to produce useful ecological data for analysis.

"In the past, researchers asked citizen scientists to help them process and classify the images within a reasonable time-frame," said the study's lead author Marco Willi, a recent graduate of the University of Minnesota master's program in data science and researcher in the University's School of Physics and Astronomy. "Now, some of these recent camera trap projects have collected millions of images. Even with the help of citizen scientists, it could take years to classify all of the images. This new study is a proof of concept that machine learning techniques can help significantly reduce the time of classification."

Researchers used three datasets of images collected from Africa--Snapshot Serengeti, Camera CATalogue, and Elephant Expedition--and one dataset from Snapshot Wisconsin with images collected in North America. The datasets each featured between nine and 55 species and exhibited significant differences in how often various species were photographed. These datasets also differed in aspects such as dataset size, camera placement, camera configuration, and species coverage, which allows more general conclusions to be drawn.

The researchers used machine learning techniques that teach the computer how to classify the images by showing the computer datasets of images already classified by humans. For example, the machine would be shown full and partial images that are known to be images of zebras from various angles. The computer then would start to recognize the patterns, edges, and parts of the animal, and learn how to identify the image as a zebra. The researchers can also build upon some of these skills to help computers identify other animals, such as a deer or squirrel, with even fewer images.

The computer also learns to identify empty images, which are images without animals where the cameras were usually set off by vegetation blowing in the wind. In some cases, these empty images make up about 80 percent of all camera trap images. Eliminating all the empty images can greatly speed the classification process.
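As a rough illustration of why removing empty images matters, the arithmetic can be sketched in a few lines of Python. The function name, the one-million-image project size, and the share of empty images that get removed automatically are all hypothetical values chosen for this example; only the "about 80 percent empty" figure comes from the text above.

```python
def images_left_for_volunteers(total_images: int,
                               empty_fraction: float,
                               removed_share: float) -> int:
    """Estimate how many images still need human classification.

    total_images   -- number of images in the project (hypothetical)
    empty_fraction -- share of images with no animal (about 0.8 in some projects)
    removed_share  -- share of empty images filtered out automatically
                      (hypothetical; not a figure from the study)
    """
    empty_images = total_images * empty_fraction
    removed = empty_images * removed_share
    return round(total_images - removed)


# Hypothetical project with one million images, 80 percent of them empty.
print(images_left_for_volunteers(1_000_000, 0.8, 1.0))  # every empty image removed
print(images_left_for_volunteers(1_000_000, 0.8, 0.9))  # nine in ten empty images removed
```

Even under the more cautious assumption, volunteers would see well under a third of the original images.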

The computer's accuracy rates for identifying empty images across projects range between 91.2 percent and 98.0 percent, while accuracies for identifying specific species are between 88.7 percent and 92.7 percent. Although the computer's classification accuracy is low for rare species, the computer can tell researchers how confident it is in each prediction. Removing low-confidence predictions raises the computer's accuracy to the level of citizen scientists.
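The idea of dropping low-confidence predictions can be sketched in Python as follows. This is a minimal illustration, not the study's code: the sample predictions, the 0.8 threshold, and the function names are invented for the example, and the real models and thresholds may differ.

```python
from typing import Dict, List, Optional, Tuple

# Each record is (predicted_label, confidence, true_label).
# The values below are made up purely for illustration.
Prediction = Tuple[str, float, str]


def accuracy(preds: List[Prediction]) -> Optional[float]:
    """Share of predictions whose label matches the true label."""
    if not preds:
        return None
    correct = sum(1 for predicted, _, truth in preds if predicted == truth)
    return correct / len(preds)


def keep_confident(preds: List[Prediction], threshold: float) -> List[Prediction]:
    """Keep only predictions at or above the confidence threshold."""
    return [p for p in preds if p[1] >= threshold]


def summarize(preds: List[Prediction], threshold: float) -> Dict[str, Optional[float]]:
    """Report how many predictions survive the filter and how accurate they are."""
    kept = keep_confident(preds, threshold)
    return {"coverage": len(kept) / len(preds), "accuracy": accuracy(kept)}


sample = [
    ("zebra", 0.98, "zebra"),
    ("zebra", 0.95, "zebra"),
    ("empty", 0.99, "empty"),
    ("wildebeest", 0.91, "wildebeest"),
    ("lion", 0.55, "leopard"),    # rare species, low confidence, wrong
    ("serval", 0.40, "caracal"),  # rare species, low confidence, wrong
    ("empty", 0.85, "empty"),
    ("zebra", 0.60, "zebra"),     # correct, but filtered out anyway
]

print(accuracy(sample))        # accuracy over all predictions
print(summarize(sample, 0.8))  # accuracy and coverage after filtering
```

The trade-off is visible in the last sample row: filtering raises accuracy on the images the computer keeps, but some correct predictions are also set aside, and those images would still go to volunteers.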

"Our machine learning techniques allow ecology researchers to speed up the image classification process and pave the way for even larger citizen science projects in the future," Willi said. "Instead of every image having to be classified by multiple volunteers, one or two volunteers could confirm the computer's classification."

While this study focused on ecology camera trap programs, Willi said the same techniques can also be used in other citizen science projects such as classifying images from space.

"Data in a wide range of science areas is growing much faster than the number of citizen science project volunteers," said study co-author Lucy Fortson, a University of Minnesota physics and astronomy professor and co-founder of Zooniverse, the largest citizen science online platform that hosted the projects in the study. "While there will always be a need for human effort in these projects, combining these efforts with the help of Big Data techniques can help researchers process more data even faster and allows the volunteers to focus on the harder, rarer classifications."

Led by Fortson, the Zooniverse team at the University of Minnesota, including Willi, is working to integrate machine learning techniques into the platform so the hundreds of researchers from astronomy to zoology using the platform can take advantage of them.

In addition to researchers at the University of Minnesota, the international team on this study included researchers from the University of Oxford, Wisconsin Department of Natural Resources, Institute for Communities and Wildlife in Africa, Adler Planetarium, and the conservation organization Panthera.

The study was funded primarily by the National Science Foundation (Award IIS 1619177) with additional support from University of Oxford's Hertford College Mortimer May Fund, Google Global Impact Award, and the Science and Technology Facilities Council (ST/N003179/10). Researchers used the Minnesota Supercomputing Institute at the University of Minnesota for some of their research.

To read the full research study entitled "Identifying Animal Species in Camera Trap Images using Deep Learning and Citizen Science," visit the Methods in Ecology and Evolution website.

University of Minnesota
