
Citizen science projects have a surprising new partner -- the computer

February 06, 2019

For more than a decade, citizen science projects have let researchers harness the power of thousands of volunteers to sort through datasets that are too large for a small research team. Until recently, computers generally could not process this data because the work required skills that only humans had.

Now, machine learning techniques that teach computers specific image recognition skills can be used in crowdsourcing projects to handle rapidly growing amounts of data--making computers a surprising new partner in citizen science projects.

The research, led by the University of Minnesota-Twin Cities, was chosen as the cover story for the most recent issue of the British Ecological Society's scientific journal Methods in Ecology and Evolution.

In this study, data scientists and citizen science experts partnered with ecologists, who often study wildlife populations by deploying camera traps. These camera traps are remote, independent devices, triggered by motion and infrared sensors, that provide researchers with images of passing animals. After collection, these images have to be classified according to the study's goals to produce useful ecological data for analysis.

"In the past, researchers asked citizen scientists to help them process and classify the images within a reasonable time-frame," said the study's lead author Marco Willi, a recent graduate of the University of Minnesota master's program in data science and researcher in the University's School of Physics and Astronomy. "Now, some of these recent camera trap projects have collected millions of images. Even with the help of citizen scientists, it could take years to classify all of the images. This new study is a proof of concept that machine learning techniques can help significantly reduce the time of classification."

Researchers used three datasets of images collected in Africa--Snapshot Serengeti, Camera CATalogue, and Elephant Expedition--and one dataset from Snapshot Wisconsin with images collected in North America. The datasets each featured between nine and 55 species and differed significantly in how often various species were photographed. They also differed in dataset size, camera placement, camera configuration, and species coverage, which allows more general conclusions to be drawn.

The researchers used machine learning techniques that teach the computer how to classify the images by showing the computer datasets of images already classified by humans. For example, the machine would be shown full and partial images that are known to be images of zebras from various angles. The computer then would start to recognize the patterns, edges, and parts of the animal, and learn how to identify the image as a zebra. The researchers can also build upon some of these skills to help computers identify other animals, such as a deer or squirrel, with even fewer images.
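The idea of learning from images already classified by humans can be illustrated with a deliberately tiny sketch. It is not the deep learning approach used in the study: it swaps in a nearest-centroid classifier, and the two numeric features, the example values, and the function names `train_centroids` and `classify` are all hypothetical stand-ins for what a real model would learn from pixels.

```python
from collections import defaultdict
from math import dist


def train_centroids(examples):
    """Average the feature vectors of each label to get one centroid per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(centroids, features):
    """Return the label whose centroid lies closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical, hand-made features: (stripe contrast, body size), scaled 0..1.
# In practice a neural network learns its own features from labelled images.
labelled = [
    ((0.9, 0.7), "zebra"),
    ((0.8, 0.6), "zebra"),
    ((0.1, 0.9), "elephant"),
    ((0.2, 1.0), "elephant"),
]

centroids = train_centroids(labelled)
prediction = classify(centroids, (0.85, 0.65))  # a new, unlabelled image
print(prediction)
```

The structure is the point here: human-labelled examples go in, and the resulting model assigns a label to an image it has not seen before.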

The computer also learns to identify empty images, which are images without animals where the cameras were usually set off by vegetation blowing in the wind. In some cases, these empty images make up about 80 percent of all camera trap images. Eliminating all the empty images can greatly speed the classification process.
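To see why removing empty images matters, here is a rough back-of-the-envelope sketch. The project size and the share of empty images the model catches are hypothetical; only the roughly 80 percent empty share comes from the article. The function name `images_left_for_humans` is likewise invented for illustration, and the sketch ignores animal images wrongly flagged as empty.

```python
def images_left_for_humans(total_images, empty_fraction, empty_recall):
    """Estimate how many images remain after auto-removing detected empties.

    empty_fraction: share of all images that contain no animal.
    empty_recall:   share of truly empty images the model flags as empty
                    (a hypothetical figure, not one reported in the study).
    """
    empty = total_images * empty_fraction
    removed = empty * empty_recall
    return round(total_images - removed)


# Hypothetical project: 1,000,000 images, 80% empty, model catches 95% of them.
remaining = images_left_for_humans(1_000_000, 0.80, 0.95)
print(remaining)
```

Under these assumed numbers, volunteers would need to look at about a quarter of the original images.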

The computer's accuracy rates for identifying empty images across projects range between 91.2 percent and 98.0 percent, while accuracies for identifying specific species are between 88.7 percent and 92.7 percent. Although the computer's classification accuracy is low for rare species, the computer can also tell researchers how confident it is in each prediction. Removing low-confidence predictions raises the computer's accuracy to the level of citizen scientists.
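The effect of dropping low-confidence predictions can be sketched in a few lines. The predictions below are made-up toy data, and the 0.9 threshold and the helper names `accuracy` and `keep_confident` are assumptions for illustration, not values or code from the study.

```python
def accuracy(predictions):
    """Fraction of predictions whose predicted label matches the true label."""
    if not predictions:
        return 0.0
    correct = sum(1 for p in predictions if p["predicted"] == p["truth"])
    return correct / len(predictions)


def keep_confident(predictions, threshold):
    """Keep only predictions whose confidence is at or above the threshold."""
    return [p for p in predictions if p["confidence"] >= threshold]


# Toy, hand-written predictions (hypothetical; not data from the study).
predictions = [
    {"predicted": "zebra", "truth": "zebra", "confidence": 0.99},
    {"predicted": "zebra", "truth": "zebra", "confidence": 0.95},
    {"predicted": "empty", "truth": "empty", "confidence": 0.97},
    {"predicted": "deer", "truth": "squirrel", "confidence": 0.55},
    {"predicted": "elephant", "truth": "elephant", "confidence": 0.91},
    {"predicted": "zebra", "truth": "deer", "confidence": 0.60},
]

threshold = 0.9  # assumed cut-off; a real project would tune this
confident = keep_confident(predictions, threshold)
uncertain = [p for p in predictions if p["confidence"] < threshold]

print(f"all predictions:  {accuracy(predictions):.2f}")
print(f"confident only:   {accuracy(confident):.2f}")
print(f"sent to humans:   {len(uncertain)}")
```

In this toy data the two mistakes are also the two least confident predictions, so filtering them out raises accuracy on the remainder, and the filtered images are the ones that would still go to volunteers.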

"Our machine learning techniques allow ecology researchers to speed up the image classification process and pave the way for even larger citizen science projects in the future," Willi said. "Instead of every image having to be classified by multiple volunteers, one or two volunteers could confirm the computer's classification."

While this study focused on ecology camera trap programs, Willi said the same techniques can also be used in other citizen science projects such as classifying images from space.

"Data in a wide range of science areas is growing much faster than the number of citizen science project volunteers," said study co-author Lucy Fortson, a University of Minnesota physics and astronomy professor and co-founder of Zooniverse, the largest citizen science online platform that hosted the projects in the study. "While there will always be a need for human effort in these projects, combining these efforts with the help of Big Data techniques can help researchers process more data even faster and allows the volunteers to focus on the harder, rarer classifications."

Led by Fortson, the Zooniverse team at the University of Minnesota, including Willi, is working to integrate machine learning techniques into the platform so the hundreds of researchers from astronomy to zoology using the platform can take advantage of them.

In addition to researchers at the University of Minnesota, the international team on this study included researchers from the University of Oxford, Wisconsin Department of Natural Resources, Institute for Communities and Wildlife in Africa, Adler Planetarium, and the conservation organization Panthera.

The study was funded primarily by the National Science Foundation (Award IIS 1619177) with additional support from University of Oxford's Hertford College Mortimer May Fund, Google Global Impact Award, and the Science and Technology Facilities Council (ST/N003179/10). Researchers used the Minnesota Supercomputing Institute at the University of Minnesota for some of their research.

To read the full research study entitled "Identifying Animal Species in Camera Trap Images using Deep Learning and Citizen Science," visit the Methods in Ecology and Evolution website.

University of Minnesota

