Green digitization: Botanical collections data answer real-world questions

April 18, 2018

Even as botany has moved firmly into the era of "big data," some of the most valuable botanical information remains inaccessible for computational analysis, locked in physical form in the orderly stacks of herbaria and museums. Herbarium specimens are plant samples collected in the field, then dried and stored with labels recording the species, the date and location of collection, and other information such as habitat descriptions. The detailed historical record these specimens keep of species occurrence, morphology, and even DNA provides an unparalleled data source to address a variety of morphological, ecological, phenological, and taxonomic questions. Now efforts are underway to digitize these data and make them easily accessible for analysis.

Two symposia were convened to discuss the possibilities and promise of digitizing these data, first at the Botanical Society of America's 2017 annual meeting in Fort Worth, Texas, and again at the XIX International Botanical Congress in Shenzhen, China. The proceedings of those symposia have been published as a special issue of Applications in Plant Sciences; the articles discuss a range of methods and remaining challenges for extracting data from botanical collections, as well as applications for collections data once digitized. Many of the authors contributing to the issue are involved in iDigBio (Integrated Digitized Biocollections), a new "national coordinating center for the facilitation and mobilization of biodiversity specimen data," as described by Dr. Gil Nelson, a botanist at Florida State University and coeditor of this issue.

iDigBio is funded by the U.S. National Science Foundation's Advancing Digitization of Biodiversity Collections initiative, and has already digitized about 50 million herbarium specimens. According to Dr. Nelson, "A primary significance has been community building among biodiversity scientists, curators, and collections managers, and developing and disseminating recommended practices and technical skills for getting these jobs done." The challenges of digitizing these data are formidable, said Dr. Nelson, and include "developing computer vision techniques for making species determinations and scoring phenological traits, and developing effective natural language processing algorithms for parsing label data."

But as the papers in this issue show, steady progress is being made in developing methods to address these challenges. Nelson et al. (2018) and Contreras (2018) address more nuts-and-bolts issues of data management, the former discussing the need for globally unique IDs for herbarium specimens, and the latter providing a workflow for digitizing new fossil leaf collections. Botella et al. (2018) review and discuss the prospects for "computer vision" aided by deep-learning neural networks that, while in their infancy, could eventually identify species from variable images. Yost et al. (2018) offer a protocol for digitizing data on phenology (the timing of events such as flowering or fruiting) from herbarium specimens.

These digitization methods can help unlock valuable herbarium data to address a range of questions. James et al. (2018) discuss how digitized herbarium specimens can be used to show how plant species have responded to global change, for example by using location and time data to model shifts in range. Cantrill (2018) discusses how the Australasian Virtual Herbarium database has been used for ecological and other research. Thiers and Halling (2018) extend the applications to the fungal world, showing how herbarium data can be used as a baseline to determine the distribution of macrofungi in North America. Furthermore, digitization efforts can have real payoff in public perception; Dr. Nelson sees an "increasing presence of biodiversity data and museums in the popular press, which has raised the profiles of herbaria and other collections for the general public." Along these lines, von Konrat et al. (2018) show how digital herbarium data can be used to engage citizen scientists.

Through centuries of painstaking collection and cataloguing, botanists have created a unique and irreplaceable bank of data in the tens of millions of herbarium specimens worldwide. But converting a dried, pressed plant specimen with a handwritten label from 1835 into a format that you can fit on a USB stick is no small trick. Using creative thinking, sophisticated methodology, and hard work, these scientists are bringing the valuable information locked in herbarium specimens into the digital age.

The Applications in Plant Sciences special issue "Green digitization: Online botanical collections data answering real-world questions" is available online at:

Soltis, P. S., G. Nelson, and S. A. James. 2018. Green digitization: Online botanical collections data answering real-world questions. Applications in Plant Sciences 6(2): e1028.

Articles in the issue:

Botella, C., A. Joly, P. Bonnet, P. Monestiez, and F. Munoz. 2018. Species distribution modeling based on the automated identification of citizen observations. Applications in Plant Sciences 6(2): e1029.

Cantrill, D. J. 2018. The Australasian Virtual Herbarium: Tracking data usage and benefits for biological collections. Applications in Plant Sciences 6(2): e1026.

Contreras, D. L. 2018. A workflow and protocol describing the field to digitization process for new project-based fossil leaf collections. Applications in Plant Sciences 6(2): e1025.

James, S. A., P. S. Soltis, L. Belbin, A. D. Chapman, G. Nelson, D. L. Paul, and M. Collins. 2018. Herbarium data: Global biodiversity and societal botanical needs for novel research. Applications in Plant Sciences 6(2): e1024.

Nelson, G., P. Sweeney, and E. Gilbert. 2018. Use of globally unique identifiers (GUIDs) to link herbarium specimen records to physical specimens. Applications in Plant Sciences 6(2): e1027.

Thiers, B. M., and R. E. Halling. 2018. The Macrofungi Collection Consortium. Applications in Plant Sciences 6(2): e1021.

von Konrat, M., T. Campbell, B. Carter, M. Greif, M. Bryson, J. Larraín, L. Trouille, et al. 2018. Using citizen science to bridge taxonomic discovery with education and outreach. Applications in Plant Sciences 6(2): e1023.

Yost, J. M., P. W. Sweeney, E. Gilbert, G. Nelson, R. Guralnick, A. S. Gallinat, E. R. Ellwood, et al. 2018. Digitization protocol for scoring reproductive phenology from herbarium specimens of seed plants. Applications in Plant Sciences 6(2): e1022.

Applications in Plant Sciences (APPS) is a monthly, peer-reviewed, open access journal focusing on new tools, technologies, and protocols in all areas of the plant sciences. It is published by the Botanical Society of America, a nonprofit membership society with a mission to promote botany, the field of basic science dealing with the study and inquiry into the form, function, development, diversity, reproduction, evolution, and uses of plants and their interactions within the biosphere. APPS is available as part of the Wiley Online Library.

For further information, please contact the APPS staff at

Botanical Society of America
