How many fossils does it take to accurately train an image-based AI algorithm? According to a new study co-authored by Bruce MacFadden, UF Distinguished Professor Emeritus and retired curator of vertebrate paleontology at the Florida Museum of Natural History, the answer is somewhere around 250. This number is much lower than scientists previously thought was needed.
This is a new spin on an old question that paleontologists have contended with for years. The amount of information that can be gleaned from a single fossil is limited to a few bare facts. If they’re lucky, paleontologists can determine the identity of the organism it belonged to, what it would have looked like, and the age and size of the organism when it died. They can often determine the age of the fossil and the types of environmental conditions that led to its preservation.
Answering most other questions with a scientifically acceptable degree of certainty requires multiple fossils from the same species. Preferably, these fossils should also be of the same type; the skull of one individual compared to the sternum of another won’t yield much useful information.
Despite depictions in popular media of paleontologists unearthing perfectly preserved, fully articulated skeletons, the reality is that most of what anyone ever finds are bits and pieces. This is especially true for paleontologists who study groups with a patchy fossil record, such as vertebrates (organisms with a backbone).
“Vertebrates have over 200 bones in their skeleton, and you’ll almost never find one that’s complete, or even nearly so,” MacFadden said.
The fossil fragments that paleontologists take back to the lab are problematic, because even if they find two or more of the same type of bone or tooth from the same species, the chances that they morphologically overlap enough to be useful are low. On top of that, identifying all of these fragments is hard and sometimes impossible, even for trained professionals. And because most of the fossils have fallen to pieces, there are a lot more of them to identify than there would have been otherwise. Ask a paleontologist about the most time-consuming aspects of their job, and they’ll put species identification, particularly of broken fragments, near the top of their list.
At the Florida Museum, the vertebrate fossil collection includes more than one million specimens, including hundreds of bags packed with fossil-rich sand waiting to be sieved, sorted, identified and analyzed.
“Within a bag of sediment, there can be thousands of small vertebrate fossil fragments, like tiny shark teeth and fish spines,” MacFadden said.
All of this creates a bottleneck, with fossils piling up on one end and discoveries trickling out of the other. Which brings us back to AI, which has the potential to greatly speed up the identification process.
In fact, in some other fields of paleontology, it’s been used to do just that for decades. Palynologists, who study fossilized spores and pollen, for example, have the opposite problem of their vertebrate paleontology colleagues. Crack open many fossil-bearing sedimentary layers that formed on land, and you’ll likely find thousands of ancient spores and pollen grains. Rather than not having enough of these for any given species, they have far more than they could ever conceivably identify on their own. Consequently, palynologists began turning to AI for help in the 1980s, and their methods have been improving since.
AI can be used for groups, such as vertebrates, with less plentiful fossils as well. However, just as it takes multiple fossils from the same species to answer interesting scientific questions, it takes a similarly large number of fossils to adequately train an AI algorithm.
Prior to this study, it was unclear how many was enough.
To determine an acceptable threshold for accuracy, MacFadden and his colleagues wanted to use a group of animals with a fossil record in which recognizable specimens are plentiful. The team chose sharks, whose abundant fossilized teeth were thought to provide sufficient numbers of specimens for their AI test case.
“Their skeletons are made of cartilage, which almost never fossilizes,” MacFadden said. Shark teeth, however, are durable and tend to stick around long after every other part of an animal has disappeared. Similar to spores and pollen, they can be found in many fossil-bearing sedimentary layers.
MacFadden and the project team narrowed down their selection to six species that lived during the Neogene, a period of time that stretched between 23 and 2.6 million years ago. Some of the species, like Megalodon, which currently holds the title for largest shark that ever lived, are now extinct, while others, like the great white shark (Carcharodon carcharias), are still around.
The SharkAI research team photographed thousands of shark teeth specimens curated at the Florida Museum, but they came up short. Needing additional teeth from tiger sharks and the extinct precursor of the great white shark, they reached out to a few avocational paleontologists they’d worked with over the years.
Lee Cone, a retired high school teacher who took his AP biology students on fossil hunting field trips, and had previously been with MacFadden on fossil collecting trips, loaned teeth from both species that he had available in his personal collection. Two other fossil collectors — Barbara Fite and C. O’Connor, both from Florida — also temporarily parted with a few of the specimens they’d collected.
Having acquired the requisite number of teeth, the authors then had to figure out how to analyze them using a type of artificial intelligence called computer vision.
“None of the members of this team had the expertise to build the code, and at the beginning, we were stymied by not knowing what to do.”
What they did was find people who were proficient at software development and ask for their help. The work of fine-tuning the AI models was done primarily by co-author Cristobal Barberis of Adaptive Computing, in Naples, Florida. Arthur Porto, the Florida Museum’s first curator of artificial intelligence, also helped train and test the models in a three-step process.
The team started by feeding the models up to 500 labeled images for each of the six shark species, in increments of 50, to see which training-set size would perform best. Then they tested the models by removing the training wheels and giving them 25 unlabeled images of each species to identify without any help.
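The sample-size sweep described above can be pictured with a small, purely illustrative sketch. It is not the team's code: the study fine-tuned image models, whereas this stand-in uses synthetic two-dimensional "features" and a nearest-centroid classifier so that it runs with only the Python standard library. The function names (`make_features`, `fit_centroids`, `predict`, `sweep`) and the synthetic data are assumptions made for illustration; only the six classes, the increments of 50 up to 500, and the 25 held-out images per species come from the article.

```python
import random

SPECIES = 6          # six shark species, as described in the article
TEST_PER_CLASS = 25  # held-out images per species, as described in the article

def make_features(label, n, rng):
    """Synthetic stand-in for image features: noisy 2-D points around a class center."""
    cx, cy = label * 3.0, (label % 2) * 3.0
    return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)) for _ in range(n)]

def fit_centroids(train):
    """Toy 'model': the mean feature vector of each class (not the study's method)."""
    return {
        label: (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))
        for label, pts in train.items()
    }

def predict(centroids, point):
    """Assign a point to the class with the nearest centroid."""
    return min(
        centroids,
        key=lambda k: (point[0] - centroids[k][0]) ** 2 + (point[1] - centroids[k][1]) ** 2,
    )

def sweep(sizes, seed=0):
    """Train on the first n examples per class for each n, score on a fixed test set."""
    sizes = list(sizes)
    rng = random.Random(seed)
    pool = {s: make_features(s, max(sizes), rng) for s in range(SPECIES)}
    test = {s: make_features(s, TEST_PER_CLASS, rng) for s in range(SPECIES)}
    results = {}
    for n in sizes:
        centroids = fit_centroids({s: pool[s][:n] for s in range(SPECIES)})
        correct = sum(predict(centroids, p) == s for s, pts in test.items() for p in pts)
        results[n] = correct / (SPECIES * TEST_PER_CLASS)
    return results

curve = sweep(range(50, 501, 50))
for n, acc in curve.items():
    print(n, round(acc, 3))
```

The key design point is that the test set stays fixed while only the training size changes, so differences in accuracy can be attributed to sample size rather than to a different set of test images.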
The results, MacFadden said, were encouraging.
“We were getting accuracies of greater than 90%,” MacFadden said, and that accuracy plateaued at about 250 specimens. That means using anything beyond 250 specimens might modestly increase accuracy, but not enough to be worth the trouble. It also showed that with species in limited supply, lower numbers of specimens might be sufficient for AI computer identification.
According to Porto, even smaller numbers of specimens churned out impressive results. Using only 50 specimens to train their model, the authors still reported seeing accuracy rates of at least 93%.
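One simple way to make the idea of a plateau concrete is to look for the smallest training size after which no larger size improves accuracy by more than a chosen tolerance. The sketch below is an assumption about how such a cutoff could be computed, not the method used in the study; `plateau_size`, the tolerance, and the toy numbers are all hypothetical and are not results from the paper.

```python
def plateau_size(curve, tol=0.01):
    """Smallest training size after which no larger size gains more than `tol` accuracy.

    `curve` maps training size -> accuracy (a fraction between 0 and 1).
    """
    if not curve:
        raise ValueError("curve is empty")
    sizes = sorted(curve)
    for n in sizes:
        # Accept n if every larger size improves on it by at most tol.
        if all(curve[m] - curve[n] <= tol for m in sizes if m > n):
            return n

# Made-up values for illustration only; NOT data from the study.
toy_curve = {10: 0.60, 20: 0.75, 30: 0.80, 40: 0.81, 50: 0.81}
print(plateau_size(toy_curve, tol=0.02))  # 30
```

The tolerance encodes the judgment call in "not enough to be worth the trouble": a stricter tolerance pushes the plateau toward larger sample sizes.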
“When I first got involved with the project, the main question was how far can we push these approaches in terms of the number of images. At which point do things break down?” he said. “The thing that pleasantly surprised me is that even when you have fairly low sample sizes, you can still get pretty reasonable performance.”
The study has implications beyond just the study of paleontology. MacFadden and other members of the SharkAI research team have also done a substantial amount of work to bring fossils into K-12 classrooms. Ultimately, he envisions curricula in which students can use AI to classify shark tooth images available on biorepositories based on tooth shape and the type of prey their former owners once used them to catch.
The authors published their study in the journal Paleobiology.
Additional authors of the study are Cristobal Barberis of Adaptive Computing; Maria Vallejo-Pareja and Stephanie Killingsworth of the Florida Museum of Natural History; Samantha Zbinden of the University of Texas at Austin; Victor Perez of Prince George’s County Parks and Recreation; Kenneth Marks; and Dévi Hall.
Journal: Paleobiology
Article title: AI and paleontology: effects of vertebrate fossil sample size on machine learning image classification
Date: 25-Feb-2026