New research from the University of the Witwatersrand, South Africa, has significant implications for understanding both human language development and the behaviour of large-scale artificial intelligence language models.
Culture is key, as is an understanding of “iterated learning”, the idea that language becomes more structured as it is passed down through generations of learners, whether human or computer.
“We built a computer brain with similar characteristics to a child’s, and compared it to behaviours we see in children’s brains. We then fed it data with similar properties found in human language and watched how the generations (versions) of the computer brain learn.”
“It turns out, computer brains find the structure in the data in the same way that children favour certain properties of language in learning. It also showed that the dataset (language) becomes more structured over generations because it makes learning easier,” says lead author Dr Devon Jarvis, Lecturer in the School of Computer Science and Applied Mathematics (CSAM), and Fellow in the Wits Machine Intelligence and Neural Discovery (MIND) Institute.
Their findings were recently published in the prestigious journal Proceedings of the National Academy of Sciences (PNAS), in a paper titled “Compositionality and Systematicity Emerge from Iterated Learning in Deep Linear Networks”.
It all starts in childhood
Jarvis explains that children have a remarkable ability to learn language rapidly during early development. They learn about the world in hierarchies, starting with basic concepts and gradually understanding more complex ones.
“First, they learn that plants and animals are different things. Then they learn that there are different types of animals. But at some point, there is a depth of understanding of the world that they just have not reached yet,” says Jarvis.
Take the penguin, for instance. Children learn that birds have wings and can therefore fly. AHA! But then they are confused to find that the penguin cannot fly. Here they over-extrapolate, and the resulting mistakes help them learn new information: penguins can’t fly, but they can swim. AHA! Slowly, they build a structured understanding of the world with increasing precision.
“While this progressive acquisition of knowledge has its benefits, the work focused on the implications for generations of learners. A child learns some language from their parents, and they will eventually pass it on to their own children. Due to the complexity of language, this transmission introduces mistakes.”
“Just like the penguin example, these mistakes are not arbitrary and result from the over-generalisation of knowledge. The net result is that easy portions of language to learn are remembered and reused, while the more unstructured portions are forgotten. Essentially, individuals are good at learning but only with the pressure of communication do we really see the depth of their intelligence,” explains Jarvis.
Not all neural networks are equal
The researchers used deep linear neural networks (mathematical models that mimic how the brain processes information) to study the neural basis of this process. They found that iterated learning only works well when two conditions are met: the network has sufficient depth, meaning multiple layers of processing, and the language is sufficiently complex. Shallow networks, those with fewer layers, failed to capture the structured regularities that make language learnable.
This suggests that the architecture of a learning system, whether biological or artificial, and the richness of its environment play a crucial role in how well language structure can be absorbed and transmitted. The same point bears on recent advances in generative AI models, which rely heavily on scale for their emergent capabilities.
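For readers who want to see the idea in miniature, the following Python sketch is one way to picture why depth could matter in iterated learning. It is not the authors' model. It assumes a heavily simplified setting in which a "language" is just a list of independent mode strengths, where a large value stands in for a strong, reusable regularity and a small value for a weak, idiosyncratic one. The function names (`train_deep_mode`, `train_shallow_mode`, `iterate`) and all parameter values are hypothetical choices made for illustration.

```python
# Toy illustration only: each "mode" of the language is a single number.
# Assumption: large strength = strong regularity, small = idiosyncratic detail.

def train_deep_mode(target, steps, lr=0.05, init=0.05):
    """Two-layer linear learner for one mode; its output strength is a * b.

    Gradient descent on 0.5 * (target - a * b) ** 2, starting from small weights.
    """
    a = b = init
    for _ in range(steps):
        err = target - a * b
        # Update both layers simultaneously from the same error signal.
        a, b = a + lr * b * err, b + lr * a * err
    return a * b


def train_shallow_mode(target, steps, lr=0.05, init=0.0025):
    """Single-layer linear learner for one mode; its output strength is w.

    Gradient descent on 0.5 * (target - w) ** 2.
    """
    w = init
    for _ in range(steps):
        w += lr * (target - w)
    return w


def iterate(train, strengths, generations, steps):
    """Iterated learning: each generation is trained for a limited number of
    steps on the previous generation's output, then becomes the next teacher."""
    history = [list(strengths)]
    for _ in range(generations):
        strengths = [train(s, steps) for s in strengths]
        history.append(strengths)
    return history


# One strong mode and one weak mode (hypothetical values).
language = [3.0, 0.3]
deep = iterate(train_deep_mode, language, generations=5, steps=60)
shallow = iterate(train_shallow_mode, language, generations=5, steps=60)

for name, history in (("deep", deep), ("shallow", shallow)):
    for generation, strengths in enumerate(history):
        print(name, generation, [round(s, 4) for s in strengths])
```

In this toy setting the two-layer learner picks up the strong mode quickly and barely learns the weak one in the time available, so after a few generations the weak mode has faded while the strong one survives. The single-layer learner shrinks both modes by roughly the same proportion, so it applies no comparable filter. This is only a cartoon of the effect described above, under the stated assumptions.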
Jarvis continues: “The pieces of this work have been around in the various literatures for a while now. Deep linear networks are established models of child development and iterated learning has been known to linguists for many years.”
“But it is the combination of these two perspectives that seems to make a useful point: that language evolves to become learnable based on the very specific nature of how children learn in stages and favour reusing information over learning new things.”
“The fact that this was shown in a very simple version of the technology underpinning the modern boom in AI tools is also encouraging and suggests that in the intersection of multiple fields lies the fundamental principles of cognition.”
Co-authors include Professor Richard Klein (Head of the School of CSAM and Fellow in the Wits MIND Institute, Wits University), Professor Benjamin Rosman (Director: Wits MIND Institute, and researcher in CSAM, Wits University), and Professor Andrew Saxe (Gatsby Unit & Sainsbury Wellcome Centre, University College London).
Proceedings of the National Academy of Sciences
Compositionality and systematicity emerge from iterated learning in deep linear networks
5-May-2026