
Do AI language models ‘understand’ the real world? On a basic level, they do, a new study finds

04.22.26 | Brown University

PROVIDENCE, R.I. [Brown University] — Most of what AI chatbots know about the world comes from devouring massive amounts of text from the internet — with all its facts, falsehoods, knowledge and nonsense. Given that input, is it possible that AI language models have an “understanding” of the real world?

As it turns out, they do — or at least something like an understanding. That’s according to a new study by researchers from Brown University to be presented on Saturday, April 25 at the International Conference on Learning Representations in Rio de Janeiro, Brazil.

The study looked under the hood of several AI language models for signs that they know the difference between events and scenarios that are commonplace, unlikely, impossible or downright nonsensical.

“This work reveals some evidence that language models have encoded something like the causal constraints of the real world,” said Michael Lepori, a Ph.D. candidate at Brown who led the work. “Beyond just encoding these constraints, they do so in a way that is predictive of human judgments of these categories.”

Lepori’s research explores the intersection of computer science and human cognition. He is advised by Ellie Pavlick, a professor of computer science, and Thomas Serre, a professor of cognitive and psychological sciences, both of whom are faculty affiliates of Brown’s Carney Institute for Brain Science and co-authors of the research.

For the study, the researchers designed an experiment to test how language models interpret sentences describing events of varying plausibility. Some statements described commonplace scenarios: For example, “Someone cooled a drink with ice.” Some scenarios were improbable or unlikely: “Someone cooled a drink with snow.” Some were impossible: “Someone cooled a drink with fire.” Some were nonsensical: “Someone cooled a drink with yesterday.”

For each input, the researchers examined the resulting mathematical states generated inside the AI model, an approach known as mechanistic interpretability.

“Mechanistic interpretability can be appropriately characterized as something like neuroscience for AI systems,” Lepori said. “It seeks to reverse-engineer what the model is doing when exposed to a particular input. You could kind of think about it as understanding what is encoded in the ‘brain state’ of the machine.”

By comparing the differences in “brain states” generated by pairs of sentences from different categories — commonplace versus improbable, improbable versus impossible and so on — the researchers could get a sense of whether, and how well, the models internally differentiate between categories. The experiments were repeated across several different open-source language models, including OpenAI’s GPT-2, Meta’s Llama 3.2 and Google’s Gemma 2, to get a “model-agnostic” sense of how well these types of models distinguish between categories.
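The flavor of this kind of probing can be sketched in miniature. The snippet below is a hypothetical illustration, not the paper’s actual method or data: it uses synthetic stand-ins for model hidden states (random vectors nudged by a small category-dependent offset), fits a single linear direction (the difference of the two category means), and checks how well projecting onto that direction separates an “improbable” category from an “impossible” one.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64  # hypothetical hidden-state dimension

# Synthetic stand-ins for model "brain states": each category's
# representations are Gaussian noise around a category-specific mean.
improbable = rng.normal(0.0, 1.0, size=(200, dim)) + 0.13
impossible = rng.normal(0.0, 1.0, size=(200, dim)) - 0.13

# A simple linear probe: the difference-of-means direction.
direction = improbable.mean(axis=0) - impossible.mean(axis=0)
direction /= np.linalg.norm(direction)

# Project every sample onto the direction and threshold at the
# midpoint between the two class means (no train/test split here;
# this is only a toy demonstration of the geometry).
threshold = (improbable.mean(axis=0) + impossible.mean(axis=0)) @ direction / 2
scores = np.concatenate([improbable, impossible]) @ direction
labels = np.array([1] * 200 + [0] * 200)
preds = (scores > threshold).astype(int)

accuracy = (preds == labels).mean()
print(f"probe accuracy: {accuracy:.2f}")
```

With the offsets chosen here, the two synthetic categories overlap enough that the probe lands in the rough neighborhood of the 85% figure reported in the study — purely by construction, to give a feel for what “distinguishing categories along a vector” means geometrically.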

The study found that models of sufficient size do indeed develop distinct mathematical patterns, or vectors, that are strongly correlated with each plausibility category. The vectors could distinguish between even the most similar of categories — like improbable versus impossible events — with roughly 85% accuracy.

What’s more, Lepori says, the vectors revealed by the study are reflective of human uncertainty about which category a statement might fall into. Take the statement, “Someone cleaned the floor with a hat,” for example. When people hear that statement, they may disagree about whether it represents something that’s impossible or just unlikely. For the study, the researchers analyzed the vectors to see how ambiguous the AI systems thought these statements were, and compared that with survey results from human participants.

“What we show is that the models actually capture that human uncertainty pretty well,” Lepori said. “In cases where, say, 50% of people said a statement was impossible and 50% said it was improbable, the models were assigning roughly 50% probability as well.”
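One simple way such a probe can express graded uncertainty is to squash its projection score into a probability. The sketch below is purely illustrative — the scoring function and the 50/50 example sentence split are assumptions for this example, not the paper’s calibration procedure:

```python
import math

def probe_probability(score: float, scale: float = 1.0) -> float:
    """Map a probe projection score to P(impossible) with a logistic
    squash. Illustrative only; the study's actual mapping may differ."""
    return 1.0 / (1.0 + math.exp(-scale * score))

# Hypothetical ambiguous sentence ("cleaned the floor with a hat"):
# human raters split 50/50 between "impossible" and "improbable".
# A score of 0 -- right on the probe's decision boundary -- maps to a
# probability of 0.5, mirroring that human split.
human_split = 0.5
model_prob = probe_probability(0.0)
print(f"human: {human_split:.2f}  model: {model_prob:.2f}")
```

The point of the comparison in the study is that statements humans find ambiguous tend to land near the model’s internal decision boundary, so the probe’s graded score tracks the human vote split.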

Taken together, the results suggest that modern AI language models can indeed develop an understanding of the real world that is reflective of human understanding. These vectors start to emerge in models with more than 2 billion parameters, the research found, which is fairly small compared to today’s trillion-plus-parameter models.

More broadly, the researchers say these kinds of mechanistic interpretability studies can help in developing a better understanding of what AI models know and how they came to know it.

And that, the researchers say, will help in developing smarter, more trustworthy models.

DOI: 10.48550/arXiv.2507.12553

Is This Just Fantasy? Language Model Representations Reflect Human Judgments of Event Plausibility

25-Apr-2026

Contact Information

Kevin Stacey
Brown University
kevin_stacey@brown.edu
