What flocking birds can teach AI

Among the primary concerns surrounding artificial intelligence is its tendency to yield erroneous information when summarizing long documents. These “hallucinations” are problematic not only because they convey falsehoods, but also because they reduce efficiency: checking AI outputs for mistakes is time-consuming.

To help address this challenge, a team of computer scientists has created an algorithmic framework that draws from a natural phenomenon—bird flocking—by mimicking how birds efficiently self-organize. The framework serves as a preprocessing step for large language models (LLMs), helping them produce more reliable summaries of large documents.

The work is reported in the journal Frontiers in Artificial Intelligence.

The researchers created the bird-flocking algorithm by first unpacking how AI agents make mistakes.

These systems are built on LLMs that are designed to autonomously research, write, and summarize. But while they may write well, they do not always produce accurate or faithful summaries.

“One contributing factor is that when input text is excessively long, noisy, or repetitive, model performance degrades, causing AI agents and LLMs to lose track of key facts, dilute critical information among irrelevant content, or drift away from the source material entirely,” explains Anasse Bari, a computer science professor at NYU’s Courant Institute School of Mathematics, Computing, and Data Science and director of the Predictive Analytics and AI Research Lab, which conducted the work.

To address this shortcoming, Bari and co-author Binxu Huang, an NYU computer science researcher, turned to an orderly and time-tested method of gathering disparate parts, bird flocking, and applied it as a preprocessing step to generative AI.

Their method treated each sentence in a long document, such as a scientific study or a legal analysis, as a virtual bird. To produce a simplified version, it evaluated the document’s sentences based on their position, thematic centrality, and topical relevance, then grouped them into clusters that mirror how birds self-organize into flocks.

This grouping reduced each cluster to its most representative sentences, with the goal of minimizing redundancy and preserving key points. The resulting curated summary was then passed to an LLM as a structured, concise, and reduced input.

“The intention was to ground AI models more closely to the source material while reducing repetition and noise before generating a final summary,” says Bari, who previously turned to natural phenomena in devising an algorithm to improve online searches.

Here is how it works in greater detail:

Phase 1: Score Every Sentence

Each sentence is cleaned by keeping only nouns, verbs, and adjectives and stripping out articles, prepositions, conjunctions, and punctuation. Among other natural language processing steps, multi-word terms are merged (“lung cancer” becomes “lung_cancer”) so single concepts stay intact.
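The cleaning step can be illustrated with a minimal sketch. It is not the researchers' code: the stopword set stands in for real part-of-speech filtering, and the names STOPWORDS, PHRASES, and clean_sentence are hypothetical.

```python
import re

# Hypothetical stand-ins: a real pipeline would use a part-of-speech tagger
# and a learned phrase list rather than these small hand-written sets.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "or", "but",
             "to", "for", "with", "is", "are"}
PHRASES = {("lung", "cancer"): "lung_cancer"}


def clean_sentence(sentence):
    """Lowercase, drop punctuation and function words, merge known phrases."""
    # Keeping only alphanumeric runs removes punctuation.
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PHRASES:
            # Merge the two-word term into a single token.
            merged.append(PHRASES[pair])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    # Approximate "keep nouns, verbs, adjectives" by dropping function words.
    return [t for t in merged if t not in STOPWORDS]
```

For example, clean_sentence("The risk of lung cancer is rising.") keeps only the content-bearing tokens and treats "lung cancer" as one concept.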

Each sentence is then converted into a numerical vector by fusing lexical, semantic, and topical features. Sentences are scored on document-wide centrality, section-level importance, and alignment with the abstract, with a numerical boost for key sections like the Introduction, Results, and Conclusion.
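As a rough illustration of the scoring idea, the sketch below combines document-wide centrality, alignment with the abstract, and a section boost. Everything specific here is an assumption: bag-of-words vectors stand in for the fused lexical, semantic, and topical features, the weights and boost values are invented, and section-level importance is omitted for brevity.

```python
import math
from collections import Counter

# Hypothetical boost values for key sections; the article gives no numbers.
SECTION_BOOST = {"introduction": 0.1, "results": 0.1, "conclusion": 0.1}


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_sentences(sentences, sections, abstract_tokens):
    """Score each tokenized sentence; sections[i] names sentence i's section."""
    vectors = [Counter(s) for s in sentences]
    abstract = Counter(abstract_tokens)
    scores = []
    for i, vec in enumerate(vectors):
        # Centrality: mean similarity to every other sentence in the document.
        others = [cosine(vec, v) for j, v in enumerate(vectors) if j != i]
        centrality = sum(others) / len(others) if others else 0.0
        # Alignment: similarity to the abstract.
        alignment = cosine(vec, abstract)
        boost = SECTION_BOOST.get(sections[i].lower(), 0.0)
        # Equal weights are an arbitrary choice made for this sketch.
        scores.append(0.5 * centrality + 0.5 * alignment + boost)
    return scores
```

With these assumptions, two identical sentences receive the same base score, and the one sitting in a boosted section such as Results ranks higher.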

Phase 2: Bird Flocking for Diversity

Taking only the top-scored sentences risks repetition and stymies flocking. For instance, in a cancer research paper, the five highest-ranked sentences might all discuss treatment outcomes. Instead, the framework treats each sentence as a bird positioned in an imaginary space according to its meaning. Real birds self-organize into flocks by following three simple rules known as cohesion (stay close to nearby birds), alignment (move in the same direction as neighbors), and separation (avoid crowding). In the same way, sentences with similar meanings naturally cluster together while maintaining distinct groupings. Leaders emerge within each cluster and followers attach to their nearest leader.
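The leader-and-follower grouping can be sketched in a few lines. This is a deliberately simplified stand-in, not the published algorithm: it skips the iterative flocking simulation and applies a greedy rule in which a high-scoring sentence becomes a leader only if it is farther than a hypothetical radius from every existing leader. The function name form_flocks and the 2D positions are also assumptions.

```python
import math


def form_flocks(positions, scores, radius):
    """Group points into flocks keyed by the index of each flock's leader."""
    # Visit sentences from highest to lowest score.
    order = sorted(range(len(positions)), key=lambda i: scores[i], reverse=True)
    leaders = []
    for i in order:
        # Separation: a new leader must not crowd an existing leader.
        if all(math.dist(positions[i], positions[l]) > radius for l in leaders):
            leaders.append(i)
    flocks = {l: [] for l in leaders}
    for i in range(len(positions)):
        # Cohesion: each sentence joins its nearest leader (leaders join themselves).
        nearest = min(leaders, key=lambda l: math.dist(positions[i], positions[l]))
        flocks[nearest].append(i)
    return flocks
```

Given two tight groups of points that are far apart, this yields two flocks, each led by its highest-scoring member.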

From each final flock of similar sentences, only the highest-scoring ones are selected, so the summary covers background, methods, results, and conclusions, rather than echoing one theme—thereby reflecting a document’s diversity of content without repeating it. The chosen sentences are reordered and fed to an AI agent powered by an LLM, which synthesizes them into a fluent summary grounded in the original source content.
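A minimal sketch of this selection step follows, assuming flocks are given as a mapping from a leader index to member indices and that "reordered" means restored to original document order. The function names and the per_flock parameter are hypothetical.

```python
def select_representatives(flocks, scores, per_flock=1):
    """Pick the top-scoring sentence indices from each flock, in document order."""
    chosen = []
    for members in flocks.values():
        ranked = sorted(members, key=lambda i: scores[i], reverse=True)
        chosen.extend(ranked[:per_flock])
    # Assumption: reordering means returning to the original sentence order.
    return sorted(chosen)


def build_llm_input(sentences, chosen):
    """Join the selected sentences into the reduced text passed to the LLM."""
    return "\n".join(sentences[i] for i in chosen)
```

Because one sentence is drawn from every flock rather than from the global ranking, a cluster about methods still contributes even when sentences about results score higher overall.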

The researchers evaluated the algorithm on more than 9,000 documents, examining whether the approach produced better outputs than an AI agent powered by an LLM alone. Combined with LLMs, the framework and its bird-flocking-inspired algorithm helped generate summaries with greater factual accuracy than LLMs produced without it.

“The core idea of our work is that we developed an experimental framework that serves as a preprocessing step for large texts before it is fed to an AI agent or LLM and not as a competitor to LLMs or AI agents,” Bari says. “The framework identifies the most important sentences in a document and creates a more concise representation and summary of the original text, removing repetition and noise before it reaches the AI.”

However, the authors acknowledge that their approach is not a panacea.

“The goal is to help the AI generate summaries that stay closer to the source material,” notes Bari. “While this approach has the potential to partially address the issue of hallucination, we do not want to claim we have solved it—we have not.”

# # #

Editor’s Note: In November 2025, NYU announced the establishment of the Courant Institute School of Mathematics, Computing, and Data Science. The newly established school recognizes the storied history of the Courant Institute of Mathematical Sciences, and its strengths in both applied and pure mathematics, while encompassing NYU’s Center for Data Science and linking the computer science departments at Courant and the Tandon School of Engineering.

Journal: Frontiers in Artificial Intelligence

DOI: 10.3389/frai.2026.1703769

Method of Research: Data/statistical analysis

Article Title: A Bird-Inspired Artificial Intelligence Framework for Advanced Large Text Summarization

Article Publication Date: 17-Mar-2026
