
AI voices are easier to understand than human voices

04.21.26 | American Institute of Physics


WASHINGTON, April 21, 2026 — Synthetic voices are increasingly a part of our lives, from digital assistants like Siri and Alexa to automated telemarketers and answering machines. With the expansion of generative AI, a new type of synthetic voice has been developed: voice clones, which can recreate a facsimile of a person’s voice from only a few seconds of recorded speech.

In JASA, published on behalf of the Acoustical Society of America by AIP Publishing, a pair of researchers from University College London and the University of Roehampton compared the intelligibility of human voices and voice clones. They found that voice clones are easier to understand than human voices in noisy environments.

Voice clones differ from traditional synthetic voices in the amount of sampling they require. Synthetic voices like Siri require a voice actor to spend hours in a recording booth. In contrast, a voice clone can be made from as little as 10 seconds of speech, significantly expanding the number of potential voices as well as the number of potential applications.

Researchers Patti Adank and Han Wang specialize in studying human perception of unclear speech and were fascinated by the idea of machine-replicated speech. A key question they wanted to answer was how easy voice clones are for the average person to understand. They suspected that voice clones would simply be poor representations of actual human voices and that people would struggle to understand them. What they found could not have been more different.

“I thought initially that voice clones would be less intelligible because they were unfamiliar,” said Adank. “I found they were up to 20% more intelligible, which was quite shocking. A small part of our paper is talking about that experiment, and then a large part is me and my collaborator frantically trying to find out what it is that makes those voice clones more intelligible.”

The duo initially presented volunteers with human voices and voice clones, asking them to rate their intelligibility. After finding that voice clones were consistently rated easier to understand, they repeated the experiment three more ways: with elderly volunteers, to determine whether being hard of hearing alters the effect; with American volunteers, to judge whether accent plays a role (the original cohort was British); and with a filter designed to mimic cochlear implants. In every case, voice clones emerged victorious.

After examining over 100 acoustic measurements, Adank believes the only way to solve the mystery is to work with collaborators who specialize in text-to-speech systems to adapt an existing open-source cloning system.

“I am now going to try and recreate [the effect] by studying how synthesizers work and how they use digital signal processing to generate those voices, just to get a bit of a handle on this,” said Adank.

###

The article “Voice clones are easier to understand in noise than their human originals: the voice cloning intelligibility benefit” is authored by Patti Adank and Han Wang. It will appear in The Journal of the Acoustical Society of America on April 21, 2026 (DOI: 10.1121/10.0043094). After that date, it can be accessed at https://doi.org/10.1121/10.0043094.

ABOUT THE JOURNAL

The Journal of the Acoustical Society of America (JASA) is published on behalf of the Acoustical Society of America. Since 1929, the journal has been the leading source of theoretical and experimental research results in the broad interdisciplinary subject of sound. JASA serves physical scientists, life scientists, engineers, psychologists, physiologists, architects, musicians, and speech communication specialists. See https://pubs.aip.org/asa/jasa.

ABOUT THE ACOUSTICAL SOCIETY OF AMERICA

The Acoustical Society of America (ASA) is the premier international scientific society in acoustics devoted to the science and technology of sound. Its 7,000 members worldwide represent a broad spectrum of the study of acoustics. ASA publications include The Journal of the Acoustical Society of America (the world's leading journal on acoustics), JASA Express Letters, Proceedings of Meetings on Acoustics, Acoustics Today magazine, books, and standards on acoustics. The society also holds two major scientific meetings each year. See https://acousticalsociety.org/.

Contact Information

Hannah Daniel
American Institute of Physics
media@aip.org

How to Cite This Article

APA:
American Institute of Physics. (2026, April 21). AI voices are easier to understand than human voices. Brightsurf News. https://www.brightsurf.com/news/1ZZY44Y1/ai-voices-are-easier-to-understand-than-human-voices.html
MLA:
"AI voices are easier to understand than human voices." Brightsurf News, Apr. 21 2026, https://www.brightsurf.com/news/1ZZY44Y1/ai-voices-are-easier-to-understand-than-human-voices.html.