Current TTS systems to get technical boost through $1.8 million grant

November 10, 2002

PORTLAND, Ore. - Current computer text-to-speech (TTS) synthesizers used by the hard-of-hearing and language-impaired are good at getting across basic facts. But typically, such synthesizers sound "bored," with little to no intonation or expression. Because words and sounds are just cut and pasted together, current TTS systems are punctuated with blips, clicks and an unevenness that makes it difficult for young listeners, in particular, to thoroughly understand what's being said.

Now, thanks to a new $1.8 million grant from the National Science Foundation to Oregon Health & Science University's OGI School of Science & Engineering, such technological shortcomings could someday be a problem of the past. Scientists at the school, based in Hillsboro, Ore., intend to create a computer TTS synthesizer system that more closely resembles natural speech. The system will figure into the next generation of interactive computer programs that teach language skills to children with reading, language and other communications disorders, such as autistic spectrum disorder.

"It's an ambitious project," said professor Jan van Santen, Ph.D., a mathematical psychologist in the Department of Electrical and Computer Engineering who heads the school's Center for Spoken Language Understanding (cslu.cse.ogi.edu) and is the lead scientist on the interdisciplinary, collaborative project. "But we think we can do it."

The project teams van Santen with a talented group of computational linguists, autism experts, computer scientists and neuropsychologists at Carnegie Mellon University, AT&T Research, and the School of Science & Engineering. The five-year grant totals $2.75 million.

"Intonation is an essential part of meaning," said van Santen. "For kids with developmental or language problems, having educational materials that contain great expression or intonation is essential to learning and paying attention in class.

"For a TTS system to be able to more expressively synthesize language, the computer has to be programmed to understand the context of say, a children's story," said van Santen. "The computer needs to decide which words to emphasize, which words should be spoken as parenthetic, and where in the story to briefly accelerate to better express, for example, a bunny who is running. For this to be possible, we want to program the computer in ways that it can better understand the world so it can make all these inferences from text."

For example, said van Santen, the computer could be modeled so it would know that bunnies in young children's stories are often good and wolves are generally bad, or that a grocery cart should be pushed by hands gripping a handle, and not pushed by, say, the wheels of the cart. "This kind of innate real-world knowledge will help the computer system speak more realistically and, therefore, naturally," said van Santen. "It's a tough problem to solve for computers, but we think it's doable."

Speech technology has traditionally been driven by the military and telecommunications industries, noted van Santen, a longtime Bell Labs researcher who joined the School of Science & Engineering in 2001. "But there is huge potential for speech technology that is useful for education and health. We are trying to tap into that market and make our work helpful for the average person who has a learning or medical problem." The OGI School of Science & Engineering is the only school in the United States focusing on speech technology for education and health, he said.

Speech technology could someday be used to help illiterate people learn to read, to help non-native speakers learn English, and to give autistic people more ways to communicate. Researchers at the School of Science & Engineering are studying a variety of ways humans and computers can better interact, and are developing innovative solutions that are strengthening communications between man and machine.

The OGI School of Science & Engineering (formerly the Oregon Graduate Institute of Science & Technology) became one of four specialty schools of Oregon Health & Science University in 2001. The OHSU OGI School of Science & Engineering has 63 faculty and more than 300 master's and doctoral students in five academic departments.
-end-
The Center for Spoken Language Understanding has five full-time faculty, four postdocs, a dozen graduate students, and additional programming staff. For more information, visit www.cslu.cse.ogi.edu/.

Note: A photo of van Santen is available at www.ohsu.edu/news/

Oregon Health & Science University
