The future voice of speech-driven interfaces

September 28, 2004

With speech recognition databases spanning 24 languages, SpeeCon is helping organisations throughout Europe create linguistically diverse voice-driven applications capable of recognising commands in different languages and operating in diverse acoustic conditions.

One of the by-products of a market-driven digital economy over recent years has been the almost exclusive use of English as the means of communication between man and machine. To ensure the future multilingualism of man-machine communications, the IST programme-funded SpeeCon project has focused on building speech recognition databases. These will assist the development of speech-driven interfaces (SDIs) that can be operated in a wide range of European and other languages.

SpeeCon recognised that, to successfully develop the market for SDIs, two essential technical obstacles had to be removed. First, the words used to command interfaces have to be transferred to many languages because of linguistic diversity. Secondly, recognition has to work satisfactorily under the acoustic conditions in which consumer devices are actually used (with background noise, or with different types of microphone - e.g. using a mobile phone in hands-free mode).

This required a pool of expertise in speech processing to boost progress in the field, to create affordable, user-friendly, multilingual interfaces for future consumer electronics devices, and to ensure a larger European share of this global market. With a large multinational consortium including Siemens (the project coordinator), Nokia and IBM, amongst many others, SpeeCon started from a strong base.

"The collection of speech data is very costly - there are around 600 speakers required to create each language database," says Herbert Tropf, project coordinator at Siemens. "They need to be recruited, recorded, and the resulting data then needs to be transcribed and validated."
In the end, 24 separate language databases were collected, made possible by the wide range of SpeeCon partners. The languages included French, Spanish, Mandarin and Hebrew, several of them in regional variants - e.g. Austrian and Swiss dialects of German.

"The important addition to previous work is the wide range of data collected - so within each database there will usually be anywhere from 4-6 dialects and a range of age-groups (e.g. 30 per cent of the data collected were from under-15-year-olds)," says Tropf. "Each speaker would have to repeat several hundred words that were a mixture of application-specific data (e.g. load, play), general phrases (like date and time) and other phonetically rich words."

Everyone's talking: market analysis

The project also focused on the market for voice-driven interfaces. The analysis looked at six market segments: mobile phones, information kiosks, audio/video devices, automotive devices, toys, and Personal Digital Assistants (PDAs). Although all segments have grown rapidly, SpeeCon identified the greatest growth in cars and mobile phones. The research also identified the need for speech recognition technology to handle a variety of different environments, and to cope with differences between sexes, dialects and age groups across the globe.

Other results revealed that SDIs will be one of the key future features of the huge consumer electronics industry. They are considered easier to use, especially for non-technical users. And, says Tropf: "It is widely accepted that SDIs will make everyday life safer in many ways: they offer car drivers the ability to operate radios and navigation devices without taking their hands off the steering wheel."

One of the greatest challenges for the team was 'adaptation'. This refers to the ability to use the recorded voice to command interfaces in environments with different acoustic qualities.
So, an interface that recognised in-car speech would not typically recognise the speech associated with open or office environments. However, using the SpeeCon data, the project team developed algorithms that couple the raw speech data with environmental data so that the data can be used in new acoustic situations. This massively broadens the range of applications in which the speech data can be used. For example, it launches researchers on the path to developing new algorithms to enable dynamic speech recognition no matter the acoustic conditions - ideal for controlling mobile devices, one of the areas where market analysis reveals enormous growth potential.

Real-world applications

Three SpeeCon consortium partners developed demonstrator applications, operating with different languages and in different environmental conditions, to illustrate that the collected data could work in real-life applications.

Philips demonstrated a voice-driven CD player and mobile phone that can be used in a car or other environment. The user is able to operate the main functions of the CD player and some functions of the telephone by voice.

Sony, on the other hand, took a voice-driven toy - the AIBO (an artificial intelligence pet dog) - and demonstrated command recognition in Spanish and in Polish using data from the SpeeCon-created language databases.

And IBM has just announced the full-scale launch of its speech-driven in-car navigation system in partnership with Honda.

SMEs have benefited from SpeeCon by being given commercial access to the SpeeCon speech databases via the European Language Resources Association (ELRA). This allows SMEs to play an active role in the market for speech-driven interfaces for consumer applications, stimulating the market with innovative ideas and products.
SpeeCon's work is a significant landmark for speech recognition research: it has collated a massive amount of data to enable SDIs for consumer devices across the EU, and has demonstrated the possibility of speech recognition across a wide range of acoustic environments. The SpeeCon team also envisage other spin-offs from the research: products that will enable speaker identification, and multilingual speech understanding and translation systems.

Contact:
Herbert Tropf
Siemens AG
CT IC 5 SP
Otto-Hahn-Ring 6
D-81739 Munich
Germany
Tel: +49-89-63644195
Email: herbert.tropf@siemens.com

Source: Based on information from SpeeCon

PLEASE MENTION IST RESULTS SERVICE AS THE SOURCE OF THIS STORY AND, IF PUBLISHING ONLINE, PLEASE CARRY A HYPERLINK TO: http://istresults.cordis.lu/