Text-to-Speech - Naturalness and Accuracy
By Tetschner, Walt; ASR News, Publication Date: July 2003
Article discussing the naturalness and accuracy of text-to-speech (TTS) technology. Naturalness is a measure of how “natural” the synthesized voice sounds; accuracy is measured by determining whether the typed text is spoken correctly. The article covers the history and evaluation methods of TTS technology and includes a quantitative analysis of how TTS technology handles words of foreign origin, acronyms, abbreviations, names, addresses, and homographs. Voice likability, languages supported, utilization within an application, and appropriateness for specific domains are also included in the analysis. Eighteen TTS products were tested for accuracy: (1) Aculab TTS, (2) AT&T Natural Voices, (3) Babel Babel TTS, (4) Cepstral TTS, (5) Fonix FAAST, (6) Fonix DECtalk, (7) IBM TTS, (8) Loquendo TTS, (9) Microsoft TTS, (10) Nuance Vocalizer 3, (11) Rhetorical rVoice, (12) ScanSoft RealSpeak, (13) ScanSoft TTS3000SpeechWorks Speechify, (14) SpeechWorks ETI Eloquence, (15) SpeechWorks Solo, (16) SVOX TTS, (17) Voiceware Voicetext, and (18) Winbond TTS. The results of the TTS accuracy test are displayed in a table and a bar graph that show the average percentage of correctly spoken words in seven categories: (1) number processing; (2) foreign-origin word processing; (3) acronym processing; (4) abbreviation processing; (5) name processing; (6) address processing; and (7) homograph processing. The most common errors and their implications for improving TTS technology are discussed.
Assistive Products Discussed:
DECTALK ACCESS 32
CEPSTRAL TEXT-TO-SPEECH (TTS) VOICES
LOQUENDO TTS DIRECTOR
AT&T NATURAL VOICES TEXT-TO-SPEECH DESKTOP SDK
IBM VIAVOICE TEXT-TO-SPEECH SDK
REALSPEAK WORD
ETI-ELOQUENCE
SVOX MOBILE TTS
NEOSPEECH TTS ON DEMAND
Published by: Voice Information Associates, Inc. (Website: http://www.asrnews.com)

