Have you talked to your computer lately? (Shouting curses at a system crash doesn’t count.) Chances are, you haven’t. Spoken commands and dictation are options in both Microsoft Windows XP Tablet PC Edition and Office XP, as well as in the latter’s rival, Corel WordPerfect Office 2002 Professional, yet the voice-controlled PC continues to be more talked about than talked to, so to speak.
But as faster, more powerful PCs become commonplace, and as voice-interface telephony applications such as automated receptionists and order menus grow increasingly popular in corporate call centers, PC speech is less often dismissed as the stuff of Star Trek and looks more viable for everyday Windows users. (Developers use the phrase “speech recognition” to refer to speaker-independent command and dictation vocabularies, reserving “voice recognition” for biometric or security applications that respond to a specific person’s voice.)
A couple of recent announcements highlight the progress made on both the listening and speaking sides of the PC conversation. Let’s check out what’s new in both speech recognition and text-to-speech software.
Dragon NaturallySpeaking 7
The biggest brand in PC dictation is Dragon NaturallySpeaking — a pioneering product that’s bounced from Dragon Systems to Lernout & Hauspie to ScanSoft. Last year’s NaturallySpeaking version 6 wasn’t a major upgrade so much as a merger of version 5 with its former rival L&H VoiceXpress, though it was good news nonetheless in terms of ScanSoft’s rescue of the product from L&H’s bankruptcy.
Recently ScanSoft introduced NaturallySpeaking 7, which it says will finally bring speech recognition into the home and office mainstream by fixing the category’s perennial hassles of setup, accuracy, and ease of use. According to the company, the new release increases accuracy by 15 percent, enabling users to achieve accuracy levels of up to 99 percent while dictating up to 160 words per minute. Initialization or training time has been cut in half, too, as users can train the software on their voice patterns in just five minutes.
Other new features include Natural Punctuation, which eliminates the need to say “period” or “comma” when dictating e-mail messages, sending instant-messaging text, or completing Web-based forms, and a Vocabulary Optimizer that analyzes sentence structure and word use frequency in existing documents and tunes NaturallySpeaking’s recognition engine accordingly.
NaturallySpeaking 7 is available in Standard ($100) and Preferred ($200) editions; the latter adds specific support for Microsoft Excel 97/2000/2002 and Eudora Pro 5.1 on top of the Standard edition’s support for Word, WordPerfect, Internet Explorer, AOL, and Outlook Express. The Preferred version can also transcribe dictation from handheld voice recorders and, in a new feature, Pocket PCs; it can both play back your dictation and read PC files aloud via ScanSoft’s RealSpeak text-to-speech engine. Both editions are bundled with headset microphones.
ViaVoice for Windows from IBM
The Dragon upgrade puts the ball back in the court of ScanSoft’s main competitor: IBM’s ViaVoice for Windows Release 10, introduced last fall. It is available in $30 Personal, $60 Standard, $70 Advanced, and $190 Pro USB versions, the last bundled with Plantronics’ DRP-300, a top-quality, USB-based headset mic.
ViaVoice has a reputation for being a step behind NaturallySpeaking for general PC use, but arguably a step ahead for serious dictation or heavy-duty word processing. Release 10 introduced a new speech engine with improved background-noise adaptation and one-key control of dictation and command modes.
When it’s time to proofread or read back a document, or if you’d like to have the morning’s e-mails read to you while you’re busy with something else, you’ll find that PC text-to-speech (TTS) applications have progressed far beyond the droning, robotic voices you may have heard from old demonstrations or Stephen Hawking’s synthesizer.
Both Dragon NaturallySpeaking and ViaVoice include TTS capability, but users who want the most flexibility or most natural-sounding (though still computerized) voices available often rely on dedicated playback software. And there are new releases to report in this category, too: specifically, last week’s debut of iSpeak 3.0 from Fonix.
iSpeak While Your PC Listens
In addition to reading plain text files, the $50 iSpeak adds integration with Microsoft Word, letting users listen to documents or selected text from within Word itself without having to cut and paste or save a document in text format. The upgrade also offers e-mail reading for AOL and Outlook Express as well as Outlook 2000, and a new “Drag and Speak” desktop icon that launches iSpeak and reads Word, WordPerfect, Excel, or several other types of documents when you drop their icons there.
iSpeak 3.0 can subdivide and save documents as one- to 20-minute MP3 files for later handheld listening, and offers no fewer than 14 playback voices: Fonix’s own SimplySpeaking “voice fonts,” Roger and Jessica, are joined by nine Fonix DECtalk formant-based voices, the AT&T Natural Voices Mike and Crystal, and Lucent Technologies’ Articulator voice Basil.
Voice Sampling from ReadPlease
If you’d like to sample TTS technology for free, experiment with even more digitized voices, or both, ReadPlease offers both free basic and $50 Plus versions of ReadPlease 2003, which can read any text copied to the Windows Clipboard (or to its own window if you prefer).
Both versions come with four Microsoft-supplied voices; the Plus edition, in addition to offering more flexible, VCR-like playback controls, works with both the 8kHz and higher-quality 16kHz AT&T Natural Voices voice fonts (available for an additional $25, which includes male and female 8kHz voices, plus $45 for each additional voice). The additional voices offer your choice of German, French, British, and Latin American Spanish accents.
Adapted from WinPlanet.com.