For the keyboard impaired, the disabled, or anyone who just thinks the idea of talking to a computer is kind of cool, speech recognition has enormous appeal.
You talk and the computer types. Or you talk and the computer does your bidding. The potential productivity benefits are phenomenal.
I decided to look at speech recognition again when I began to experience chronic tendonitis in my mouse elbow. I figured using voice commands would save much mousing, and thus much pain and discomfort. A good idea, but one that hasn’t worked out so far.
Speech recognition is a little like bears dancing. It’s so amazing that computers can do it at all; you’re almost willing to forgive them for not being able to do it very well. Almost willing.
The technology has been around for over ten years now and it’s supposedly getting better all the time. It has certainly improved since I last looked at it a few years ago.
But based on my experience reviewing the latest version of IBM’s market-leading ViaVoice product (ViaVoice for Windows, Pro USB Edition, Release 10), speech recognition still has a way to go.
Great When It Works
ViaVoice, when it works well, is a marvelous tool. You can dictate to Word and other programs, including Outlook and Outlook Express, Microsoft’s e-mail and personal information management programs, and see your words appear magically on the screen without having to touch the keyboard.
You can create dictation macros to insert your address in a document, for example, or call a boilerplate paragraph.
You can issue commands to, and make menu selections in, most Windows programs. You can even use natural language commands in some Microsoft Office programs. For example, say, “Make the document Arial bold blue” to change the typeface, or “Create a table here with three rows and four columns.”
You can issue voice commands to launch programs — “Start program Microsoft Word.” You can surf the Web with voice commands. And you can create voice command macros to perform a series of actions.
But getting to the point where you’re fluent in ViaVoice is no cakewalk. It takes a significant commitment of time and patience to install, configure and learn to use the program. And it does not work well on all systems — which is to say, it didn’t work well on my test system.
Can You Hear Me Now?
The problems started with installing the included Plantronics DSP-300 headset microphone, a highly rated noise-canceling microphone that supposedly ensures optimum speech recognition performance.
The Plantronics instructions included a sticky note pasted over the unit’s USB plug marked, “Install software before plugging in headset!” The trouble is, you can’t install the software on a Windows XP computer without first plugging in the hardware.
Sorting out this confusion took a call to Plantronics’ technical support — but only after 45 minutes of fruitless effort on my part — and then downloading a new XP version of the headset software. Even so, the software Plantronics supplied doesn’t have a Windows XP signature certifying that it’s 100 percent compatible. This may explain some later problems.
Installing the ViaVoice software itself was relatively painless. But once installed, the next step is to “train” the system to understand your particular speech patterns by dictating the provided text.
Count on at least 20 to 30 minutes just to do the bare minimum of training. After that, the program supposedly improves in accuracy the more you use it. In the meantime, don’t expect instant productivity.
For one thing, many of the voice commands you’ll want to use — you can always display a pop-up list of available commands by saying, “What can I say?” — turn out to be “not trained” yet.
Before ViaVoice will respond to these commands, you need to double-click on the command in the What-can-I-say list to launch a dialog that allows you to speak the words so ViaVoice can learn your pronunciation.
Command And Dictate
ViaVoice has two distinct modes — command and dictate. To move to dictation mode, in which the program transcribes every word you say, you speak a command — “Dictate to Word,” for example.
It’s amazing how much of your dictation the system will interpret correctly after even minimal training. You can even configure it to interpret dictation recorded on a voice recorder.
But specialized vocabulary — such as the vocabulary related to the high-tech topics I routinely write about — and proper names are still a big problem. You can tell ViaVoice to analyze a collection of your documents to find “new” words that it doesn’t know that you’re likely to use.
When I did that it turned up a long, long list. I then had to go through the word-by-word training process described above. It’s tedious.
You can also spell words while dictating, first issuing a voice command to move into spell mode, then saying each letter in the word or name.
But mistakes will inevitably be made, and mistakes slow everything down. To correct mistakes you either select them with a mouse and type over them — which theoretically helps the system recognize the word better next time — or use an unwieldy voice command process that involves selecting the correct word from a list of possible alternates.
Troublesome Troubleshooting
Beyond these almost inevitable start-up and learning issues, ViaVoice did not perform well on my system, a Dell 1.7GHz Pentium 4 machine with 256MB of RAM and gigabytes of free storage.
In most sessions, the Plantronics microphone did not work without repeatedly going into set-up dialogs and repeating steps that theoretically only needed to be done once — a problem I’m guessing relates to the Plantronics software.
Even when the audio set-up procedure reported that the microphone was adjusted for “excellent” reception, the system had difficulty interpreting many of my voice commands — even commands for which it was trained or for which I had made a point of training it using the word-by-word method.
Worse, ViaVoice appeared to make the whole system slightly unstable, shades of the bad old Windows 98 days on my normally rock-solid XP machine. Programs would crash; ViaVoice would simply stop recognizing anything.
In fairness to ViaVoice, at least some of these problems may have had to do with resource shortages. Mine is a loaded system, and I run McAfee’s virus protection tools, which eat up memory that ViaVoice also needs.
On the other hand, my system falls well within the program’s system requirements and it should certainly be able to co-exist with McAfee.
When it works the way it’s supposed to work — and I did see flashes of this — ViaVoice is a very impressive product. The question is, is it worth the trouble of getting it to work the way it’s supposed to work?
Parting Thoughts
I know from talking to attorneys and radiologists who started using the technology years ago, when it was new and nowhere near as sophisticated, that it is possible to get past the start-up woes I experienced and realize real productivity gains with speech recognition. I believe them. I just haven’t experienced it yet.
Recommendation: Buy only if you have a real need (disability, repetitive strain injury, serious keyboard incompetence) and count on a steep learning curve. Remember, patience is a virtue.