Nuance’s IBM Via Voice for Macintosh is a Disaster

I hadn’t meant for this forum to be about all things computer related – certainly not my personal struggles with software. But there is really no other forum for lambasting a product for truly terrible behavior. Couple this with a complete lack of customer support and frankly, I’m pretty upset. I’m hoping that by posting here, I can get it out of my system and get on with my work.

I make extensive use of voice recognition speech-to-text capabilities. My hands simply cannot type as much on a computer as I need to do in any given week. Getting speech transcription working has been a huge boon to me: I can get my work done without having to spend my weekends recovering with ice packs and ibuprofen.

When I used Windows, I was directed to Dragon Naturally Speaking. After a modest amount of training, that product allowed me to cut my keystrokes by more than half. While it’s not perfect, it does work, even picking up on my quirky manner of speech, which is part Americanisms and part Britishisms and filled with technical jargon. After every transcription session, Dragon can reanalyze what it has learned, including the corrections and its mistakes, to get a little more accurate.

All well and good, this was clearly a success for me.

Then, I switched to a Macintosh. There are many reasons for me to use a mac for my work, most notably: it’s a UNIX variant that allows me to run that tool set natively when I need it. This is a wonderful addition for me: no dual boot, no libraries on top of the OS (cygwin), just native UNIX (BSD) when I need it, which is fairly often. And, I don’t forget which OS I’m in, mistakenly typing “ls” when I need “dir” – smile.

To be fair, I do run Dragon in my Macintosh hosted XP virtual machine. But, that doesn’t help with native mac editing – most notably, I run native mac email, Thunderbird. Grump. So, I set out to get speech-to-text running on my mac. Oh, boy.

In my research, IBM Via Voice seemed to be the best alternative for mac transcription. To be fair, I haven’t tried MacSpeech. But I may yet, considering.

I run OS X 10.4.10 on a recently purchased Intel dual-core Macbook. IBM Via Voice hasn’t been updated since OS X 10.3! It doesn’t find the USB microphone no matter how one sets the sound preferences. Ugh! Nuance (currently the supporter of Via Voice) simply told me to return the software. In other words, unless you’re running a relatively ancient version of the OS, you’re unsupported. Are they really “supporting” this product, or just hoodwinking uninformed consumers such as myself into purchasing something that cannot work?

However, before returning the software (wish that I had!), I trolled online forums and discovered that the software will start if the sound preferences are open as it comes up. OK. That works. Now to use the thing.

Ah, but unlike Dragon, all the nice macros like “correct” and “select” that allow hands-free operation only work in the Via Voice Text Pad, not in all entry areas (like, uh, email, Word, you know, all the software that one might actually use to create documents on a mac). Uh, oh. OK, I can do my own editing if this thing will just transcribe reasonably accurately? Read on, no such luck.

Via Voice ships with a USB headset from Andrea, their NC-7100. It is NOT recommended for speech recognition! They have pricier models for that. Ah, a follow-on sale, huh? Using the shipped headset delivers poor recognition. Plus, setting it to the correct volume is very difficult. I used the headset not just for Via Voice, but also for my soft phone. I’ve gotten beaucoup complaints about being muddy, too loud, distorted, or inaudible on phone conferences. By the way, I live or die on phone conferences. OK, crappy headset. No wonder Via Voice doesn’t work, right?

This wouldn’t be so bad if I hadn’t already used Dragon, which ships with a cheapo non-USB (standard 1/8″ stereo plugs) headset that works fine. Comparisons make Via Voice look just awful. I could use almost anything for Dragon and it would work acceptably. Obviously, better headsets make a difference.

So, maybe I should get a better headset?

Oh, did I mention that unlike Dragon, Via Voice will not even start with (much less use) a headset that does not deliver audio through USB? I, in my other life as a musician, have access to really good microphones. One of the tests that I ran was to try improving recognition by using studio level microphones. No dice! If it’s not USB, Via Voice cannot make use of it, no matter how fine the sound.

Why in heck would you design a product to a particular sound input, considering the gazillion ways there are for computers to take in audio? What if I wanted to increase success by using a studio grade A/D converter (I own several!) to deliver audio quality at a level way above speech recognition needs? It’s a silly design to tie oneself to a technology that may be superseded. Let the user choose. Keep your product alive as technology changes. Pretty basic design principle.

One very good reason to use a cheap, light headset is when traveling. I may not want to carry a relatively heavy USB headset when on the road. Or, maybe I have a really nice bluetooth headset that I want to use, allowing me to break the wire tether to my machine. Let me choose, Via Voice. The mac supports all of the above well. I’ve tested them all. Via Voice, however, fails miserably.

Did the new headset improve things? I bought the VXi Parrot Translator. My web research shows this headset as one of the favorites for speech-to-text recognition. This headset, while noticeably better, does not take Via Voice to the realm of Dragon – not even vaguely.

“OK” says I, always up for a challenge, “perhaps I need to do a lot more training with my new headset?”

Bringing up the Via Voice training software is a nightmare (on Elm Street?) The program doesn’t analyze my speech. Instead, as near as I can tell, its purpose is to train me to speak like it expects. It has particular trouble with the beginning of phrases and especially the very common words: the, a, in, or, for, if.

I would expect (and this is the way that Dragon works) that after a few iterations, the machine would start to recognize my particular manner of enunciating the articles (and other words being analyzed, right?) Not on your life.

What the analyzer does is complain at you and refuse to move on until it gets something that it can understand. I’ve been on the same “short story, 30 minutes” for 3 days. I’ve repeated “a” and “the” and “if” hundreds of times to no avail. All I get is an error telling me to start at the underlined place. Ugh! This isn’t training, this is torture.

Mind you, my rating as a speaker at the conference mentioned in my last post was 5th out of 24 (from the top). Not bad. And, my audience must have been able to understand most of what I said, yes? While I can mangle the English language pretty badly (even worse in French and worse yet in Spanish!), I do produce pretty well articulated English language articles, I’m guessing?

My speech mannerisms, no matter how quirky, must be widely understandable. And they are, to Dragon. But Via Voice fails most often at these simple, common articles, often entering text so far off as to be laughable. If I weren’t trying to get something done efficiently, I would laugh. But it’s darn frustrating, I can assure you.

I still haven’t finished the “30 minute” story. It’s stuck on an “it” at 71%. What a lousy piece of software. When writing software, upon encountering the same user error continuously, one must assume that something else is wrong and take corrective action. I’ve written a lot of software. And the first rule is to expect the impossible and deal with it gracefully for the user. Via Voice just gets bogged down and collapses. If I force it, it will refuse to recognize anything except words one-at-a-time. Not sentences, not phrases, words one at a time. I type at 70-80 words per minute. Speaking one word at a time is incredibly laborious.
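
For the programmers in the audience, here is roughly what I mean by taking corrective action, as a minimal Python sketch. To be clear, this is not how Via Voice (or Dragon) is built; I have no idea what their internals look like. The names `run_training`, `TrainingResult` and the `recognize` callback are made up for illustration, and the limit of 3 attempts per word is an arbitrary assumption. The point is simply that a training loop should cap the retries on any one word, note the failure, and move on rather than trapping the user.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumed threshold, chosen only for illustration.
MAX_ATTEMPTS_PER_WORD = 3


@dataclass
class TrainingResult:
    """Outcome of one pass through a training script (hypothetical type)."""
    recognized: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)


def run_training(
    script_words: List[str],
    recognize: Callable[[str, int], bool],
    max_attempts: int = MAX_ATTEMPTS_PER_WORD,
) -> TrainingResult:
    """Walk through the script, retrying each word at most max_attempts times.

    recognize(word, attempt) is a stand-in for the real recognizer: it
    returns True when the spoken word was understood on that attempt.
    """
    result = TrainingResult()
    for word in script_words:
        for attempt in range(max_attempts):
            if recognize(word, attempt):
                result.recognized.append(word)
                break
        else:
            # Same failure every time: assume something else is wrong,
            # record the word for later analysis, and let the user continue.
            result.skipped.append(word)
    return result


# Example: a fake recognizer that can never make out the word "it".
outcome = run_training(["the", "it", "cat"], lambda word, attempt: word != "it")
print(outcome.recognized, outcome.skipped)
```

With something like this, my stuck “it” would land in the skipped list after three tries and the story would carry on to 100%.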

So, my considered recommendation is: don’t try Via Voice. Maybe MacSpeech is better? I don’t know. I’ll let you all know if I try it.

I will send this link to Nuance for their consideration.

frustrated, with hurting hands after typing this missive.