Speech Recognition: Afterthoughts on Its Strengths and Weaknesses

I have to admit that I hadn’t used any kind of speech recognition in a long time when I was asked to do the tutorials. The last time I had tried speech recognition, it was with Dragon Dictate, which was new when Windows 98 was also new. A very long time ago in computer years!

I was happy to tackle this subject because I was very interested to see how speech recognition has improved. And boy, has it improved! Even a basic, built-in application like this did an amazingly good job "right out of the box." In this final article about Speech Recognition, I’d like to talk about what I learned while reacquainting myself with the wonders of speech recognition, and where I’ll be going from here.

Not ready for Star Trek yet

I’m sure many of us watched the crew of the Starship Enterprise saying "Computer!" and getting an immediate answer. We don’t have Starfleet computers yet, but beginning with Windows Vista and continuing in Windows 7, we do have computers that will listen to us and respond to what we tell them—and answer us, if "What was that?" is considered an answer.

I did have problems when I first tried to use Speech Recognition, and troubleshooting wasn’t particularly straightforward. The help files are not always helpful enough, but I was able to find the answers on Microsoft’s web site and in an assortment of online forums without too much work. That’s how I found out that I had forgotten that my webcam (sitting right in front of me on top of my monitor, and directly in line with the way I was speaking) also had an active microphone and was adding to the confusion. Once I fixed that, it was pretty smooth sailing.

I even tried speaking with an assortment of different accents (BBC British and American redneck, for example) and was able to get reasonably good recognition, allowing for differences from standard American pronunciation. Of course, saying "Friends, Romans, countrymen, lend me your ears!" while doing my very best impressions of Helen Mirren and Jeff Foxworthy made me laugh too much to get entirely accurate results.

Language recognition

Speech Recognition can be used with different languages, and I thought I might try it out with my limited, American-accented Spanish, German and French. Unfortunately, you cannot use other languages unless your operating system is also in that language. You can change your operating system’s language by installing another language pack from Microsoft, but you can only do that if you’re running Windows 7 Ultimate or Windows 7 Enterprise.

Speech Recognition is available for US English, UK English, French, Spanish, German, Japanese, Traditional Chinese and Simplified Chinese, and will be found in those languages’ versions of Windows 7 (all versions). I was sorry not to be able to try that out. I have no idea what will happen with Windows 8, but I think the ability to install other language packs would be a good addition to the Windows 8 equivalents of Windows 7 Professional and above.

What works well

As I mentioned, Speech Recognition is designed to work best with other Microsoft software. As long as I experimented with Microsoft products I was very successful (although as one might expect, using Microsoft Office Excel was both limited and complicated). With other software it was hit or miss. I could use the Google Chrome browser fairly well (definitely not as well as Internet Explorer) and my Eudora email program, which is pretty much antique software by now. It’s worth experimenting with your own favorite software to see what you can do. The "show numbers" command was especially helpful in selecting items and commands.

I also found that it didn’t take very long for the accuracy of the recognition to improve markedly. I went through the training exercises twice, and after that the recognition was almost 100% correct. I was able to speak a little faster and put in fewer pauses for the software to keep up. I really enjoyed watching my voice translated into words on the screen. My early experiences with speech recognition software were nowhere near this pleasant.

What doesn’t work well

As I mentioned, some software is just incompatible with Speech Recognition. I couldn’t even open Adobe Reader or the Adobe AIR version of TweetDeck. I found that I could not sign into my Google account with Internet Explorer to try out Google Docs—there seemed to be no way to speak or spell my password. I suspect this is a security issue, not allowing passwords to be spoken out loud where someone else might hear, but it was annoying.

I could open iTunes and select a song to play, but could not actually get it to play. I could open Scrivener (my word processor of choice) but "Show numbers" did not overlay numbers on anything I wanted to use. I didn’t do any really extensive experimentation with my favorite software—those are just a few that I tried. It would be worthwhile for anyone who wants to use Speech Recognition to test out the programs they want to use it with, to be sure it’s going to be compatible.

More links and resources

If you haven’t already seen the previous articles, you can find them here:

Oddly, it’s almost impossible to find any information about Speech Recognition on the Microsoft Answers web site without clicking a link from a Google or Bing search. I was unable to get any answers at all by putting "Speech Recognition" into the search box, even though there are a few questions about it in the forums. Use this link to get Speech Recognition help from the Windows web site: Speech Recognition search results.

Here’s a brief Wikipedia article that talks about the history of speech recognition at Microsoft: Windows Speech Recognition.

Here is a blog entry that gives the author’s thoughts on comparing Speech Recognition with Dragon NaturallySpeaking: Dragon NaturallySpeaking Versus Windows 7 Voice Recognition.

Wrapping it up

I really enjoyed working with Speech Recognition and marveling at the improvements that have been made over time. It would certainly be good enough for casual everyday use, especially with Microsoft products.

Will I keep using Speech Recognition? Yes, when I can. At this point I don’t need anything more sophisticated. It was well worth the time it took to train it and to train myself to use it right.