Three Cheers for “Hybrid” Speech Apps

If there is a positive take-away from SpinVox’s recent public relations disaster, it should be a general realization that there is nothing inherently wrong with human intervention on behalf of a more successful automated speech interaction. When it comes to popularizing speech-enabled or speech-initiated services, repeated use will correlate with high success rates, and even the most accurate recognizers top out around 90%, which means they will fail 10% of the time. Unfortunately, those 10% of misunderstood utterances can often be names of people or places, whose misidentification can render the entire message meaningless or inaccurate.

As I’m discovering while compiling my report on “Mobile Speech Applications,” accurate rendering of spoken words will always require some sort of human intervention. At a minimum, “the system” needs some understanding of the “domain” under discussion. In other words, a speech-to-SMS application may involve a broad, generic vocabulary (called a grammar in speech recognition terms), but chances are it will not include medical terms. Likewise, a local mobile search application benefits from using a grammar that includes local street names and points of interest. The human touch figures into this formula because people can be employed to “tag” captured utterances, assigning them to the proper domain so that the system can employ the most appropriate grammar.
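
To make the tagging idea concrete, here is a minimal sketch in Python of how a human-assigned domain tag might steer grammar selection. Everything in it is an assumption for illustration: the grammar names, the tiny word lists, and the functions `select_grammar` and `out_of_grammar` are hypothetical and do not come from SpinVox or any real recognizer.

```python
# Hypothetical sketch: these vocabularies and names are illustrative only,
# not taken from any real speech recognition product.
GRAMMARS = {
    "generic": {"call", "me", "back", "tomorrow", "meeting", "at", "noon"},
    "medical": {"dosage", "prescription", "milligrams", "refill"},
    "local_search": {"pizza", "near", "main", "street", "museum"},
}


def select_grammar(domain_tag):
    """Return the vocabulary for a human-assigned domain tag.

    Unknown tags fall back to the broad, generic grammar.
    """
    return GRAMMARS.get(domain_tag, GRAMMARS["generic"])


def out_of_grammar(words, domain_tag):
    """List the words that the selected grammar does not cover."""
    grammar = select_grammar(domain_tag)
    return [w for w in words if w.lower() not in grammar]


if __name__ == "__main__":
    utterance = ["refill", "prescription"]
    # The same utterance is fully covered once a person tags it as medical,
    # but falls entirely outside the generic grammar.
    print(out_of_grammar(utterance, "medical"))
    print(out_of_grammar(utterance, "generic"))
```

The point of the sketch is only that the tag, not the recognizer, decides which vocabulary is in play; a real system would use far larger grammars and statistical models rather than simple word sets.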

I’ll admit that the SpinVox debacle devolved into a discussion of privacy violations, in that entire .wav files containing “private” conversations were shipped wholesale to contact centers for interpretation and transcription. In my mind, the issue here is one of disclosure. If voicemail messages, which are supposedly private, are being sent offshore for transcription, that should happen only with informed consent, and callers should be given the ability to “opt out” if they don’t want their calls transcribed. I, personally, don’t think such a protocol would have a chilling effect. Certainly, it would be less deleterious to the Mobile Speech Application movement than the innuendo that seems to crescendo when incidents like the SpinVox revelations call privacy into question.


