IBM Labs Boosting “The Spoken Web”

This article in the Economist magazine, entitled “A Web of Sound: Talk About That”, reminded me that the legend of using VoiceXML to speech-enable the World Wide Web is alive, well and targeting the greater good by making Web sites more accessible to the illiterate. The article’s author credits Guruduth Banavar, the director of IBM’s India Research Laboratory, with undertaking a project to make it easier to develop so-called “voice sites”, which enable callers to navigate the Web and retrieve personal information.

The “spoken Web” conjured up in the article will ride the coattails of the growth in wireless subscribers. However, unlike their commercial cousins, the speech-enabled contact centers, these “voice sites” run on relatively small servers and are designed to be local or personal in nature. The article’s author calls them “portals through which people can find out such things as when the mobile hospital will next visit their village, the price of rice in the local market and which wells they should use for irrigation.” To support speech-based browsing, IBM is employing a new linking mechanism called the hyperspeech transfer protocol (HSTP), which is the spoken equivalent of the hypertext transfer protocol that drives the “http://” in a visual browser’s navigation bar.

The development efforts are laudable and are a testament to the pervasive, global movement to extend the power of the Internet to mobile devices. There’s more than a little irony in the fact that the project is characterized as part of a research effort, rather than a marketing or product development initiative. IBM’s rich history with VoiceXML goes back a decade and a half. It was cultivated by a Speech Products Group in Boca Raton, FL, that for much of its life-span occupied the same building where the original IBM PC was conceived and prototyped in the early 1980s. The Speech Group was successively absorbed into the now-defunct “Pervasive Computing” business unit and then “mainstreamed” out of existence when its core product – called WebSphere Voice Server – migrated into the huge catalogue of WebSphere-branded middleware and application servers.

The coup de grâce for IBM Speech took place last January when IBM licensed a good deal of its source code to rival speech-processing company Nuance. In an advisory that we published at the time, we called it a variation of the “Win-Win-Win” formula. IBM would get upfront money for its licensed technology, Nuance would have a richer code base on which to build future products and services, and customers would benefit by having better products from both companies.

Instead, it has been back-to-the-future (more accurately, back-to-the-lab) for IBM’s VoiceXML efforts. Meanwhile, Nuance and its rivals are vying to make a living by making it possible for wireless subscribers to speak commands, search terms, navigational instructions and messages into their wireless devices. As a result, we anticipate a rich set of commercial products to come out this year. Some of the new devices and services may be “powered by IBM”, but in most cases, as with the “voice sites” and the development of HSTP, they will have a decidedly “alpha” test feel to them.



Categories: Articles
