Siri Debuts on iPhone: Speech-based Virtual Personal Assistant

Today the App Store on Apple’s iTunes site begins distributing Siri, a new app that transforms the iPhone into a “virtual personal assistant.” I know, we’ve heard the term before, describing precursor services like Wildfire, HeyAnita, or the products of General Magic. Yet in all those cases, the principal role of the virtual assistant was to handle scheduling, messaging, and simple directory-based activities (call origination, incoming call handling, and the like).

Siri is set apart because it applies the depth of knowledge its founders and software specialists have built at SRI and elsewhere in creating a “cognitive assistant that learns and organizes” (CALO). Siri users benefit from a voluminous amount of pre-processing and organization of information that has been carried out “in the cloud” on their behalf.

The image above illustrates Siri’s landing page. The illustrated topic areas serve as reminders of the sort of often-asked-for information the service is tuned to handle. The page also suggests phrases that users might try to get the information they want. Note that the suggestion below “Movies” is “PG-13 movies this afternoon,” illustrating that the “artificial intelligence” ingrained in the service is quite capable of knowing a movie’s rating and the meaning of “this afternoon,” as well as the physical location of the originating user. And, given the precepts of CALO, responses get more accurate and useful as the system acquires more usage history.

I’ve had the service for a couple of days, and here are my initial reactions. My overall experience has been quite positive. The quality of voice recognition (powered by the same “engine” that supports Dragon Dictation and Dragon Search on the iPhone) is quite good. It has been accurate both indoors and out. More importantly, the results are displayed in a large white box for editing before submission. This form of spoken “utterance triage” is a must for speech-enabled applications and will ultimately give users a chance to correct punctuation and capitalization, in addition to spelling.

Response time could feel a bit draggy (the general public hates latency); but, on the positive side, the answers were qualitatively different from those of a general search engine (like the voice-activated Google app for the iPhone). Put simply, the service is more “domain aware.” It recognizes the difference in intent between a query about a taxi and one about a movie and responds accordingly. The request for a “taxi service” is a great example. Google serves up links to various local taxi cab services in the area, including phone numbers and a means to get directions.

Siri, by contrast, assumes that you want a taxi immediately and serves up a form, using Taxi Magic (powered by RideCharge), to book a ride based on your location and a specified time. Before delivering the form, however, Siri displays a number of comic-book-style dialog balloons with statements in plain English to tell you how it is processing your request. For example, it might say “I found these taxis within walking distance” or suggest another way to interpret your utterance, such as “Get me a cab.”
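To make the “domain aware” distinction concrete, here is a minimal sketch, in Python, of how a structured interpretation of an utterance differs from a plain keyword search. The intent names, slots, and the parse_utterance helper are hypothetical illustrations of the general technique, not Siri’s actual API or internals.

```python
# Hypothetical illustration only -- not Siri's real API or internals.
# A generic search engine treats the utterance as keywords; a domain-aware
# assistant maps it onto an intent plus slots it can act on.

from datetime import datetime

def parse_utterance(text, user_location):
    """Toy intent/slot parser covering two domains: taxis and movies."""
    lowered = text.lower()
    if "taxi" in lowered or "cab" in lowered:
        # Intent: book a ride now, near the user, rather than list web links.
        return {
            "intent": "book_taxi",
            "pickup_location": user_location,
            "pickup_time": datetime.now().isoformat(timespec="minutes"),
        }
    if "movie" in lowered:
        # Intent: find showtimes, with constraints pulled from the phrasing.
        return {
            "intent": "find_movies",
            "rating": "PG-13" if "pg-13" in lowered else None,
            "timeframe": "this afternoon" if "afternoon" in lowered else None,
            "near": user_location,
        }
    # Fall back to ordinary web search when no known domain matches.
    return {"intent": "web_search", "query": text}

print(parse_utterance("Get me a cab", user_location="San Jose, CA"))
print(parse_utterance("PG-13 movies this afternoon", user_location="San Jose, CA"))
```

The point of the sketch is simply that a domain-aware assistant can hand the taxi interpretation to a booking partner and the movie interpretation to a showtimes lookup, whereas a keyword search can only return links.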

In each of the chosen categories, the search engine is designed to accelerate the process of search and decision-making that culminates in a purchase or transaction. The company’s financial success will be predicated on supporting multiple transactions and taking a percentage of the revenue generated. That’s another big difference between Google Voice Search and Siri.

Based on my experience, I encourage people to download and gain experience with Siri, just as it gains experience with you.


