<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Opus Research &#187; Apple</title>
	<atom:link href="http://opusresearch.net/wordpress/tag/apple/feed/" rel="self" type="application/rss+xml" />
	<link>http://opusresearch.net/wordpress</link>
	<description>Analysis and Expertise on Voice Services and Conversational Commerce</description>
	<lastBuildDate>Wed, 08 Feb 2012 18:55:10 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>That Didn&#8217;t Take Long! Siri-based Comparison Shopping Adds Best Buy Catalog</title>
		<link>http://opusresearch.net/wordpress/2011/12/16/that-didnt-take-long-siri-based-comparison-shopping-adds-best-buy-catalog/</link>
		<comments>http://opusresearch.net/wordpress/2011/12/16/that-didnt-take-long-siri-based-comparison-shopping-adds-best-buy-catalog/#comments</comments>
		<pubDate>Sat, 17 Dec 2011 01:02:07 +0000</pubDate>
		<dc:creator>Dan Miller</dc:creator>
				<category><![CDATA[CAT Scans]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[mobile commerce]]></category>
		<category><![CDATA[Siri]]></category>
		<category><![CDATA[voice search]]></category>

		<guid isPermaLink="false">http://opusresearch.net/wordpress/?p=5043</guid>
		<description><![CDATA[Spoken queries to Siri regarding electronic gadgets, appliances, games and computers will result in a display of responses that include the SKUs (stock keeping units) in Best Buy's catalog]]></description>
			<content:encoded><![CDATA[<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2011/12/siriwolfram.jpg"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2011/12/siriwolfram.jpg" alt="" title="siriwolfram" width="144" height="219" class="alignright size-full wp-image-5044" /></a>It&#8217;s quickly becoming apparent that Conversational Commerce and Recombinant Communications are inextricably intertwined. (Try saying that five times fast.) We&#8217;re in month 2 of Siri&#8217;s beta release on Apple&#8217;s iPhone 4S and we&#8217;re already witnessing how the service will improve as a product of natural selection, gradual upgrades, augmentation and evolution. As a case in point, spoken queries to Siri regarding electronic gadgets, appliances, games and computers will result in a display of responses that include the SKUs (stock keeping units) in Best Buy&#8217;s catalog. </p>
<p>You can find coverage of the phenomenon in dozens of tech publications today, but they seem to trace back to <a href="http://www.razorianfly.com/2011/12/16/siri-can-now-help-you-shop-at-best-buy/">this post</a> on the Apple-centric tech blog called RazorianFly.com. According to the post, the enhancement is very much the result of Wolfram Alpha (an answer-oriented search engine that is integrated into Siri&#8217;s search results) integrating with Best Buy&#8217;s product database through BestBuy.com&#8217;s API. Or, as more than one tech blog put it, <a href="http://www.webpronews.com/wolframalpha-brings-comparison-shopping-to-siri-2011-12">Siri now returns the same errors as a search on Wolfram Alpha</a>. </p>
<p>As Shaylin Clark at WebProNews explains in the post above, as a &#8220;computational engine&#8221; oriented toward asking questions, Wolfram Alpha can be quirky (he calls it &#8220;finicky&#8221;). But adding speech-based access to comparison shopping that includes Best Buy&#8217;s inventory marks progress, even if the results are not always optimal. The point is that end-users are gaining experience with the service. They are learning what it is good at and where it fails. </p>
<p>My empirical observation is that people are being much more patient with Siri than they had been with prior renditions of voice-based &#8220;assistants&#8221; (like Wildfire, HeyAnita or Webley). One reason is that the service is faster, better, more robust and capable of doing more things than its predecessors. There&#8217;s more knowledge in the databases that comprise its available knowledge (heck, it defaults to a search on Google, but it has maps, online music and Wolfram Alpha to bring to bear). It&#8217;s very early days and Siri is bound to get better. And it will inspire competing services from Google, Microsoft/Tellme, Amazon, Nuance, Vlingo and a handful of others. Each will add new features, functions, information and APIs to differentiate their services and deliver a better customer experience. </p>
<p>At this point Apple has taken a leadership position by coming to market with a service that&#8217;s instantiated as an embedded application that recognizes utterances accurately; determines context and meaning; and then has meaningful integrations with a broad range of knowledge bases so that it starts by recognizing intent and finishes by delivering relevant results. The truly exciting aspect of this is that the services from Apple and its competitors will continue to evolve and get better.</p>
]]></content:encoded>
			<wfw:commentRss>http://opusresearch.net/wordpress/2011/12/16/that-didnt-take-long-siri-based-comparison-shopping-adds-best-buy-catalog/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Twilio Positioning SMS as a Pre-API for Siri Development Efforts</title>
		<link>http://opusresearch.net/wordpress/2011/11/03/twilio-positioning-sms-as-a-pre-api-for-siri-development-efforts/</link>
		<comments>http://opusresearch.net/wordpress/2011/11/03/twilio-positioning-sms-as-a-pre-api-for-siri-development-efforts/#comments</comments>
		<pubDate>Thu, 03 Nov 2011 18:32:40 +0000</pubDate>
		<dc:creator>Dan Miller</dc:creator>
				<category><![CDATA[CAT Scans]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[Mobile Apps]]></category>
		<category><![CDATA[mobile speech]]></category>
		<category><![CDATA[Siri]]></category>
		<category><![CDATA[Twilio]]></category>

		<guid isPermaLink="false">http://opusresearch.net/wordpress/?p=4889</guid>
		<description><![CDATA[Twilio is encouraging developers to come up with interesting new applications for the iPhone 4S using the Twilio platform for SMS as a quasi-API.]]></description>
			<content:encoded><![CDATA[<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2011/09/twiliologo.jpg"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2011/09/twiliologo.jpg" alt="" title="twiliologo" width="151" height="60" class="alignright size-full wp-image-4784" /></a>In <a href="http://www.twilio.com/contests/2011/10/siri-video-developer-contest.html">this post</a>, the peripatetic promoter of cloud-based phone hacks, Twilio, encourages developers to come up with interesting new applications for the iPhone 4S, taking advantage of its speech-based assistant, <a href="http://opusresearch.net/wordpress/2011/10/04/siri-beta-assumes-primacy-on-iphone-4s-home-button/">Siri</a>. </p>
<p>Initiatives and contests like this one illustrate one more reason why Apple&#8217;s introduction of Siri is a signal event for mobile speech. The &#8220;application&#8221; (placed in quotes for reasons I will explain shortly) has its limits. In fact, it is not really an application in the traditional sense of the word. Like many of the downloadable &#8220;speech-enablers,&#8221; Siri defies categorization. There are &#8220;command and control&#8221; elements that fall in the category of &#8220;utility.&#8221; There are dictation and messaging components that make it a &#8220;communications&#8221; app. Finally, there are (or were) the links to 3rd party web sites that enabled Siri to transform the iPhone into a personal assistant. </p>
<p>The pros and cons of the speech-based mobile assistant were tossed around most recently when Google&#8217;s <a href="http://allthingsd.com/20111019/android-chief-says-your-phone-should-not-be-your-assistant/">Andy Rubin dismissed the idea at AsiaD (an All Things D conference)</a>. The gist of his criticism was that he&#8217;d &#8220;been-there-done-that-and-it-failed,&#8221; with reference to two speech-enabled personal digital assistants. One was <a href="http://en.wikipedia.org/wiki/General_Magic">General Magic</a>, which was spun out of Apple Computer back in 1990 and had a few high-visibility partnerships, with Sony, Motorola and AT&#038;T among others. </p>
<p>In hindsight, Rubin may see General Magic as a failure but, in fact, its engineers designed and developed a new operating system (Magic Cap) and scripting language (Telescript) that were precursors to VoiceXML and to efforts to create tools that support agile programming for speech-based, conversational interfaces. The technologies that started in General Magic live on in the automated speech offerings of GM OnStar. And somewhere in the intellectual property vault owned by Microsoft co-founder Paul Allen&#8217;s Vulcan Ventures are General Magic&#8217;s patents, which were bought at auction in 2002.</p>
<p>Rubin also made mention of Wildfire Communications, Inc., a company founded by Rich Miner, who is now a partner at Google Ventures. But Wildfire&#8217;s experience is quite different from General Magic. Founded in 1991, Wildfire built a very loyal following for its speech-enabled services which, at the time, were largely built around management of voice and telephony functions, like voicemail management, call origination, call answering and the like. In 2002, France Telecom&#8217;s Orange Wireless bought the company for $147 million and offered the service to its mobile constituency.</p>
<p>At the time, the service was well-received by mobile customers but, because it was originally engineered as an enterprise app, Orange realized that it would have to re-engineer the underlying technology platform in order to offer the service in sufficient scale. Instead, the telco opted to shutter the service in 2005. As <a href="http://www.theregister.co.uk/2005/07/05/orange_wildfire/">this article</a> by Tim Richardson in The Register explains, shutting down the service took longer than anticipated because of the protests of a loyal following of Wildfire users who, to this day, feel like Orange was too hasty in its decision to cease the offering.</p>
<p>Siri (as an Apple initiative) shares quite a few of the attributes of both General Magic and Wildfire that attracted the attention and imagination of developers. The big difference today is that modern computing power and storage make it possible to offer the service economically at scale. In addition, even without a formal API, the creative energy of 3rd party developers can be applied to enhancing the service using tools and scripting languages that have evolved into agile environments since the days of Magic Cap and Telescript.</p>
<p>Google has reason to be dismissive of Siri because it is important to call into question its ability to provide answers to questions that used to be the sole domain of the Google Search box (and therefore a source of advertising supported revenue for Google). But it can equally be argued that Voice Search and Voice Actions on the Android platform will benefit from general acceptance of speech-enabled assistants, like Siri. We have to see whether and when Apple introduces Siri as a downloadable app that runs on other devices and how well it (re)integrates the service with popular destination sites like Yelp!, OpenTable, Fandango, etc. Today Vlingo and Nuance&#8217;s DragonGo! have an advantage when supporting mobile ecommerce.</p>
<p>Greg Sterling and I will be issuing a report on &#8220;Mobile Speech Applications and Services&#8221; in the coming month. In it we will assess current initiatives and provide our insights and perspectives on the ultimate impact on local search and conversational commerce.</p>
]]></content:encoded>
			<wfw:commentRss>http://opusresearch.net/wordpress/2011/11/03/twilio-positioning-sms-as-a-pre-api-for-siri-development-efforts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Siri Factor, Three Days On</title>
		<link>http://opusresearch.net/wordpress/2011/10/16/the-siri-factor-three-days-on/</link>
		<comments>http://opusresearch.net/wordpress/2011/10/16/the-siri-factor-three-days-on/#comments</comments>
		<pubDate>Sun, 16 Oct 2011 15:52:34 +0000</pubDate>
		<dc:creator>Dan Miller</dc:creator>
				<category><![CDATA[CAT Scans]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[mobile speech]]></category>
		<category><![CDATA[Siri]]></category>

		<guid isPermaLink="false">http://opusresearch.net/wordpress/?p=4848</guid>
		<description><![CDATA[While most people are marveling at how well Siri recognizes and fulfills on many of their intentions, the inevitable criticism has begun.]]></description>
			<content:encoded><![CDATA[<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2011/10/Unknown.jpeg"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2011/10/Unknown.jpeg" alt="" title="Siributton" width="120" height="120" class="alignright size-full wp-image-4849" /></a>While most people are marveling at how well Siri recognizes and fulfills on many of their intentions, the inevitable criticism has begun. In this post called <a href="http://www.talkingpointz.com/siriously-this-sucks">&#8220;SIRIously This Sucks!&#8221;</a>, Colin Berkshire, a guest contributor on Dave Michels&#8217; new blog, recites a litany of deficiencies in the new service. The gist is that Siri is good at doing tasks that its developers anticipated &#8211; like setting the alarm clock, dictating text messages, getting directions &#8211; but &#8220;if you stray much off the beaten path it is like playing twenty questions with a belligerent two year old.&#8221;</p>
<p>Some speech app developers have piled on by noting that the service should do more &#8220;on the device.&#8221; It is crippled when the data link to the server is down (which happens quite a bit over AT&#038;T&#8217;s network &#8211; at least in SF). I would also note that Apple made no friends by discontinuing the Siri App for those (like me) who have it on their plain vanilla iPhone 4s with several more months on their contracts.</p>
<p>Sight unseen, I take the attitude that this is the reason Siri made the transition from approved app in the iTunes store to &#8220;beta&#8221; version of a native feature (meaning it ships pre-loaded and accessible through the &#8220;Home&#8221; button).</p>
<p>I take the attitude that this rendition of Siri is the worst one that the general public will encounter and that it can only get better. This started me thinking of computer graphics for the movies. Anyone who saw the first Star Wars was totally &#8220;wow&#8217;d!&#8221; and had little idea how much better it would get. Meanwhile, the producers of the film were already seeing all its faults and telling themselves that they were spending too much time on the stupid stuff like making sure that the strings holding up models of starfighters did not show.</p>
<p>Even in the days of Pixar, animators keep improving computer-generated images in subtle ways that make for a better viewer experience. The animators of the first &#8220;Toy Story&#8221; told themselves that &#8220;this is the worst looking movie we will ever make.&#8221; And so it was.</p>
<p>We should weather the criticism of &#8220;SIRIously sucking&#8221; that I&#8217;ve seen. We can only hope that the data link between device and server gets more consistent because the marriage of AI and speech rec that is required to provide a consistently successful user experience depends on it. And we need a better way for the app to work when the data link is down.</p>
<p>I&#8217;m pretty sure that Apple and the Siri folks are already addressing these issues. </p>
]]></content:encoded>
			<wfw:commentRss>http://opusresearch.net/wordpress/2011/10/16/the-siri-factor-three-days-on/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Update with Deal Terms: Nuance Acquires SVOX; Next Step in Battle Among Apple, Google, Microsoft, AT&amp;T and IBM</title>
		<link>http://opusresearch.net/wordpress/2011/06/18/nuance-acquires-svox-next-step-in-battle-among-apple-google-microsoft-att-and-ibm/</link>
		<comments>http://opusresearch.net/wordpress/2011/06/18/nuance-acquires-svox-next-step-in-battle-among-apple-google-microsoft-att-and-ibm/#comments</comments>
		<pubDate>Sun, 19 Jun 2011 02:12:43 +0000</pubDate>
		<dc:creator>Dan Miller</dc:creator>
				<category><![CDATA[CAT Scans]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[merger and acquisitions]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[Speech enabled search]]></category>

		<guid isPermaLink="false">http://opusresearch.net/wordpress/?p=4564</guid>
		<description><![CDATA[Nuance completed its acquisition of SVOX AG, the Switzerland-based provider of a full range of speech processing software.]]></description>
			<content:encoded><![CDATA[<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2009/08/NuanceLogo.png"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2009/08/NuanceLogo.png" alt="" title="NuanceLogo" width="166" height="107" class="alignright size-full wp-image-1194" /></a>On June 16th, <a href="http://www.bizjournals.com/boston/news/2011/06/16/nuance-acquires-swiss-voice-technology.html">Nuance completed its acquisition of SVOX AG</a>, the Switzerland-based provider of a full range of speech processing software. According to a <a href="http://bit.ly/md91Vs">Form 8-K filed with the Securities and Exchange Commission on June 16</a>, Nuance paid former stockholders of SVOX €87 million (approximately $125 million), of which €57 million was paid in cash at the closing; €8.3 million is payable in cash or shares of Nuance common stock on the first anniversary of the closing and another €21.7 million is payable in cash or shares of Nuance common stock on or before December 31, 2012.</p>
<p>This is a signal event in the global battle for supremacy taking shape among Apple, Google, Microsoft, IBM and AT&#038;T, among other technology giants that recognize that the future hinges on providing a highly-personalized user interface that mates speech technologies, &#8220;artificial intelligence,&#8221; embedded technologies and &#8220;cloud-based,&#8221; dynamic information and resources that comprise a consistent, &#8220;predictive&#8221; multimodal user interface.</p>
<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2011/05/Screen-shot-2011-05-24-at-10.54.58-AM.png"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2011/05/Screen-shot-2011-05-24-at-10.54.58-AM.png" alt="" title="SVOX Logo" width="151" height="39" class="alignnone size-full wp-image-4516" /></a>SVOX was founded in 2000 as a two-person company specializing in text-to-speech rendering. Over the past eleven years it has shown great creativity as it broadened its product offerings, adding ASR (automated speech recognition), acoustic processing (to isolate speech from background noise), dialogue management (bordering on artificial intelligence) and voice biometrics. Its only peer in product range (other than Nuance) is Loquendo, which is the speech-processing subsidiary of Telecom Italia. The company is profitable largely because of successes in licensing its multi-lingual TTS to a multiplicity of solutions providers, mostly &#8220;embedded&#8221; implementations, but also including Google (for Google Translate). </p>
<p>Nuance will be well-advised to take stock of the full range of SVOX&#8217;s technology solutions and their &#8220;fit&#8221; with all of its mobile and enterprise offerings. The rap on Nuance of late &#8211; in the <a href="http://www.businessweek.com/magazine/content/11_22/b4230037736600.htm">Speech Gospel According to Vlingo</a> &#8211; is that the company (as the largest, diversified provider of speech processing technologies) would rather acquire its competition than take it on in the marketplace. When all else fails, it resorts to the courts (where the intellectual property suits regarding speech processing and the mobile user interface are too numerous to be worth enumerating). But the truth of the matter, which many industry pundits fail to register, is that automated speech processing &#8211; be it text-to-speech rendering, speech recognition or speaker identification &#8211; is, almost always, merely part of a solution rather than a solution in and of itself.</p>
<p>The cold reality is that the market for the speech processing technologies developed by SVOX is driven by forces that are much larger than the sum total of all speech processing providers. Pundits have already rushed to point out that the SVOX acquisition comes in the wake of Apple&#8217;s non-announcement of its Siri-supported, iOS-based personal assistant (or in advance of the release of iOS5, as reported <a href="http://www.tuaw.com/2011/06/16/nuance-buys-svox-ahead-of-ios-5-release/">here</a>). That pits it squarely against Google Voice Search and related multimodal user interfaces on Android. The other major competitor is Microsoft, which is tightly coupling Windows Phone OS-based services with its own flavor of speech processing and the dialogue management and AI (artificial intelligence) resulting from cooperation between its Tellme subsidiary and Bing, its search engine business unit.</p>
<p>The other major players, of course, are AT&#038;T and IBM. AT&#038;T has invested in Vlingo and provides its core speech processing resources. Vlingo has demonstrated industry-leading (showcase) hands-free applications with device makers like Samsung and carriers like T-Mobile. IBM, on the other hand, has put its stock in Nuance by licensing its speech processing technology and forming a developmental joint venture to bring new technologies to market. You can now add SVOX&#8217;s intellectual property to the portfolio. SVOX, by itself, had to choose its battles and opted to focus on embedded TTS. The combo of IBM, Nuance and SVOX has a better chance to bring a formidable portfolio of solutions (ASR, TTS, voice biometrics, acoustic processing, and accompanying application logic) to market.</p>
<p>Collectively, this group of competitors is destined to define the next generation of virtual assistants. </p>
]]></content:encoded>
			<wfw:commentRss>http://opusresearch.net/wordpress/2011/06/18/nuance-acquires-svox-next-step-in-battle-among-apple-google-microsoft-att-and-ibm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Updated: No Difference Between Native and Captive: Apple To Leverage Both Siri and Nuance</title>
		<link>http://opusresearch.net/wordpress/2011/05/10/no-difference-between-native-and-captive-apple-to-leverage-both-siri-and-nuance/</link>
		<comments>http://opusresearch.net/wordpress/2011/05/10/no-difference-between-native-and-captive-apple-to-leverage-both-siri-and-nuance/#comments</comments>
		<pubDate>Tue, 10 May 2011 21:33:46 +0000</pubDate>
		<dc:creator>Dan Miller</dc:creator>
				<category><![CDATA[CAT Scans]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[mobile speech]]></category>
		<category><![CDATA[multimodal]]></category>
		<category><![CDATA[Nuance]]></category>

		<guid isPermaLink="false">http://opusresearch.net/wordpress/?p=4450</guid>
		<description><![CDATA[With Microsoft plunking down $8.5 billion of its $36 billion in cash and near cash to buy Skype, a few analysts have started to take a closer look at the $25 billion in cash and short term investments on Apple's balance sheet and apparently concluded that it is contemplating a deal with Nuance Communications.]]></description>
			<content:encoded><![CDATA[<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2011/05/nuanceapplelogos.png"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2011/05/nuanceapplelogos.png" alt="" title="nuanceapplelogos" width="151" height="71" class="alignright size-full wp-image-4456" /></a>With Microsoft plunking down $8.5 billion of its $36 billion in cash and near cash to buy Skype, a few analysts have started to take a closer look at the $25 billion in cash and short term investments on Apple&#8217;s balance sheet and apparently concluded that it is contemplating a deal with Nuance Communications. The discussion started at about the time that Google, Microsoft and Facebook were said to be in a three-way auction for Skype, whose S-1 filing (in preparation for an initial public offering of common stock) reflected a net loss in 2010 on revenues of about $850 million. <a href="http://techcrunch.com/2011/05/06/apple-nuance-ios-siri/">MG Siegler wrote a piece in TechCrunch</a> that described yet another three-way relationship. This time the dynamics involve Apple as parent of mobile assistance service provider, Siri (which <a href="http://opusresearch.net/wordpress/2010/04/28/if-true-apples-purchase-of-siri-heralds-new-age-of-virtual-assistance/">Apple acquired almost exactly one year ago</a>), and long-time customer/partner of Nuance, which sources both the speech processing capabilities that power Voice Control in iOS platforms and a number of downloadable apps to support dictation and predictive input of text-based content.</p>
<p>Yesterday Siegler published <a href="http://techcrunch.com/2011/05/09/apple-nuance-data-center-deal/">this story</a> pinpointing Apple&#8217;s new data center in the hills of North Carolina as the locus where at least some of the servers will be running instantiations of both Siri and Nuance-based applications so that mobile and hybrid apps running on the new iOS can optimize the interplay between speech-recognition or predictive texting, along with application logic and &#8220;artificial intelligence&#8221; to understand intent and deliver results. As Greg Sterling points out in <a href="http://www.internet2go.net/news/mobile-platforms/apple-discussions-nuance-broaden-speech-control-iphone">this post on Internet2Go</a>, Apple&#8217;s iOS-based experience has a bit of catching up to do vis-a-vis Google&#8217;s Android-based devices. </p>
<p>As I noted in <a href="http://opusresearch.net/wordpress/2010/08/16/voice-actions-for-android-speechable-moments-from-google-spell-new-market-dynamics/">this post</a> in August 2010, Google&#8217;s &#8220;Voice Actions&#8221; conditioned both users and application developers to expect spoken utterances to be one of the input modalities across all applications. A month later I described <a href="http://opusresearch.net/wordpress/2010/09/10/google-has-home-field-advantage-on-the-android-home-page/">Google&#8217;s &#8220;home field advantage&#8221;</a> when it introduced the many ways that a set of widgets could be used in Android that, in essence, made speech processing &#8220;native&#8221; to the operating system and therefore, of consistent use starting with the Home Screen and spanning all applications (like search) and utilities (like texting or dictation). Indeed, at Google I/O, Vlingo is showing off the latest version of its &#8220;Virtual Assistant&#8221; for Android-based phones. On a Samsung Galaxy 2, Vlingo is showing off speech-based access and control to a multiplicity of functions directly from the home page, but [contrary to what I may have implied here before] the Vlingo app connects directly to ASR resources and applications in Vlingo&#8217;s cloud.</p>
<p>Apple has been signaling its intent to meet and exceed Google&#8217;s speech-based offerings for a number of years now. In doing so, it has formed a broad (but not highly publicized) relationship with Nuance as provider of speech processing and predictive input for a broad spectrum of products and services. Followers of Siri know that roughly a month before its formal product launch in February 2010, it switched from its long-time speech recognition vendor to Nuance. At the time, it was thought to be avoiding a lawsuit.</p>
<p>From the mobile user&#8217;s point of view, there is no difference between Native and Captive. Google may have been first to market with speech-enabled services that smooth over the speed bumps between siloed applications. Apple, with an assist from &#8220;native&#8221; implementations of Nuance-based technology mated with &#8220;captive&#8221; Siri&#8217;s formidable combination of application logic and dynamic, e-commerce oriented data flows will try to meet and exceed Google&#8217;s efforts to provide the most pleasing user experience for goal-oriented mobile subscribers. The approach, which has been underway for more than a year now, obviates the need for Apple to spend billions of dollars to buy Nuance, but it will require a long-term relationship akin to the three-year joint development agreement between Nuance and IBM. </p>
]]></content:encoded>
			<wfw:commentRss>http://opusresearch.net/wordpress/2011/05/10/no-difference-between-native-and-captive-apple-to-leverage-both-siri-and-nuance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2011 Will See Stepped up Investment in &#8220;Speech And&#8230;&#8221;</title>
		<link>http://opusresearch.net/wordpress/2010/12/21/2011-will-see-stepped-up-investment-in-speech-and/</link>
		<comments>http://opusresearch.net/wordpress/2010/12/21/2011-will-see-stepped-up-investment-in-speech-and/#comments</comments>
		<pubDate>Tue, 21 Dec 2010 18:15:22 +0000</pubDate>
		<dc:creator>Dan Miller</dc:creator>
				<category><![CDATA[CAT Scans]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[Mobile Speech Apps]]></category>
		<category><![CDATA[Speech recognition]]></category>
		<category><![CDATA[user experience]]></category>

		<guid isPermaLink="false">http://opusresearch.net/wordpress/?p=3889</guid>
		<description><![CDATA[Today Apple posted a job listing for an "iOS Speech SW Application Engineer." The job involves working with the iOS Applications Framework Team in "a fast paced environment with rapidly changing priorities." ]]></description>
			<content:encoded><![CDATA[<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2009/08/Apple_logo.png"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2009/08/Apple_logo.png" alt="" title="Apple_logo" width="121" height="137" class="alignright size-full wp-image-1314" /></a>Today Apple posted <a href="http://jobs.apple.com/index.ajs?BID=1&#038;method=mExternal.showJob&#038;RID=68870&#038;CurrentPage=1">a job listing</a> for an &#8220;iOS Speech SW Application Engineer.&#8221; The job involves working with the iOS Applications Framework Team in &#8220;a fast paced environment with rapidly changing priorities.&#8221; </p>
<p>Earlier this year I coined the term &#8220;No rest for the RESTful&#8221; to dramatize how new tools and API&#8217;s to support agile programming would accelerate application development and service delivery. As we enter 2011, the mantra appears to be &#8220;Speech is Sexy,&#8221; as evidenced not just by Apple&#8217;s &#8220;Help Wanted&#8221; posting, but by <a href="http://opusresearch.net/wordpress/2010/12/03/googles-latest-acquisition-brings-text-to-speech-luminaries-into-its-fold/">Google&#8217;s recent acquisition of Phonetic Arts</a> and rapid-fire refinement of mobile user interfaces from Nuance, Vlingo (powered by AT&#038;T&#8217;s Watson Engine), Google and Microsoft/Tellme.</p>
<p>The big difference in 2011 is that speech is getting more pervasive while, at the same time, it is being subsumed into multimodal user interfaces. Microsoft, for instance, continues to call speech recognition &#8220;foundational&#8221; to its user interface but, with the introduction of Kinect, already puts much more emphasis on accurate recognition of gestures. Last year Google&#8217;s Mike Cohen explained his objective of making speech available as an alternative &#8220;every time&#8221; a keypad or keyboard is used on a mobile device.</p>
<p>2011 will be a year for smoothing out some of the rough spots in speech-enabling the user experience. Candidates include better (more accurate) recognition, noise cancellation, more &#8220;human sounding&#8221; text-to-speech rendering, speech-to-speech translation, low-latency interaction with dynamic data &#8220;in the cloud,&#8221; and (to keep things safe, secure and personalized) voice biometrics-based authentication or ID proofing.</p>
<p>But there&#8217;s been a fundamental change. I used to write about the &#8220;Voice User Interface.&#8221; In the coming year attention will be on the &#8220;Mobile User Interface&#8221; that includes voice. That&#8217;s why it should not be a surprise to see that the speech software engineer at Apple should be prepared to work on a &#8220;team&#8221; to accommodate the &#8220;rapidly changing priorities.&#8221; </p>
]]></content:encoded>
			<wfw:commentRss>http://opusresearch.net/wordpress/2010/12/21/2011-will-see-stepped-up-investment-in-speech-and/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pre-holiday News Rush for Nuance</title>
		<link>http://opusresearch.net/wordpress/2010/11/23/pre-holiday-news-rush-for-nuance/</link>
		<comments>http://opusresearch.net/wordpress/2010/11/23/pre-holiday-news-rush-for-nuance/#comments</comments>
		<pubDate>Tue, 23 Nov 2010 18:27:13 +0000</pubDate>
		<dc:creator>Dan Miller</dc:creator>
				<category><![CDATA[CAT Scans]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[Ask.com]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[mobile search]]></category>
		<category><![CDATA[Mobile Speech Apps]]></category>
		<category><![CDATA[TTS]]></category>

		<guid isPermaLink="false">http://opusresearch.net/wordpress/?p=3791</guid>
		<description><![CDATA[The week of Thanksgiving is usually characterized by a news lull, but Nuance has been an exception. ]]></description>
			<content:encoded><![CDATA[<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2010/11/1278079098.usr105634.jpg"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2010/11/1278079098.usr105634.jpg" alt="" title="AppleVoiceControl" width="180" height="141" class="alignright size-full wp-image-3794" /></a>The week of Thanksgiving is usually characterized by a news lull, but Nuance has been an exception. Yesterday it announced financial results for last quarter, which reflected more than 17% growth in top line revenues across all business units, driven by 34% growth in &#8220;mobile and consumer&#8221; and a near doubling in &#8220;imaging&#8221; solutions. Then the financial community was momentarily tantalized by a rumor (traced to a video interview with Steve Wozniak, of all people) that Apple &#8220;had bought&#8221; Nuance &#8211; a prospect that is not likely at this time.</p>
<p>But Nuance&#8217;s success, both in the marketplace and financial markets, is increasingly predicated on its successful support of new mobile, customer-facing apps. Corporate spending on automated speech is down, as reflected in a 4.5% decline in top line revenues for the &#8220;enterprise&#8221; business unit. That decline comes in spite of the fact that Nuance&#8217;s roster of clients includes a healthy mix of telecom and technology companies, like Acer, AT&#038;T, Comcast, Delta, Express Scripts, GM Onstar, IB System, Invomo, Metro PCS, Telekom Deutschland, Telstra, T-Mobile, and Vodafone.</p>
<p>The rumor of Nuance&#8217;s acquisition by Apple may be greatly exaggerated, but stories about successful integration of Nuance mobile speech processing into iPhone-based services are not. Today, just in time for the holiday shopping season, the company announced that its Dragon-branded technologies for &#8220;natural language&#8221; speech recognition, dictation and text-to-speech rendering are the foundation for a newly introduced <a href="http://www.nuance.com/company/news-room/press-releases/NC_007742">Price Check By Amazon</a> iPhone app. </p>
<p>The same technology provides a way for users of <a href="http://www.nuance.com/company/news-room/press-releases/NC_007741">Ask.com for iPhone</a> to speak their queries. Ask.com, now a property of Barry Diller&#8217;s IAC, was one of the first Web-based resources designed to build communities of peers (or experts) who could answer a visitor&#8217;s questions. Through the iPhone, Ask.com users can get immediate responses while on the go.</p>
<p>That&#8217;s why Nuance&#8217;s future is not tied to any impending acquisition as much as it is predicated on a succession of partnerships and integrations. The Woz may have confused Nuance with Siri in the now famous &#8220;gaffe&#8221;, but we must point out that Siri&#8217;s own speech recognition capability is &#8220;powered by Nuance.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://opusresearch.net/wordpress/2010/11/23/pre-holiday-news-rush-for-nuance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Voice Actions for Android: Speechable Moments From Google Spell New Market Dynamics</title>
		<link>http://opusresearch.net/wordpress/2010/08/16/voice-actions-for-android-speechable-moments-from-google-spell-new-market-dynamics/</link>
		<comments>http://opusresearch.net/wordpress/2010/08/16/voice-actions-for-android-speechable-moments-from-google-spell-new-market-dynamics/#comments</comments>
		<pubDate>Mon, 16 Aug 2010 14:06:27 +0000</pubDate>
		<dc:creator>Dan Miller</dc:creator>
				<category><![CDATA[CAT Scans]]></category>
		<category><![CDATA[Android]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[speech-enabled mobile services]]></category>

		<guid isPermaLink="false">http://opusresearch.net/wordpress/?p=3336</guid>
		<description><![CDATA[Google stocked the Android App store with a set of new "Voice Actions" Applications. From a functional point of view, it is the superset of speech-enabled mobile services.]]></description>
			<content:encoded><![CDATA[<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2010/05/android_logo.jpg"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2010/05/android_logo.jpg" alt="" title="android_logo" width="151" height="113" class="alignright size-full wp-image-2814" /></a>While I was in Canada on vacation, Google stocked the Android App store with a set of <a href="http://www.google.com/mobile/voice-actions/">new &#8220;Voice Actions&#8221; Applications</a>. From a functional point of view, it is the superset of speech-enabled mobile services. On new handsets (running the so-called &#8220;Froyo&#8221; &#8212; the Android 2.2 operating system), users will be able to initiate voice dialing, voice search (which equates to a Yellow Pages search based on Google Maps), messaging capabilities, music search and selection, and even map search and directions at the push of a single button, as depicted in this video demo:</p>
<p><object width="540" height="325"><param name="movie" value="http://www.youtube.com/v/tPPcTN5sdX4?fs=1&amp;hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/tPPcTN5sdX4?fs=1&amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="540" height="325"></embed></object></p>
<p>This being a demo, your own experience may be different. As head engineer for the Voice Actions project at Google, Mike LeBeau is quite adept at using the services in ways that are designed to impress and amaze. But this is not like Google Wave, where some of the most creative minds in collaborative computing and messaging invented and launched a platform to show the virtues of sharing on-screen information in real time with little attention to the actual user experience. This is a combination (I&#8217;d say &#8220;recombination&#8221;) of Google&#8217;s formidable speech recognition and dictation capabilities with Google Maps and various flavors of Google Search which, unlike Wave, puts a major focus on the user experience, especially for mobile phones.</p>
<p>The set of services has been seen as a direct competitive foray against the native, speech-enabled features on Apple&#8217;s iPhone (including the services that may spring from Apple&#8217;s acquisition of Siri), as well as the myriad of multi-platform applications from Nuance (Dragon), Vlingo, Promptu and even AT&#038;T. Perhaps more ominously, Google seems to be making the statement that it plans to compete with a crop of fledgling speech-enabled service providers, like <a href="http://www.phonetell.com/">PhoneTell</a>, a company that developed some nifty mashups of voice search and call handling on Android phones, in part because there has been less friction involved in invoking and gaining access to the speech processing and call processing features in the Android SDK.</p>
<p>It can be argued that the Colossus of Redmond beat Google to the punch a couple of weeks ago at SpeechTEK when Zig Serafin, general manager of the Speech Group at Microsoft, showcased a set of speech-enabled features for the Windows Phone 7 operating system. But Microsoft&#8217;s marketing efforts will be hampered by two major issues. One is the overall lack of traction around Windows Phone 7, which is one of several candidates for third place behind iPhone and Android in the race for smartphone market share (with the largely non-voice-aware BlackBerry in the same boat). </p>
<p>The other major impediment is Microsoft&#8217;s mixed message surrounding the &#8220;Natural User Interface.&#8221; Its attempt to leapfrog the pack involves adding &#8220;gestures,&#8221; exemplified by the full-body involvement of game-players using a feature called Kinect on the Xbox. It seems like a leap of faith to think that gestures will make a difference with small screens and mobile devices. Seems like Apple&#8217;s multitouch and Nuance&#8217;s predictive texting or services like Swype for input make a lot more sense.</p>
<p>As for Nuance, like Promptu and Vlingo, it has offered voice input for Android for several years now. As noted above, its differentiators are destined to be accuracy (which is the clay feet of all applications in the real world where background noise and microphone quality have greater impact than core recognition software), ease of use, and an existing installed base of happy users. From my perspective, Nuance&#8217;s potential trump card in this game is support of multiple modalities through applying several of the principles that support predictive texting across multiple means of input. We also believe that Nuance has something of a &#8220;most favored voice technology provider&#8221; status with both Apple and Siri, which could be an important factor in the battle for primacy among the top-tier smartphone providers (Apple versus a broad range of Android manufacturers).</p>
<p>When we look back on the summer of 2010, the launch of Voice Actions for Android will be seen as a signal event. It goes a long way toward re-establishing the spoken word as the natural input for a phone (duh!). That&#8217;s the benign part. On the darker side, Google once again shows that it is not neutral when it comes to claiming pre-emptive market share where it sees potential for growth. The result will be accelerated innovation in the name of competition. </p>
<p>Game on! </p>
]]></content:encoded>
			<wfw:commentRss>http://opusresearch.net/wordpress/2010/08/16/voice-actions-for-android-speechable-moments-from-google-spell-new-market-dynamics/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Action in the Mobile Voice Front</title>
		<link>http://opusresearch.net/wordpress/2010/07/15/action-in-the-mobile-voice-front/</link>
		<comments>http://opusresearch.net/wordpress/2010/07/15/action-in-the-mobile-voice-front/#comments</comments>
		<pubDate>Thu, 15 Jul 2010 18:40:36 +0000</pubDate>
		<dc:creator>Dan Miller</dc:creator>
				<category><![CDATA[CAT Scans]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[Mobile Platforms]]></category>
		<category><![CDATA[mobile voice]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[Vlingo]]></category>
		<category><![CDATA[voice user interface]]></category>

		<guid isPermaLink="false">http://opusresearch.net/wordpress/?p=3220</guid>
		<description><![CDATA[Mobile voice technology providers Apple, Vlingo and Nuance took actions that, to varying degrees, turn up the heat in the world of mobile voice. ]]></description>
			<content:encoded><![CDATA[<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2010/07/nipper.jpg"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2010/07/nipper.jpg" alt="" title="nipper" width="121" height="88" class="alignright size-full wp-image-3229" /></a>Mobile voice technology providers Apple, Vlingo and Nuance took actions that, to varying degrees, turn up the heat in the world of mobile voice. For its part, Apple has been granted yet another patent for a major component of a hands-free, voice user interface (VUI). In <a href="http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&#038;Sect2=HITOFF&#038;d=PALL&#038;p=1&#038;u=/netahtml/PTO/srchnum.htm&#038;r=1&#038;f=G&#038;l=50&#038;s1=7,757,173.PN.&#038;OS=PN/7,757,173&#038;RS=PN/7,757,173">U.S. Patent Number 7,757,173</a> the inventor describes a dynamic or &#8220;updateable&#8221; voice menu. As described in the filing, the technology is designed to offer many of the context-sensitive attributes of a dynamic, graphical user interface for search and retrieval of &#8220;media&#8221;, like recorded music; but the filing notes that &#8220;songs&#8221; or &#8220;music&#8221; could be &#8220;generalized to any form of digital media, which can include sound files, picture data, movies, text files or any other types of media that can be digitally stored on a computer.&#8221;</p>
<p>Some of what Apple describes conceptually, Vlingo is putting into practice with the <a href="http://">release of its SuperDialer for Android</a> application. Greg Sterling writes about it <a href="http://www.internet2go.net/news/local-search/vlingo-wants-take-siris-place">here</a>, noting that it is designed to take on Siri for local, mobile search. Yet, with &#8220;SuperDialer&#8221; Vlingo is delivering an easy-to-understand use case for a voice-based front end to messaging resources, social networks, search and, ultimately, transactions.</p>
<p>Nuance, for its part, reminds us that the automobile is destined to be the ultimate smart, mobile device. Nuance and Ford have jointly <a href="http://www.nuance.com/news/pressreleases/2010/20100715_MyFordTouch.asp">expanded the range of speech-enabled features and functions offered as part of &#8220;MyFord Touch&#8221;</a>. By adding more first-level commands and making the interface more dynamic and personal, the initiative is designed to make a person&#8217;s voice &#8220;the primary in-car communications interface.&#8221; </p>
<p>Establishing the primacy of a user&#8217;s voice for command and information entry in cars and on smartphones remains a tall order, but the speed at which solutions providers introduce new refinements is definitely accelerating.</p>
]]></content:encoded>
			<wfw:commentRss>http://opusresearch.net/wordpress/2010/07/15/action-in-the-mobile-voice-front/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Apple&#8217;s Actions Reinforce Momentum Toward Recombinant Communications in Customer Care</title>
		<link>http://opusresearch.net/wordpress/2010/04/30/apples-actions-reinforce-momentum-toward-recombinant-communications-in-customer-care/</link>
		<comments>http://opusresearch.net/wordpress/2010/04/30/apples-actions-reinforce-momentum-toward-recombinant-communications-in-customer-care/#comments</comments>
		<pubDate>Fri, 30 Apr 2010 19:33:42 +0000</pubDate>
		<dc:creator>Dan Miller</dc:creator>
				<category><![CDATA[CAT Scans]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[customer care]]></category>
		<category><![CDATA[Recombinant Communications]]></category>
		<category><![CDATA[SpeechCycle]]></category>
		<category><![CDATA[Webcast]]></category>

		<guid isPermaLink="false">http://opusresearch.net/wordpress/?p=2788</guid>
		<description><![CDATA[Apple appears to be doing everything it can to accelerate the demise of the PC as we know it, and I say "go for it!"]]></description>
			<content:encoded><![CDATA[<p><a href="http://opusresearch.net/wordpress/wp-content/uploads/2009/08/Apple_logo.png"><img src="http://opusresearch.net/wordpress/wp-content/uploads/2009/08/Apple_logo.png" alt="" title="Apple_logo" width="121" height="137" class="alignright size-full wp-image-1314" /></a>Apple appears to be doing everything it can to accelerate the demise of the PC as we know it, and I say &#8220;go for it!&#8221; It&#8217;s not just Steve Jobs&#8217; rabid resistance to using Flash on the iPhone. It is a series of decisions by Apple to shift attention from the PC form factor to a broad array of touch-sensitive devices. It will start at the Worldwide Developers Conference (WWDC) where, for the first time, <a href="http://www.pcworld.com/businesscenter/article/195177/apple_drops_mac_category_from_annual_design_awards.html">the design awards will be granted to iPhone and iPad AppStore submissions only</a>.</p>
<p>I was directed to one thought-provoking rationalization of the phenomenon by Charlie Stross <a href="http://www.antipope.org/charlie/blog-static/2010/04/why-steve-jobs-hates-flash.htm">here</a>. But the idea that the PC will be left behind (or at least marginalized) in the Recombinant Communications era was made even stronger when <a href="http://www.pcmag.com/article2/0,2817,2363291,00.asp">Microsoft, itself, endorsed HTML5 in conjunction with H.264</a> as the prescribed video codec, rather than a Flash player, as part of the forthcoming version of Internet Explorer (IE9).</p>
<p>Disputes about specific players, mobile platforms and device form factors are here to stay. As a matter of fact, it may be a desired state in that it should be the goal of every content and service provider to enable end-users to define how, when and where they gain access to the Internet and Web-based services. Fragmentation and mobility are a given.</p>
<p>At the recent Mobile Voice Conference, I was impressed by how the community of attendees &#8211; many of whom were Voice User Interface experts and specialists &#8211; had made the transition to advocacy of multichannel and multimodal interfaces. Google gets it. So does Nuance. And now Apple and Microsoft are making it clear that the interface of the future will try to make it as seamless as possible to carry out conversations that transcend space, time and modality. </p>
<p>In the coming weeks, Opus Research will be working with our clients and others to make sure that high-quality customer care remains part of the equation. In June, we will start with a Webcast and White Paper, produced in conjunction with SpeechCycle, called <a href="http://opusresearch.net/wordpress/2010/04/27/webcast-recombinant-communications-extending-care-to-anywhere-customers/">&#8220;Recombinant Communications: Extending Care to Anywhere Customers&#8221;</a>. While we call them &#8220;Anywhere Customers&#8221;, we&#8217;re really out to capture the idea that they are calling the shots and it starts with them selecting the time, place and nature of their interactions. </p>
]]></content:encoded>
			<wfw:commentRss>http://opusresearch.net/wordpress/2010/04/30/apples-actions-reinforce-momentum-toward-recombinant-communications-in-customer-care/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

