Bubbly: The Voice Equivalent of Twitter or Just a Bubble?

2010 March 12

I’m fascinated with a firm that has raised $30 million in venture funding over the past five years and was recently profiled in Advertising Age as having 100 million users for a product that lets people leave voice messages for others. More importantly, it garnered 500,000 users in India in less than weeks for a product that it characterizes as a voice-based replacement for Twitter. The new service is called Bubbly. It is a branded offering from Bubble Motion, a firm with headquarters in Mountain View, CA, and Singapore.

Bubble Motion’s CEO, Thomas Clayton, told AdAge that the company did no traditional advertising for Bubbly. Instead it built a viral marketing campaign by paying for endorsements from two very popular Bollywood stars, Kareena Kapoor and Aamir Khan, who ‘bubbled’ about their new film “Three Idiots” in advance of its premiere.

BubbleTalk built its user base primarily in the Far East. Incumbent carriers offer it as a “virtual phone” service, which enables people to dial a number and leave a message for other people to retrieve from public phones or wireless handsets. As Niraj Sheth wrote in the Wall Street Journal in February, BubbleTalk, the virtual phone service, costs 75 paise, or 1.6 cents, to originate a message. Listening to the message more than once carries the same fee. Bubbly, by contrast, enables users to post a voice message for free. The carrier then charges listeners “at least a minute of airtime per message”. Parent company Bubble Motion takes a cut of the carrier’s incremental revenue.

After India, the company plans to launch Bubbly with carriers in Brazil and Japan. With 100 million users in the Far East, the company has hands-on experience with cultures that put less currency in text messages and the written word. Yet, on a global basis, texting and tweeting are largely silent endeavors. Nuance, Vlingo, ShoutOut, Yap and even Google are investing in a variety of services that are made better with speech input. They are finding that people turn to dictation of tweets, emails and texts as they gain confidence with the technology and stop minding that others may overhear their messages.

Bubbly is different. It is definitely not a dictation or transcription service. It bears a greater resemblance to the “Group Access Bridge” or “GAB” lines that were all the rage as a pay-per-call service in the 1980s. Service providers encouraged people to dial in to a number that put them into chat mode with like-minded folks (often spinning toward adult conversation). GAB lines eventually succumbed to the simple economic fact that the people who used them the most were those who could least afford to pay for the service. Promoting usage was never a problem for these highly social, phone-based services. Billing and collections were the problem that the carriers could not overcome, and it started with users who DAK (or “Deny All Knowledge”) of making the call.

We don’t want to pour cold water on what is emerging as a very hot phenomenon for the phone-using public. However, we do want the service-providing community to be aware of potential pitfalls that have plagued similar services in the past. It may not be hard to get 500,000 people to listen to messages from movie stars. But that may not lead to the constant stream of calls that makes such an offering sustainable. Remember, the leading categories for pay-per-call were ’scopes, jokes and soaps, referring to the daily horoscope, joke of the day and a digest of soap opera plots.

Two Sides of eGovernment

2010 March 12

I was going to write this post under the headline “The Schizoid Nature of eGovernment” under the false assumption that “schizoid” meant some variation of “schizophrenic” which I’ve always associated with “having a dual personality.” I was surprised to learn from a variety of Web-based dictionaries that “schizoid” refers to a personality disorder “marked by dissociation, passivity, withdrawal, inability to form warm social relationships.” That’s definitely not what I meant.

Instead, this post is inspired by the fact that my inbox was graced, first, with this link to this story from Cisco’s newsroom describing a joint initiative with CSC eGovernance Services India, to extend medical and educational services to remote communities through “Common Service Centres.”

As explained in the release: “The Common Service Centres program is a strategic cornerstone of the government of India’s National e-Governance Plan (NeGP) with 250,000 CSCs planned across 600,000 villages.” In the bullet points that follow, there’s lots of emphasis on two specific Cisco products: WebEx on Demand, for distributing telecourses, and Cisco HealthPresence, for telemedicine. But with all that broadband capacity, it’s a certainty that the better-educated and better-cared-for villagers can define other applications to take advantage of connectivity and collaboration.

As the email and blogging gods would have it, the next post in my inbox carried the headline “Governments Use Internet as Tool for Control”. In it, reporter Emma Woollacott cites a report issued today by the US Department of State which, using China as the primary culprit, claimed that governments can “monitor internet use, control content, restrict information, block access to foreign and domestic websites, encourage self-censorship, and punish those who violated regulations.”

That’s not a pretty picture, but it was one augmented by a report entitled “Enemies of the Internet” from Reporters without Borders, which lists the countries that are most abusive of the Internet’s ability to serve both public and individual needs. Burma, North Korea, Cuba, and Turkmenistan block Internet access altogether. The report adds: “For economic purposes, China, Egypt, Tunisia and Vietnam have wagered on a infrastructure development strategy while keeping a tight control over the Web’s political and social content.”

Such policies pose a real threat to the healthy growth of innovative applications on the global Internet.

SF Opens a Door; AT&T Closes One

2010 March 11

Two communications-oriented news stories make us long for more “public options” or at least more options for the public to build its own mobile solutions. On the private enterprise side, AT&T Mobility surprised subscribers who bought the Motorola Backflip (its only Android-based offering) by opting to support only what it calls “trusted” applications, meaning those offered in AT&T’s marketplace. It provides no mechanism to install other applications (including those that were purchased and installed on SD cards inserted in the device).

On the side of sunlight and openness, the City and County of San Francisco leveraged the efforts of many other cities, developers and non-profit organizations to publish an “open API” for its 311-based non-emergency services hotline. This is Recombinant Communications at its best. A well-understood access technology (the venerable three-digit short code) is being deployed to offer more public service-oriented applications and to offload traffic from the over-burdened 911 emergency line. It will emerge as a channel for better “eGovernment” in an era when budget cuts spell reduced staffing and long lines at public offices.

Meanwhile, market forces have convinced AT&T Mobility that (a) it needs to have at least one Android device on the shelves of its retail stores but (b) Google is a competitor whose products can only be offered within designated territories. That’s why Yahoo!, not Google, is the default search engine on the Backflip and why it is technically impossible for subscribers to shop around and personalize their devices with applications of their choice.

AT&T’s customers are short-changed by this short-sighted policy. Today, such heresy against openness and Recombinant Communications is part of an inside game and goes largely unnoticed. But the battle for share and survival among “mobile platform providers” (referring to the mobile OS and application delivery environments, like iPhone, Android, Blackberry, Symbian, Windows 7…) is heavily influenced by the policies and practices of mobile carriers. AT&T’s conditional support of Android is destined to be regarded as cynical, ineffective and, in the long run, unsustainable.

Safe Driving: Another Speechable Moment

2010 March 10

A briefing with the principals at ZoomSafer inspired me to think, once again, about the important, yet supplementary, role that speech processing technologies have to play in making for safer motoring. With the CTIA (Cellular Telecommunications and Internet Association) conference on the near horizon, coverage in the general media is destined to recite the litany of statistics about accidents and loss of life caused by “distracted drivers.”

AT&T Mobility is doing its part to cast a sharp light on the problem. It has launched a nationwide campaign of public service announcements designed “to raise awareness about the risks of texting and driving and remind all wireless consumers, especially youth, that text messages can – and should – wait until after driving.” Advertising initiatives are largely ineffective unless accompanied by some other form of restraint or constraint. A white paper published by ZoomSafer notes that, at any moment in time, over 810,000 autos are being driven by people who are actively using their cellular phones. This is the sad case in spite of the fact that texting while driving is banned in a total of 21 states or territories.

ZoomSafer is a solution provider that develops and markets software that enables its users (both corporate and personal) to define and manage policies that govern the use of mobile devices or, as CEO and Founder Matt Howard put it, “promote safe and legal use of cell phones while driving.” The solution has three parts. A Web site enables users to identify the policies that they wish to enforce (for example, to prohibit reception or origination of text messages or phone calls when the device is moving faster than 10 mph). Client software on the handset detects speed and “enforces” the designated policies. Finally, and this is the “speechable moment” aspect of the solution, ZoomSafer and Irish voice application service provider Dial2Do offer a service called “Voice Mate”, which provides single-button control of TTS-based reading of emails or texts as well as dictation of replies, emails or texts.
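To make the speed-based policy idea concrete, here is a minimal sketch in Python. It is not ZoomSafer’s code or API. The names `DrivingPolicy` and `is_allowed` are hypothetical, and the only detail taken from the briefing is the example rule of blocking texts or calls when the device is moving faster than 10 mph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrivingPolicy:
    """Hypothetical policy record, as a user might define it on the Web site."""
    speed_threshold_mph: float = 10.0  # example threshold from the post
    block_texts: bool = True
    block_calls: bool = True


def is_allowed(policy: DrivingPolicy, event_type: str, speed_mph: float) -> bool:
    """Return True if the handset should let the event through.

    event_type is "text" or "call" (an assumed vocabulary for this sketch).
    Events are only restricted while moving faster than the threshold.
    """
    if speed_mph <= policy.speed_threshold_mph:
        return True  # stationary or slow: nothing is blocked
    if event_type == "text":
        return not policy.block_texts
    if event_type == "call":
        return not policy.block_calls
    return True  # unknown event types are left alone in this sketch
```

A handset client built along these lines would call `is_allowed(policy, "text", current_speed)` each time a message arrives or is composed. A service such as Voice Mate would then take over the blocked texts and read them aloud.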

The theme of AT&T’s national campaign is “No text is worth dying for,” and its tagline is “Txtng & Drivng … It Can Wait.” The carrier also uses this Facebook page to encourage users to take the pledge not to text while driving.

I see ZoomSafer picking up where such pledges leave off. The company sees three distinct market segments: teens (or rather their parents), “pro-sumers” (meaning mobile professionals) and corporations. For $2.99 each month, it gives subscribers the ability to define and enforce their own policies against distracted driving. The addition of Voice Mate brings the monthly rate to $5.99. Corporate customers pay $10 per handset per month to manage, enforce and audit their policies.

“Policy Enforcement”, meaning keeping people true to their stated intentions, is the crux of ZoomSafer’s value proposition. The economic benefit arises from loss reduction, lawsuit avoidance and compliance with existing laws. However, for those to whom communications deferred is communications denied, the delivery of voice renderings of text and the spoken origination of email or texts will turn out to be a bargain at an incremental $3 per month. Combining speech-enabled services with broader service offerings is destined to be the norm.
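The pricing quoted in this post is simple enough to check with a few lines of Python. This is only an illustrative sketch. The monthly rates come from the post, while the function names and the 50-handset fleet in the usage note are assumptions made for the example.

```python
from decimal import Decimal

# Monthly rates quoted in the post (US dollars).
BASE_MONTHLY = Decimal("2.99")                    # policy definition and enforcement
WITH_VOICE_MATE_MONTHLY = Decimal("5.99")         # base plus Voice Mate
CORPORATE_PER_HANDSET_MONTHLY = Decimal("10.00")  # manage, enforce and audit


def voice_mate_increment() -> Decimal:
    """Extra monthly cost of adding Voice Mate to the base subscription."""
    return WITH_VOICE_MATE_MONTHLY - BASE_MONTHLY


def corporate_monthly_cost(handsets: int) -> Decimal:
    """Monthly corporate charge for a fleet of the given size."""
    if handsets < 0:
        raise ValueError("handsets must be non-negative")
    return CORPORATE_PER_HANDSET_MONTHLY * handsets
```

Here `voice_mate_increment()` returns `Decimal("3.00")`, the incremental $3 per month mentioned above. A hypothetical fleet of 50 handsets would cost `corporate_monthly_cost(50)`, or $500.00 per month.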

CSO Online – March 11, 2010

2010 March 10
by Derek Top

Excerpt:
Biometrics offers several advantages over identification cards and passwords or PINs, namely the requirement that the person being identified is physically present and the elimination of the need to remember codes or tokens. Dan Miller, senior analyst and founder of Opus Research in San Francisco, distills the benefits of biometrics: Other systems rely on something you know or have, whereas biometrics works off something you are.

From the article, “Biometrics: What, Where and Why”, by Mary Brandel, CSO, March 10, 2010

Japan’s Largest Wireless Carrier Provides OpenID Authentication to Half the Adult Population

2010 March 9

According to this story on the OpenID Web site, NTT docomo, Japan’s largest wireless carrier, is using OpenID to enable its 55+ million subscribers to avail themselves of “one-click” purchases or “single sign-on” access to information and resources. OpenID is a standard for user authentication which is regarded as “open” because there is no centralized issuing authority. Instead, there are many OpenID providers that issue unique URLs that replace multiple “username/password” combinations with a single sign-on.

For NTT docomo, OpenID solved a very specific problem. All of its wireless subscribers were automatically issued an “imodeID” which enabled them to gain access to the communications, entertainment and information services that docomo provides to its wireless subscribers. But i-mode only works on docomo’s wireless handsets, not desktop PCs. For authentication on fixed-line devices, the company issued a separate “docomoID”. Use of OpenID enables subscribers to sign on to multiple services across multiple devices.

The list of large network operators and service providers deploying a flavor of OpenID authentication is impressive. It includes AOL, Google, Yahoo!, Microsoft, MySpace, Orange and PayPal, among others. The addition of NTT docomo introduces OpenID into a country where wireless commerce has been highly successful thanks, in large part, to the simplicity of access and the availability of multiple services. Earlier this year, 22 companies including NTT docomo, KDDI, Sony and NEC formed an “ID Platform Federation Forum” to test different ways to simplify user access across multiple carriers and services “based largely on OpenID.” The formal launch of OpenID-based authentication by NTT docomo moves the technology beyond the experimental stage.

Captions on YouTube? Just Another Speechable Moment

2010 March 5

Yesterday, as noted in this blog post, YouTube (a Google property) formally launched a service that automatically transcribes the audio tracks of videos and displays the transcripts as captions for those who choose the option from the “Closed Caption” menu. The service was actually introduced in November 2009 and, as demonstrated in the video below, it uses the same transcription and translation resources that are embedded in Google Voice.

As the video’s narrator admits, sometimes the transcriptions are not so accurate but, in certain cases, “they are still better than nothing.” That, in a nutshell, captures the notion of “satisficing” which I discussed in this blog post. At this point in the technology’s development, it’s important to note when “good enough” is good enough.

Yet that hasn’t stopped a significant number of industry luminaries from declaring the service a “#failure”. For instance, the video embedded in this article by Janko Roettgers at GigaOm’s jkOnTheRun showcases what he calls “auto-captioning gone wrong”.

You can detect the pattern here. Google makes public a feature that has been percolating within the confines of its cloud for a number of years. It shows up as “beta” or a product of its “labs” or simply as a button that can be invoked in one of its highly-trafficked properties – like Gmail or Google Apps. Early reviews are a mixture of delight, shock, awe and ridicule. All feedback is encouraged and ultimately employed to refine and adapt the service for general consumption… or relegate it back to cloud-based oblivion.

I see auto-captioning, as well as translation and timing, as yet another “speechable moment,” meaning that it is an instance where the resources employed for a new set of core services, like speech recognition for the purpose of transcription or translation, are deployed as part of a broader set of services. I coined the term while discussing enhancements to Vlingo’s iPhone app in this post on Internet2Go.net.

Even though I don’t subscribe to the belief that “all publicity is good publicity”, I do believe that exposing the public to both the good and bad instances of transcription and translation is an important part of setting realistic expectations for the technology. That provides prospective users with the power to decide how they want to use (or “game”) the service and determine whether it is “good enough” for them.

Thoughts on Orange’s Curious Choice of MeeGo

2010 March 4

One of the surprise announcements from the 2010 Mobile World Congress came from a strategic alliance between Orange and Intel. The French telecom giant announced its intent to promote development and delivery of services that are optimized for devices that have Intel’s Atom microprocessors inside and leverage the Linux-based MeeGo software platform. As Caroline Gabriel explains in this article in “Rethink Wireless”, Orange’s initiative with Intel aims to avoid a role as a big dumb pipe by promising a consistent user experience that spans desktops, laptops, handsets, TVs and (one would assume) as many combinations and permutations of user experience (UX) as the technology can enable.

I can only ask whether this trip is necessary and whether it will necessarily be effective. Atom-equipped devices running Linux-derivative operating systems are, indeed, proliferating. Until the Orange announcement, MeeGo’s future was uncertain. The platform itself is the product of merged development efforts combining the Linux-based Maemo platform – an “open source” effort underwritten by Nokia – and Moblin (Mobile Linux) efforts initiated by Intel.

Application developers, integrators and software vendors are the force multipliers destined to make Recombinant Communications (RC) successful. When it comes to smartphones, for instance, application developers have voted with their fast-moving fingers. And the results are pretty clear. In spite of Apple’s iron-fisted control of the release process, the iPhone App Store offers more than 100,000 apps. That compares to Google’s 20,000 titles. Then it’s a long-distance call to the next tier of retail outlets, where RIM is approaching a five-figure total and Palm’s WebOS has just hit four figures. The proliferation of platforms will ultimately lead to “developer fatigue”. In the spirit of “Beta versus VHS” or “HD DVD versus Blu-ray” it may turn out that even two is too many.

Even though success is by no means assured, choosing the MeeGo platform with Intel as a partner is a gamble that’s worth taking early. Orange is right to focus on the quality of user experience across multiple “screens” and, in case nobody has noticed, the iPhone OS, Android and MeeGo are all Unix-like systems (Android and MeeGo are built on Linux, while the iPhone OS is built on Darwin). I’m not a coder, but I see a common denominator here. What the developer community looks for is fair-handedness in terms of support and revenue models. What users look for is consistency across multiple platforms. A service provider of Orange’s size and footprint has an opportunity to offer both.

Voice Biometrics Conference 2010: Early-Bird Rate Ends Friday

2010 March 3

Anticipation is building for the upcoming Voice Biometrics Conference (May 4-5, 2010). Register Now to take advantage of the early-bird rate of $599. (Save $200, ends March 5th!)

The excitement for the event is spurred by an often-heard phrase regarding voice biometric deployments: “It’s on our roadmap for 2010.” Both for “internal” and customer-facing applications, executives are investigating and implementing voice biometric-based solutions not just for fraud-loss reduction, but also to improve the customer experience and raise confidence that a company is taking every measure to protect the public from identity theft.

With voice biometrics quickly morphing from competitive differentiator to competitive necessity, you cannot afford to miss Voice Biometrics Conference 2010, the venue where corporate decision-makers join technology providers and integrators to hash out the realities of today’s voice biometrics solutions both in the lab and in the real world. Every major technology provider in the space will be present.

Hear panel discussions about opportunities in customer care, mobile payments, data security, and multifactor authentication, and see presentations from the banking and healthcare sector on launching customer-facing deployments. Additionally, hear how a North American law enforcement agency has deployed one of the largest known speaker identification projects to fight crime.

Sign up now to attend this global gathering (May 4-5, Hyatt Regency Jersey City) and take advantage of the early-bird rate ending this week.

Recombinant Communications Brings New Life to Text-to-Speech

2010 March 2


Featured Research
The advent of Recombinant Communications has the potential to breathe new life into some well-established voice processing technologies – including text-to-speech (TTS) rendering. New applications “read” Tweets, email and text messages easily. New platforms allow tuning of output to support specific voices or brands.

Advisories are available to registered users only.

For more information on becoming an Opus Research client, please contact Pete Headrick (pheadrick@opusresearch.net).