Dual Challenges Face GenAI for Self-Service: Unleashing Potential, Navigating Pitfalls

Good News and Bad News

Chatbots powered by Generative AI (GenAI) offer very promising possibilities for improving self-service. Many customer experience platforms already incorporate GenAI for non-customer facing tasks, such as summarizing calls, surfacing knowledge articles, suggesting possible responses for human agents to use, and gauging caller sentiment. That’s the good news.

Given their stunning ability to understand natural language, coupled with their ability to generate responses based solely on their pretraining, some companies test the abilities GenAI by simply plugging their self-service chatbot into a large language model (LLM) and opening it up to customers. That’s becoming the bad news as this plug-and-pray approach rarely works as planned. Recent cases of spectacular chatbot “fails” involving a Chevy dealership and a delivery service demonstrate that brands are well-advised to tread cautiously when using LLMs for customer-facing self-service tasks.

Risks of Going Rogue

Before the GenAI revolution, brands didn’t need to worry about chatbots “going rogue,” which is another way of saying “going off script.” In the quaint pre-LLM days, “the script” was generally all there was. Conversation designers skilled at the nuances of language hand-crafted dialogues intended to enable expeditious responses to customer queries while maintaining a consistent brand voice.

With GenAI, we’re charting new territories. Yes, it demands a fresh skill set, but these are skills that can be mastered with time, training, and practice. It’s vital, however, not to let the occasional headline-making misstep overshadow the vast potential GenAI holds for self-service innovation. As we’ve highlighted before, the rewards of embracing this technology far outweigh the initial trepidation.

The DPD Bot Mishap

Recently, DPD, a prominent parcel delivery service, encountered a public relations challenge when their LLM-powered chatbot malfunctioned. This bot, an upgrade mistakenly released, failed to meet customer expectations and crossed the boundaries of appropriate behavior.

The incident came to light through a social media post by a customer trying to track a delivery. Lacking access to DPD’s backend system, the bot was unable to assist. Furthermore, the customer, familiar with LLMs, tested the bot’s limits, leading it to admit its uselessness, use profanity upon request, and even compose a poem derogatory to the brand.

While this bot’s performance was far from ideal, it’s important to recognize it as an unready prototype rather than a reflection of GenAI’s potential in self-service. This instance should not deter companies from exploring the benefits of LLMs in enhancing customer service experiences.

A Strategic Approach to Evaluating GenAI for Self-Service

While this article does not serve as an exhaustive manual for constructing a GenAI-powered self-service chatbot, it aims to demystify the process by outlining some fundamental and practical steps.

Identify the Chatbot’s Core Objective

The foundational step, albeit an obvious one, involves pinpointing the chatbot’s primary purpose. This crucial decision shapes the trajectory of the development process, allowing for a clearer understanding of necessary resources and the level of expertise required. For those embarking on this journey, initiating the project with a simple FAQ chatbot can serve as an excellent introduction. Such a project may be well within the capabilities of your internal team, providing a practical hands-on learning experience.

Equip the LLM with Essential Knowledge

Tailoring the chatbot’s knowledge base to its designated role is pivotal. For instance, a chatbot dedicated to answering FAQs should have comprehensive access to all pertinent company-specific information. This ensures accurate and relevant responses to customer queries. A notable technique in this realm is Retrieval Augmented Generation (RAG). This method involves integrating corporate documents into a vector store, which then informs the chatbot’s responses dynamically. Further insights into RAG methodologies can be unearthed through dedicated research.

Implement Self-Regulatory Mechanisms

To mitigate risks of inappropriate behavior, you can employ the GenAI technology itself, though this may seem paradoxical. Introduce an automated review process that activates post-response generation but prior to user delivery. This process leverages GenAI to scrutinize the previously generated response for any unsuitable content, with the capability to either excise specific words or regenerate the entire response if needed. While this “self-policing” process incurs more expense, requiring multiple calls (and associated tokens) to the LLM for one customer interaction, it offers a set of guardrails that your team controls.

Conduct Rigorous Failure Testing

Challenge your development team to rigorously test the chatbot, intentionally provoking it to deviate from its intended behavior. A robust self-service chatbot must consistently maintain focus on its designated tasks, resisting diversions such as off-topic interactions, humor, or creative endeavors like poetry. Should any undesired behaviors surface, retrace your steps to the preceding measures for refinement.

Monitor and Adapt to User Interactions

Maintain a comprehensive log of all interactions between customers and the chatbot. This ongoing evaluation is instrumental in refining response examination algorithms and gaining insights into the nature of prompts that may lead to unsuitable responses. Such continuous analysis is key to evolving and enhancing the chatbot’s performance and reliability.

These steps provide a structured framework for developing a sophisticated, GenAI-powered self-service chatbot, ensuring it is a reliable, efficient, and safe tool in your customer service arsenal.

Learning from the Past, Building for the Future

The journey from script-based chatbots to the dynamic and somewhat unpredictable world of GenAI requires a shift in approach and mindset. The old days of tightly controlled scripts are behind us, and with them, a certain level of predictability and safety. But the promise of GenAI, with its vast potential for innovation in self-service, is too significant to ignore. It’s crucial not to let fear of potential missteps stifle progress. Instead, we must learn from these experiences, adapting and enhancing our strategies to fully leverage the capabilities of GenAI.

Do you know examples of successful GenAI, chatbot, voicebot, or Conversational AI experiences? Be sure to get your nominations in for Opus Research’s 2024 Conversational AI Awards!



Categories: Conversational Intelligence, Intelligent Assistants, Articles