6. What’s Keeping You Up at Night?
Agent Attrition
IT Issues
“Do more with less”
Overworked Staff
High Call Volumes
Increasing Customer Expectations
Decreasing Performance
Low Customer Satisfaction
Repetitive Calls
Too many tools
Increasing Operational Costs
#3:
#3:
We deliver AI-powered virtual agents as a service. That means we deliver the full conversational AI technology stack. It’s turnkey. It’s omnichannel. All of our clients use our voice self-service module. Most clients coming to us today rely on us for more than voice: they also use us for their digital channels over chat and text. We use an open platform that incorporates best-of-breed AI and machine learning tools from both Google and Microsoft to augment our proprietary tools and stitching, so we really do believe we are delivering the best experience in the marketplace.
But what makes us a little different is that we’re not just trying to sell software licenses or seats, throw them over the fence, and wish you good luck on your journey. Conversations with machines are complex. They need experts. So we bundle end-to-end CX services with our technology. And when I say end-to-end, that means everything: the design, the build, and even the ongoing operation after go-live, because it requires care and feeding, where a team needs to be dedicated to training AI models, examining data, and optimizing the experience. So at the end of the day, we’re really stepping in more as a partner instead of just a technology provider. That makes us responsible for delivering the CX that was promised and the ROI that was promised.
We’d like to think that approach is working for us. We operate the AI-powered CX for more than 100 brands.
[only say this if not following with the Gartner Peer Insights slide]
We are currently the top-rated conversational solution on Gartner Peer Insights. So if you’re interested in what others have to say about us, those reviews are a good place to start.
#5:Not to start this presentation on a negative note -- but the truth is, most AI projects fail. In 2018, Gartner predicted that 85% of all AI projects would fail. Fast forward a few years to today – and those estimates remain. Why?
https://www.gartner.com/en/newsroom/press-releases/2018-02-13-gartner-says-nearly-half-of-cios-are-planning-to-deploy-artificial-intelligence
#6:Well, it’s not usually for one reason, but rather a combination of reasons. In another discouraging statistic from Gartner, more than half of all AI projects don’t reach production, and a lot of it has to do with unclear business objectives and unrealistic expectations. One common assumption is that it’s easy to build a bot, and there is some truth to that for simple, rules-based chatbots…but when you’re talking about AI-powered virtual agents over voice, it’s a completely different ballgame.
So not having the right AI skillset or expertise plays a huge part. And the same goes for conversation design, which is both a science and an art. It requires a solid understanding of how humans communicate and the skills to replicate that in a human-to-AI conversation. We’ll dive into the details of what’s involved in conversation design later, but the takeaway is that this role can make or break the user experience, so I can’t stress enough how critical it is.
And rounding out our list of failure points are the AI toolset and training. If you’re working with a limited toolset that doesn’t have the functionality or the integrations needed for business applications, then you’re also limited in the scope of conversations you can automate.
There’s also a common assumption that AI just runs itself, and that can’t be further from the truth.
--
Sources:
https://www.alphachat.ai/blog/4-reasons-conversational-ai-projects-fail
https://research.aimultiple.com/ai-fail/
https://voicebot.ai/2021/02/27/the-importance-of-conversational-analytics-3-metrics-to-consider/
https://venturebeat.com/2020/12/16/the-future-of-ai-deployments-reaching-production-is-bright-in-2021/
#8:Is automation right for you? Specifically, is conversational automation a fit for your business?
So then, where do you start? Start by asking:
Which conversations happening today fall under those criteria?
On which channels do we want to implement self-service?
Then for each channel, ask:
How does a live agent handle these today? (Talking point: Analyze calls)
How can we convert human-to-human conversations to human-to-AI convos without sacrificing the current experience? (Talking point: Narrow scope to be able to handle with super good accuracy)
Handling rules - determine which conversations need to be handled by a live agent entirely, which can be handled as a team (live agent + virtual agent), and which can be fully automated through self-service
What’s the end goal for every conversation you wish to automate?
Is the data available to ensure the best experience possible?
Data is king!!
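The handling rules above can be written down as a simple lookup before any bot is built. This is a minimal sketch, assuming a three-way split and some hypothetical conversation types (the names below are illustrative, not taken from any real deployment):

```python
from enum import Enum


class Handling(Enum):
    LIVE_AGENT = "live agent only"
    TEAM = "live agent + virtual agent"
    SELF_SERVICE = "fully automated"


# Hypothetical conversation types; replace these with the ones you find
# when you analyze how live agents handle calls today.
HANDLING_RULES = {
    "schedule_appointment": Handling.SELF_SERVICE,
    "billing_dispute": Handling.TEAM,
    "complaint_escalation": Handling.LIVE_AGENT,
}


def route(conversation_type: str) -> Handling:
    # Anything not yet classified defaults to a live agent, so the current
    # experience is never sacrificed for a flow nobody has designed.
    return HANDLING_RULES.get(conversation_type, Handling.LIVE_AGENT)
```

Defaulting unknown conversation types to a live agent is one way to keep the scope narrow while accuracy is still being proven.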
#9:Aligning internal stakeholders – easier said than done.
So, you want to go the virtual agent route, but you need to sell this internally, and sometimes that can be a challenge because nearly everyone has had a bad experience with “automation.”
You have to get internal buy-in, and for that you have to know what is most important to your executive team.
What I hear most is –
What is the customer experience?
And how much money is it going to cost me and how much money is it going to save me?
When you think about what is most valuable to an organization, it’s their customers. We understand that being trusted as the first touch point to your customer is a huge responsibility, and we don’t take that lightly. So, when you’re having a conversation with us, the first thing we’re going to do for you is create a custom demo experience. We want you to experience what your customer is going to experience.
I’m going to show you an example of that in a second.
Next, let’s talk numbers: how much is it going to cost me, and how much is it going to save me? So we’ll put something in front of you based on your volumes…
#10:A robust bot is one that continuously learns from interactions and improves the accuracy and speed of its responses. AI continuously evolves, but it's not magic. Gather actionable insights from the performance of your conversational AI and improve your bot. But there’s an entire team of people behind it:
Who is the application power user (business analyst)?
Designer
Bot builder
Developer
Monitoring & reporting (data scientist)
QA
Onscreen you can see the CX services we provide here at SmartAction. Every face represents a whole team of people focused on that discipline. Anytime we have a new client or are building a new app, that baton gets passed from one team to the next until the application goes live.
And even after the application goes live, we have a team dedicated to your account who live and breathe the process of perpetual improvement with you. Any app we build is never perfect when it goes “live.” Callers invariably interact with the system in ways we didn’t expect, so you need a team dedicated to ongoing care and feeding to optimize and tune the application over time. We’ve developed proven methodologies that work very well for us. The day 2 and beyond experience is where a lot of organizations struggle because they can’t even pinpoint where they need to focus optimization efforts.
And this is why we say that as much as we’re delivering the technology, we’re really stepping in as more of a partner, because we’re on this journey with you. If this were something you were attempting to do on your own, then after paying for the software platform and usage and paying for the headcount to run it, you would have priced yourself out of what you can get from SmartAction.
#11:
We talked about the importance of the conversation designer in our last webinar, but I think this role is so important that it’s worth another discussion. The shoes of a conversation designer are incredibly difficult to fill, and that’s because this person needs to have expertise in three key (but very different) areas: technology, psychology, and language.
On the technology part: turning a human-to-human conversation into a human-to-AI conversation is not a straight conversion.
They need to have a solid understanding of how NLU works and be able to design the flow (and order of questions) in a way that’s optimal for the AI to process. They also need to know how to narrow the aperture of what the bot is listening for (to drive accuracy). Lastly, they need to be able to work with the data they have access to, even if that only means communicating to IT which APIs are needed.
On language and psychology: the guiding principle for designing conversations with machines is to lead with empathy. Show that the bot understands the situation (and emotional undertones) before asking the customer to share information. You need to get the customer’s buy-in, because if they don’t trust the bot, they’re going to hang up before the bot even gets a chance to prove that it can resolve their issue.
#12:
(Here’s what we can share with you from experience)
Steep learning curve (6-month ramp-up). You need a bot builder with expertise on the specific platform you’re using (you can be an expert on one platform but not on another)
We see failure after failure at companies that pay a vendor to build a bot: the vendor throws them the keys after go-live, and now someone on their staff has to learn how to use and operate it. (This is the most common sunk cost that we see.)
Other points of failure:
No context as to what was built – no transition of knowledge
#13:Brian
Assuming your bot builder and NLU specialist can be one person is not a good idea; NLU is a specialized field of its own.
Must have NLU expertise and understand how NLU works with supervised machine learning
Fail to understand confidence scoring and how it relates to the aperture of intents the AI is listening for
Example - a healthcare org built a front door app on their own that worked well in QA. After they went live, they started adding more and more intents without doing enough training for each intent (and didn't understand how to handle some things with rules), and the whole thing fell apart, so they went back to touchtone
Fail to understand how, as they add intents to the knowledge model, patterns might begin to bleed, confidence scores must be adjusted, and more mistakes may occur, which puts more training pressure on the rules-based algorithmic NLU
No knowledge of how to use development resources for complex cases like alphanumeric
Fail to understand the speed at which post-processing is executed and how that relates to confidence scoring and the amount of data trained (the result of not understanding is that replies from the bot are very slow and frustrating)
Don't know the relationship between ML-based NLU and rules-based algorithmic NLU, or when and how to use each
Fail to understand the difference between building and training models based on phrases versus hot words, and why and when you need each
Fail to account for speech-to-text errors
Fail to account for the amount of training via supervised machine learning during hypercare
Example - a company assigned one person on the team to train the AI model, but since they weren't an NLU expert, they spent their entire day attempting to train and nothing improved
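To make the link between confidence scoring and the aperture of intents concrete, here is a minimal sketch. It assumes the NLU engine returns a score per intent; the `threshold` and `margin` values and the intent names are hypothetical and would need tuning against real call data:

```python
def pick_intent(scores, allowed, threshold=0.7, margin=0.15):
    """Return the winning intent, or None to signal a re-prompt or transfer."""
    # Narrow the aperture: ignore intents this prompt is not listening for.
    candidates = sorted(
        ((score, intent) for intent, score in scores.items() if intent in allowed),
        reverse=True,
    )
    if not candidates:
        return None
    best_score, best_intent = candidates[0]
    runner_up = candidates[1][0] if len(candidates) > 1 else 0.0
    # Reject low confidence, and reject near-ties where patterns "bleed"
    # between intents that were trained on similar phrases.
    if best_score < threshold or best_score - runner_up < margin:
        return None
    return best_intent
```

The more intents a prompt listens for, the more often two of them score close together, which is one reason adding intents without retraining can make a working application fall apart.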
#14:Phill
Not a relay race --
Project Manager – acts as the main point of contact between departments, client (herder of cats)
Developer/Engineer -
Data Analyst - (A data analyst needs to process and interpret data. A data scientist needs to be able to build and develop tools that process information.) Is involved in the design
#15:This is an example of an ROI calculation we did for one of our healthcare clients. They were getting swamped with “scheduling” call requests. Their call wait times were up to an hour, and they were experiencing “no shows” every day because people did not want to wait an hour to cancel an appointment. (As you know, “no shows” cost you money too!)
Walk through example
Assume we only take half of these calls (because you have automation resistance, biz rules, etc.): we’re still at nearly $1M ROI. Multiply that by 3 for the term of our deal, and that will get any CFO really excited.
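The arithmetic behind a calculation like this can be sketched in a few lines. Everything below is an assumption for illustration: the formula treats ROI as net savings per automated call, and the inputs are placeholders, not the client's actual figures:

```python
def annual_savings(calls_per_year, cost_per_live_call, cost_per_automated_call, automation_share):
    """Net yearly savings when a share of calls moves to the virtual agent."""
    automated_calls = calls_per_year * automation_share
    return automated_calls * (cost_per_live_call - cost_per_automated_call)


# Placeholder inputs: 100,000 calls a year, $5.00 per live call,
# $1.00 per automated call, and only half of the calls automated.
yearly = annual_savings(100_000, 5.0, 1.0, 0.5)
three_year_term = yearly * 3
```

Plugging in your own volumes and costs, and then halving the automation share, shows how sensitive the result is to automation resistance and business rules.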
#16:Data analytics is not just a nice-to-have addition to your current data collection methods, it is crucial to improving the customer experience of your bots. This data is what gives you direct feedback about the way customers interact with your bot.
First, you need a reporting methodology, and you need to coach your bot builder to program all possible outcomes and breadcrumbs into every end state, prompt by prompt. (The toolset must allow for this flexibility: if you're relying only on out-of-the-box reporting or rigid reporting tools, this will fail for all but the simplest interactions.)
Example - AAA club that tried to build ERS using Amazon tools - took a year with 5 developers and still can't get better than 50% accuracy on year/make/model
When you’re dealing with so much data, you have to know where to get actionable insights, and that means following your conversations prompt by prompt with outcomes and endstates for each turn in a conversation flow. If there is a problem, you have to be able to identify WHERE the problem is happening so you can put a magnifying glass on it and know where you need to make adjustments or train the models.
Outcome
So an outcome is just how the call ended up. Did the call finish successfully? Did they hang up? Did they request an operator? Were they transferred to a live agent due to a planned handling rule or because of confusion?
Endstate
The end state tells you exactly which prompt the caller was at when the call ended. Did they successfully book a reservation? Or did they hang up after giving their destination? By combining outcomes and endstates you can know exactly where all your callers ended up and why.
Breadcrumbs
And if you want to find out what happened prior to the end state, you use breadcrumbs. Breadcrumbs give you the prompt-by-prompt trail of any caller or any group of callers, so you have the granular breakout of exactly what happened on their journey and how many took that journey.
This is where we really unlock some amazing potential in terms of tuning an application and understanding whether there is friction and where it is. Did we ask a question that a caller wasn’t prepared for? Are callers responding to a question in ways we didn’t expect?
So this is a huge part of the value of the reporting process that needs to be tailored to your exact business needs
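Outcomes, end states, and breadcrumbs map naturally onto a small data structure. The sketch below is illustrative only: the `Call` record, the prompt names, and the outcome labels are hypothetical, not the schema of any particular reporting tool:

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Call:
    breadcrumbs: list  # prompts visited, in order
    outcome: str       # how the call ended, e.g. "success" or "hangup"

    @property
    def end_state(self):
        # The end state is the last prompt the caller reached.
        return self.breadcrumbs[-1]


def outcome_by_end_state(calls):
    """Count (outcome, end state) pairs: where callers ended up and why."""
    return Counter((c.outcome, c.end_state) for c in calls)


def journeys(calls):
    """Count how many callers took each prompt-by-prompt path."""
    return Counter(tuple(c.breadcrumbs) for c in calls)


# Hypothetical sample data for a reservation flow.
calls = [
    Call(["greeting", "destination", "dates", "confirm_booking"], "success"),
    Call(["greeting", "destination"], "hangup"),
    Call(["greeting", "destination"], "hangup"),
]
```

With this sample, `outcome_by_end_state(calls)` shows two hang-ups at the destination prompt, which is exactly the kind of spot you would put under the magnifying glass.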
#17:What you need depends on which quadrant you put yourself in
Four Category Types:
Full DIY (starting from bottom left quadrant)
Build and Release
Virtual agents as a service
Embedded Tool in a CCaaS
#18:
We talked about the failures. Let’s talk about how you can set your conversational AI project up for success.
#1 - Make sure business objectives, priorities, and problem/use cases are clearly defined before investing. This is probably your biggest hurdle: getting all internal stakeholders on board with the value and with the problem/use case to tackle first.
#2 - Recruit/hire a team with the right skillset to build, train, and optimize AI models for the long term. Important to note - AI has a talent gap problem, which means companies often have to scramble to recruit team members with the right skillsets for building effective AI. Most organizations are currently ill-designed to support scalable AI ventures, requiring re-orgs, new hiring efforts, and leveraging of third-party resources.
#3 - Whether you build or buy: make sure the bot fully integrates with the contact center platform, telephony systems, CRMs, homegrown solutions, etc.
#4 Have a reporting methodology in place at time of build.
#5 Make sure to allocate resources toward continuous training for the lifecycle of the bot. Of the projects that are deployed successfully, many face challenges with model drift — or changing external conditions — that lower the model’s accuracy or even make it obsolete. Models must consistently be retrained with new, relevant data to overcome this hurdle.
#23:Helena
Once you go live, that’s where critical data starts pouring in: how are your customers engaging with the bot? Are there certain points in the conversation flow where the bot is getting stumped? There’s no way to anticipate all the possible ways that your customers will ask or respond to questions, so that’s where the success manager comes in. They are the one who drives the hypercare process and beyond.
The success manager (or technical account manager) is just that – they ensure the success of your application
Must understand how to drive the hypercare process after go-live to map the technical work to business outcomes.
Drives optimization initiatives with the technical team that generally fall into 2 big buckets
supervised machine learning to train both the ML-based and rules-based AI models
adding new capabilities to the bot to account for any reason why callers get transferred to an agent. While that sounds easy, adding new capabilities requires participation from designers, bot builders, NLU experts, QA, and sometimes (1) developers, depending on the ask, or (2) a data initiative to clean data or have engineers make new data fields available
(Data analyst can fulfill this role)
Stack ranking priorities on what needs to be done. (getting the right insights, the right ROI, roadmap)
#24:
The first question that people tend to ask when something’s not working correctly is – is it the technology?
Do you have an AI toolset that allows you to improve over time? How do you know if you’re working with the right toolset as the foundation?
On ASR and NLU:
Bot building platform that uses good ASR + NLU but is too rigid, without enough flexibility for anything beyond something simple (Dialogflow)
Bot building platform that uses good ASR + NLU and flexibility but requires too much effort from engineers to build and change
Relying on poor or outdated ASR or NLU engine that doesn't offer supervised machine learning for continual improvement. (we've seen both Twilio and Genesys abandon their own IP in this area)
ASR or NLU sold as perpetual licenses and built for on-premises software
Either ML-based NLU or algorithmic NLU instead of both. It's an absolute must to have dual NLU approaches because while ML-based NLU is the workhorse for most things, there's lots it misses that you have to account for with a rules-based algorithmic approach
Stuck with just one ASR or NLU instead of being able to choose the best for the use case at hand
Non-Technical or Limited Toolset
Using a toolset designed for simplicity of use by a non-technical resource (e.g. Nuance licenses that come with an IVR so they can create conversation flows using directed dialog). It's this experience that we all associate with the bad IVR experience
Rigid reporting that won't allow you to find the needles in the haystack that point you to where the bot or AI needs to be optimized
Closed stack using exclusively their own speech IP without ability to integrate best-in-class tools
Lack of omnichannel and ability for channel switching while maintaining context
On Voice:
Chatbot startups that don't have voice in their DNA trying to add voice capabilities to their platforms
Robotic TTS engine
Slow conversation with no ability to tweak time delay before post processing kicks off for any given question (some need to be fast while others need to be slow)
MUST DO VETTING PROCESS -> build a demo of exactly what you need minus integrations to see if you get the CX and flexibility you need
#25:Helena
When you’re dealing with so much data, you have to be able to know where to get actionable insights and that means following your conversation prompt by prompt with outcomes and endstates for each turn in a conversation flow [speak to that more in a min] – because if there is a problem, you have to be able to identify WHERE the problem is happening so you can put a magnifying glass on it and know where you need to make adjustments or train the models.
And so that’s what breadcrumbs are… Breadcrumbs give you the prompt-by-prompt trail of any caller or any group of callers, so you have the granular breakout of exactly what happened on their journey and how many took that journey.
This is where we really unlock some amazing potential in terms of tuning an application and understanding whether there is friction and where it is. Did we ask a question that a caller wasn’t prepared for? Are callers responding to a question in ways we didn’t expect?
So this is a huge part of the value of the reporting process that needs to be tailored to your exact business needs
#26:Helena
So here is an example conversation flow of a customer trying to schedule a service appointment for her car. I’m drawing attention to the outcome, which is just how the call ended up. Did the call finish successfully?
The end state tells you exactly which prompt the caller was at when the call ended. Did they successfully book a reservation? Or did they hang up after giving their destination? By combining outcomes and endstates you can know exactly where all your callers ended up and why.
#27:Helena
When you’re dealing with so much data, you have to know where to get actionable insights, and that means following your conversations prompt by prompt with outcomes and endstates for each turn in a conversation flow. If there is a problem, you have to be able to identify WHERE the problem is happening so you can put a magnifying glass on it and know where you need to make adjustments or train the models.
So here is an example conversation flow of a customer trying to schedule a service appointment for her car. I’m drawing attention to the outcome, which is just how the call ended up. Did the call finish successfully? Did they hang up? Did they request an operator? Were they transferred to a live agent due to a planned handling rule or because of confusion?
If a transfer happens, you need to know why (some tools will have out-of-the-box reporting on what happened, rather than WHY it happened).
#28:Delete this slide?
Containment is not the best way to measure your agent
Virtual agent resistance drives down that containment rate
Metrics that matter:
Confusion transfer – when something goes wrong. Was it really the fault of the bot, or was the AI not trained to handle it?
Request an operator
How much time is spent with your live agent?
#29:Brian
When a customer calls in, we use open-ended intent capture when we start the conversation by asking “How can I help you today?” This is a full natural language experience where the caller can reply to the AI in their own words and essentially say anything.
What we’re listening for is certain hot words or phrases to identify the intent they are calling about, so we can put them in the right intent flow. So if the customer is asking about proof of insurance, we can take them to that part of the flow to help them get a copy of their insurance. If the virtual agent hears “claim” but isn’t quite sure what the customer wants to do, then we can put them at the top of the claims-intent flow and ask them to describe what they’d like to do and hand-hold them from there (in this case, does the customer want to file a claim? Or just get information on the status of a claim?)
(you’re not going to have all the data yet)
If they call in about points and say something related to banking or borrowing points, we’ll take them straight to that part of the flow to bank or borrow. If we hear points but we’re not entirely sure yet what they want to do with their points, we’ll put them at the top of the points intent flow and ask them to describe in a few words what they would like to do and hand-hold them from there. The same applies for reservations and member services. If they tell us they want to book a new room AND give us the destination, we can skip asking about reservation type or destination and go straight to check-in and check-out dates to confirm availability.
#30:Helena
So Brian was just discussing what’s happening on the customer experience side of things…and now I’ll take you through the process of what’s happening on the backend as well as some key terms to know.
In this example, the customer is asking to change their appointment, and that’s the intent, or the goal that the customer is trying to achieve. [Another way to think of the intent is as “the intention” or what is the customer’s intention?]
In this utterance, we also have an entity, or in this case, we have two entities: Friday (the day) and 4pm (the time). An entity acts as the modifier to the intent. It’s essential to capture both the intent and the entities correctly in order to deliver what the customer wants. It’s not enough to know that the customer wants to change their appointment; we also need to know what day and time to change it to.
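A small rules-based sketch shows what capturing an intent plus its entities looks like in practice. The patterns, the `change_appointment` label, and the sample utterance are assumptions for illustration; a production NLU would combine trained models with rules rather than rely on regular expressions alone:

```python
import re

DAYS = r"(monday|tuesday|wednesday|thursday|friday|saturday|sunday)"
TIME = r"(\d{1,2}(?::\d{2})?\s?(?:am|pm))"


def parse(utterance):
    """Extract one intent and the day/time entities from an utterance."""
    text = utterance.lower()
    # Intent: a change-style verb followed somewhere by "appointment".
    is_change = re.search(r"\b(change|reschedule|move)\b.*\bappointment\b", text)
    day = re.search(rf"\b{DAYS}\b", text)
    time = re.search(rf"\b{TIME}", text)
    return {
        "intent": "change_appointment" if is_change else None,
        "entities": {
            "day": day.group(1) if day else None,
            "time": time.group(1) if time else None,
        },
    }


result = parse("I need to change my appointment to Friday at 4pm")
```

Here `result` carries the intent together with both entities. An utterance that names a day but never uses one of the expected verbs would come back with no intent at all, which is why rules like these need training data from real callers behind them.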
Which brings us to the question -- what happens when the virtual agent isn’t able to understand the customer?
#31:Helena
…and it really is a matter of when, not if.
So when you launch your conversational AI solution, it’s a given that your virtual agent isn’t going to understand everything. And that’s because the training it’s received so far has been mostly from the QA team. It hasn’t had a chance to interact with your customers, so it’s not able to anticipate all the possible intents and entities.
And so in this example, “my mother in law is flying in from out of town on Tuesday, so I won’t be able to make it to my appointment…” this isn’t something we’ve trained the virtual agent to handle [even if you are using prebuilt models to speed up intent prediction and extract the context within the utterance].
So the key point here, is that it’s only after deployment that critical data starts pouring in – and you can see how your customers are engaging with the virtual agent. Are they dropping off at certain points in the conversation flow? Do we see any patterns where the virtual agent is getting stumped?
---
(if the IVA doesn’t get it the first time, directed dialog)
#32:
On day 1 we know we won’t be capturing every possible scenario that will come through.
Example - someone says “add item” and it comes through as unrecognized and becomes a confusion transfer. That's not something the AI model can be trained to handle because the application doesn't account for it. We could start accounting for “add item” as something we don't handle yet and designate it a “business rule transfer” to send to a human with context. That moves it out of the “confusion transfer” pile and into the “biz rule” pile. As you do this and find reasons for transfer that the IVA is not prepared to handle, you can go through the biz rule outcomes to identify the highest-volume reasons and see if you should change the application and/or add a data source or capability to handle them. If you see that the top biz rule transfer is “add item,” then pick that off as the first reason to handle.
Do we want to add this to the business rule transfer bucket? Doing so pulls it out of the confusion transfer bucket.
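Moving a known gap out of the confusion pile can be sketched as a simple reclassification step. This is a minimal illustration: the set of business rule phrases and the bucket labels are hypothetical, and a real system would match on intents rather than exact phrases:

```python
from collections import Counter

# Hypothetical list of requests the application knowingly does not handle yet.
BUSINESS_RULE_TRANSFERS = {"add item"}


def classify_transfer(unrecognized_phrase):
    """Label a transfer as a planned business rule or as genuine confusion."""
    phrase = unrecognized_phrase.strip().lower()
    if phrase in BUSINESS_RULE_TRANSFERS:
        return "business rule transfer"
    return "confusion transfer"


def top_business_rule_reasons(phrases, n=3):
    """Rank business rule transfers by volume to decide what to build next."""
    reasons = Counter(
        p.strip().lower()
        for p in phrases
        if classify_transfer(p) == "business rule transfer"
    )
    return reasons.most_common(n)
```

Whatever lands at the top of that ranking is the first candidate for a new capability, and whatever stays in the confusion bucket is where training effort should go.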
#33:Brian
Let’s move over to talking about actually training the AI models: decisions on training hot words or phrases, and the difference between recognized and unrecognized. The point is to show that machines don't learn on their own. A human must be in the loop to make intelligent decisions about the training. BUT the AI does a really good job of pointing humans to where they need to make a decision.
Explain “suggested”
Villa – if you see the phrase on “Villa” (4 from the bottom), the word “upgrade” wasn’t written into the NLU model by anyone but the AI had a high enough confidence score in the phrase that it sent the caller to the right prompt, “room type.” It was the right decision but since the AI wasn’t 100% sure, it was elevating this to a human for confirmation. By clicking and confirming that it made the right decision, that model is trained for that in the future, which improves confidence scores on anything else that comes in similar.
Occupants – (2nd from bottom) “I need to change number of occupants that are in my room.” The NLU sent them to the change reservation intent because it heard “change” and “room.” That was fine. It would get the job done. But it would be better if the model was trained to take them straight to the “change occupants” prompt. If that prompt was built and available in the drop-down, the model would be trained to take them straight there next time. That way, even if it hears “change” and “room” but also hears “occupants,” it knows it’s a change occupants request.
COVID-19 – you can see here that we’re training phrases associated with the Covid-19 prompt so the machine learning gets better and better at recognizing other similar Covid inquiries. We could simply hard-code Covid-19 as a hot word and send any phrase with it to that prompt. But since it’s possible someone could mention the pandemic in their response without specifically inquiring about the pandemic, we want to train the model for phrases instead of just the word itself.
Under “unrecognized” you can see what the AI had no idea what to do with. These resulted in “confusion transfers.” Talk about each of them
Exception
Extension
Pool – nothing was programmed into the model on how to handle
Fifteenth was a misspelling – missing an H
#34:
As I mentioned, you need to have a human in the loop to improve how the virtual agent engages your customers, but it can’t just be anyone. It has to be someone who understands your business and customer motivations:
data expert looking for opportunities where the application needs improvement
conversation designer for new flow additions or changes
a builder who is going to work those changes into the application
The point is also to show that you can't have just anyone doing the training. They have to understand the application to know how it should be trained (e.g. misspellings, intent for exception vs intent for extension, etc.). But to be sure, you would have to go to the dashboard to search the transcriptions to see their endstate and listen to the recording if needed. No biggie if you just see one “exception” unrecognized. But if you see a bunch in the table, you know you have a problem.