Contact Center Customer Experience: A Solution with AWS Conversational Generative AI
https://dxc.com/us/en/insights/customer-stories/amazon-connect-helps-dxc-transform-contact-centers-and-improve-customer-experience


In the evolving landscape of contact center management, AWS Conversational Generative AI presents a groundbreaking solution for agent evaluation to improve customer experience. By integrating advanced LLMs available on Amazon Bedrock, including Claude Sonnet, Llama 3, and Mistral, businesses can leverage cutting-edge prompt engineering to assess agent performance with unparalleled precision.

These models enhance the evaluation process by analyzing conversation nuances, providing actionable insights, and identifying training needs. This solution stands out by combining deep learning capabilities with real-time analytics, ensuring that contact center agents are evaluated comprehensively and constructively, ultimately driving improved customer interactions and operational efficiency.

As part of enhancing contact center performance, we take a deep dive into how to evaluate customer-facing interactions, a daunting task given the fluid and unpredictable nature of those interactions. We present our point of view on how such applications are now not only possible but also scalable.

Continuous evaluation of contact center operations is crucial for achieving cost-efficiency, enhancing agent performance, and improving customer experience.

Contact center operations

A typical contact center operation using Amazon Connect + Contact Lens as the backbone transcription utility looks like this.

[Architecture diagram]
https://d2908q01vomqb2.cloudfront.net/fc074d501302eb2b93e2554793fcaf50b3bf7291/2022/02/03/Architecture-diagram-1024x460.png

Amazon Connect provides an omnichannel UI for incoming calls and chats, call routing, and analytics. It also provides high-quality call transcriptions.

Transcriptions generated by Amazon Connect are stored in S3 buckets for further processing, along with the raw audio (MP3) of the call. Each transcript file (in JSON format) contains additional information such as sentiment analysis, dead air time, and the diarized conversation.
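For illustration, here is a minimal Python sketch of reading one such transcript from S3. The bucket name and key are placeholders, and the field names follow the Contact Lens analytics output schema; verify them against your own files.

import json
import boto3

# Bucket and key are placeholders; field names follow the Contact Lens
# analytics output schema and should be verified against your own files.
s3 = boto3.client("s3")
obj = s3.get_object(
    Bucket="my-connect-transcripts",
    Key="Analysis/Voice/2024/08/15/example-contact-id_analysis.json",
)
analysis = json.loads(obj["Body"].read())

# Diarized, sentiment-tagged turns
for turn in analysis.get("Transcript", []):
    print(f'{turn["ParticipantId"]} [{turn.get("Sentiment", "NEUTRAL")}]: {turn["Content"]}')

# Conversation-level metrics such as dead air (non-talk time)
dead_air_ms = (
    analysis.get("ConversationCharacteristics", {})
            .get("NonTalkTime", {})
            .get("TotalTimeMillis", 0)
)
print(f"Dead air: {dead_air_ms / 1000:.1f}s")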

Transcripts generated by the contact center are used for QA review by managers and quality analysts to identify patterns and mistakes made by agents (and to see if any agent coaching is required).

The challenges in contact center quality assurance

Contact center quality assurance is a manual and tedious process. The typical challenges are:

[Image: Challenges in contact center quality assurance]

Improving Agent Evaluation with AWS Generative AI

AWS Generative AI can help automate the agent evaluation process while cutting costs and bringing several ancillary benefits.

Below are the areas to consider when automating contact center post-call quality assurance.

[Image: Automate agent evaluation]

A detailed point of view on advanced RAG techniques can be found here.

Agent evaluation workflow

This workflow is a batch job that runs daily on the customer call transcript files generated by Amazon Connect.

[Image: Agent evaluation workflow]

Call transcription with Amazon Connect: Utilizing Amazon Connect for automatic transcription, we ensure secure and scalable processing of up to half a million calls daily. This setup integrates seamlessly with Amazon S3 for efficient data storage and subsequent analysis.

Parameter evaluation with LLMs: By harnessing the power of advanced LLMs such as Claude 3 Sonnet, Mistral, and Llama 3 on Amazon Bedrock, we can evaluate call transcriptions against key metrics like intent probing and action execution. Regular updates and periodic LLM fine-tuning are essential to maintain accuracy and minimize bias.
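As a concrete illustration, here is a minimal sketch of one such evaluation call via the Bedrock runtime API. The model ID is Claude 3 Sonnet's public Bedrock ID; the rubric text and transcript are illustrative placeholders, not the actual QA guidelines.

import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

transcript_text = (
    "AGENT: Thank you for calling Example Corp, how can I help you today?\n"
    "CUSTOMER: My refund never arrived."
)

# Illustrative rubric; in practice this comes from your QA guidelines.
prompt = (
    "You are a contact center QA analyst. Using the rubric below, score the agent "
    "from 1 to 5 on 'intent probing' and justify the score in two sentences.\n\n"
    "Rubric: the agent asks effective questions to uncover the customer's true intent.\n\n"
    f"Transcript:\n{transcript_text}"
)

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": prompt}],
    }),
)
print(json.loads(response["body"].read())["content"][0]["text"])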

Response validation with a RAG system: The RAG system validates agent responses against the internal domain knowledge base, ensuring information accuracy and consistency. Techniques like hybrid vector search improve retrieval efficiency by surfacing the context closest to the customer query, directly impacting customer satisfaction.
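A minimal sketch of the validation step, assuming Titan text embeddings on Bedrock and a knowledge-base answer already retrieved by the hybrid search (both answer strings below are hard-coded stand-ins):

import json
import math
import boto3

bedrock = boto3.client("bedrock-runtime")

def embed(text):
    # Titan text embeddings on Bedrock (v1 embedding model)
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# kb_answer would come from the hybrid vector search over the knowledge base;
# both strings here are hard-coded stand-ins.
kb_answer = "Refunds are issued to the original payment method within 5-7 business days."
agent_answer = "You'll get the refund back on your card within about a week."

score = cosine(embed(kb_answer), embed(agent_answer))
print(f"Agreement score: {score:.2f}")  # flag the call for review below a threshold, e.g. 0.75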

Tone analysis with machine learning models: Analyzing emotional tone with custom machine learning models, built on services such as the Amazon Chime SDK or the Deepgram API, helps gauge agent demeanor and customer experience. Diverse training datasets improve a model's ability to detect a wide range of emotional cues.
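The exact service APIs vary, so here is a deliberately generic sketch of the idea using local audio features and a hypothetical pre-trained classifier; "tone_model.pkl" stands in for whatever tone model you train or license.

import joblib   # loads the hypothetical pre-trained classifier below
import librosa
import numpy as np

# "tone_model.pkl" is a stand-in for your own tone classifier.
audio, sr = librosa.load("call-recording.mp3", sr=16000)
mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=20)
features = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)]).reshape(1, -1)

model = joblib.load("tone_model.pkl")
print(model.predict(features))  # e.g. ["calm"], ["frustrated"], ["hostile"]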

Real-time in-call assistance: Amazon Kinesis streams call audio into speech recognition models, providing real-time in-call assistance through systems like OpenAI's Whisper and Amazon Q. This ensures agents have the support they need during interactions.
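Amazon Connect delivers live call audio through Kinesis Video Streams. Here is a hedged sketch of fetching the raw media; the stream name is a placeholder, and parsing the MKV fragments and feeding them to the speech model is omitted.

import boto3

STREAM = "connect-contact-audio-stream"  # placeholder stream name

kvs = boto3.client("kinesisvideo")
endpoint = kvs.get_data_endpoint(StreamName=STREAM, APIName="GET_MEDIA")["DataEndpoint"]

media = boto3.client("kinesis-video-media", endpoint_url=endpoint)
stream = media.get_media(
    StreamName=STREAM,
    StartSelector={"StartSelectorType": "NOW"},  # tail the live call audio
)
chunk = stream["Payload"].read(8192)  # raw MKV bytes; parse and feed to the speech model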

Responsible AI (RAI) system: The RAI system continuously monitors outputs for bias and toxicity, maintaining ethical standards in both AI responses and agent behavior. Tools like Amazon SageMaker Clarify and Amazon Bedrock Guardrails play a critical role in this process.
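For example, a generated evaluation can be screened with the Bedrock ApplyGuardrail API before it is stored; the guardrail identifier and version below are placeholders for one configured in your account.

import boto3

bedrock = boto3.client("bedrock-runtime")

llm_evaluation_text = "The agent resolved the issue politely and efficiently."

result = bedrock.apply_guardrail(
    # Placeholder identifier/version for a guardrail configured in your account
    guardrailIdentifier="example-guardrail-id",
    guardrailVersion="1",
    source="OUTPUT",  # screening model output rather than user input
    content=[{"text": {"text": llm_evaluation_text}}],
)
if result["action"] == "GUARDRAIL_INTERVENED":
    print("Blocked:", result["outputs"])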

Agent evaluation with LLMs for end-to-end Call interactions

LLMs can successfully automate tasks that previously had to be performed by humans, because those tasks require the ability to:

i. understand large contexts;

ii. understand rules defined in natural language;

iii. combine the context with the rules to arrive at evaluations.

The following areas are used to evaluate agents for each call. Each of these areas can now be evaluated automatically using an LLM, something that was not possible before.

Initiate conversation: The agent needs to begin the conversation on a pleasant note, appropriately mentioning the company brand. He or she should welcome the customer's call enthusiastically and start building confidence.

The LLM is prompted to note whether each of these rules was followed in the call by passing the call transcript to it as context.
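An illustrative shape for such a prompt; the rule wording should come from your own QA guidelines, and the sample transcript is a stand-in for the diarized transcript from the earlier step.

# Illustrative only: rule wording comes from your QA guidelines.
INITIATE_RULES = """\
1. The agent opens the call on a pleasant note.
2. The agent mentions the company brand appropriately.
3. The agent welcomes the call enthusiastically and starts building confidence."""

transcript_text = "AGENT: Thank you for calling Example Corp! How may I help?"

prompt = f"""You are a QA reviewer. For each rule below, answer YES or NO with a
one-line justification, based only on the transcript.

Rules:
{INITIATE_RULES}

Transcript:
{transcript_text}"""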

Create rapport: The agent needs to ask effective questions while showing empathy for the customer's problem. The agent should strictly avoid any offensive language or a hostile tone of voice.

The LLM is prompted here to evaluate these rules. A custom model is used to analyze the voice recording (MP3) for hostile emotions.

Probe the issue: The agent needs to probe the issue at hand and start formulating a candidate solution. The agent must display deep and accurate knowledge of the organization's policies and procedures to fully diagnose the issue.

The LLM is prompted to note whether each of these rules was followed in the call by passing the call transcript to it as context. RAG is used here to probe the organizational knowledge bases and find the best-match solution for the issue at hand. This solution is compared with the solution provided by the agent during the call to evaluate agent efficacy.

Begin remediation procedure: Having formulated the candidate solution, the agent needs to communicate it to the customer. The language used should be free of esoteric terms or phrases internal to the organization, so that a layperson can easily understand what is being said.

The LLM is prompted here to take note of the language the agent used when communicating the remediation procedure to the customer. The LLM is also prompted to evaluate whether the response included repetitive or tedious terms.

Closing conversation: Having informed the customer about the remediation process, the agent needs to bring the conversation to a graceful close. The agent needs to explain the future course of action and its timeline, and the call needs to be closed cordially.

The LLM is prompted to check all of these rules to evaluate the call closure.

Prompt engineering techniques for Agent Evaluation

Incorporating Generative AI (GenAI) into QA processes for call centers offers a powerful way to automate the evaluation of agent performance. By using well-crafted prompts, we can guide the AI to accurately assess call transcripts against company-specific guidelines. The effectiveness of this approach hinges on prompt engineering: the art and science of creating precise and contextually appropriate prompts. This section explores key prompt engineering techniques that can enhance the AI's ability to evaluate Amazon Connect call transcripts.

[Image: Prompt engineering techniques for agent evaluation]
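To make these techniques concrete, here is an illustrative prompt that combines three common ones: a role instruction, a few-shot example, and a structured JSON output contract. The rubric keys and the sample exchange are assumptions for illustration, not the actual guidelines.

transcript_text = (
    "AGENT: Good morning, thank you for calling Example Corp!\n"
    "CUSTOMER: Hi, my order arrived damaged."
)

# Role instruction (system-style prompt)
SYSTEM = "You are a strict contact center QA reviewer. Respond with JSON only."

# One few-shot example anchoring the expected output format
FEW_SHOT = (
    'Example transcript: "AGENT: Yeah, what do you want?"\n'
    'Example output: {"initiate_conversation": {"score": 1, '
    '"reason": "No greeting or brand mention."}}'
)

prompt = f"""{SYSTEM}

{FEW_SHOT}

Now evaluate this transcript. Think through each rule step by step before
scoring, then output JSON with the keys initiate_conversation, create_rapport,
probe_issue, remediation and closing, each mapping to
{{"score": <1-5>, "reason": "<one line>"}}.

Transcript:
{transcript_text}"""
print(prompt)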

Comparison of LLMs for Agent Evaluation

Here's a table summarizing the performance comparison between GPT-4, Llama 3.1, Claude 3.5, and Mistral 7B based on accuracy, precision, speed, and use cases. Please note that models are constantly updated, and this data may have changed by the time you read this.

[Image: LLM comparison for agent evaluation]

References:

Vectara

MyScale | Run Vector Search with SQL

MarkTechPost

AI StartUps Product Information

Vellum AI

Scalability considerations

To handle large data volumes, we leverage cloud-native services, ensuring elastic infrastructure and efficient data management.

i. High-performance clusters and load-balancing techniques maintain system reliability under high-traffic conditions.

ii. Amazon Bedrock enforces default quotas on the order of 1,000,000 tokens/min and 500 requests/min. These can be lifted by provisioning throughput and requesting additional model units (MUs) from AWS; client-side retries also help (see the sketch after this list).

iii. AWS Lambda has a default account limit of 1,000 concurrent executions. This can be raised by requesting a service quota increase from AWS.
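A minimal backoff sketch, assuming the standard ThrottlingException error code that Bedrock returns when a quota is exceeded:

import time
import boto3
from botocore.exceptions import ClientError

bedrock = boto3.client("bedrock-runtime")

def invoke_with_backoff(model_id, body, max_retries=6):
    for attempt in range(max_retries):
        try:
            return bedrock.invoke_model(modelId=model_id, body=body)
        except ClientError as err:
            if err.response["Error"]["Code"] != "ThrottlingException":
                raise
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, ... between retries
    raise RuntimeError("Bedrock still throttling after retries")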

LLM scaling using cross-region inferencing

LLM service quotas often severely limit the throughput of an LLM-based application. However, Amazon Bedrock now offers cross-region inference to mitigate this issue.

Cross-region inference in Amazon Bedrock helps developers manage traffic spikes for generative AI applications by routing requests across multiple AWS regions. This new feature ensures optimal availability and performance by automatically rerouting inference traffic during high-demand periods, enhancing resilience without additional routing costs. It integrates with the existing Amazon Bedrock APIs, simplifies application management, and improves the scalability of generative AI workloads.
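In practice this means invoking through a system-defined inference profile ID rather than a plain model ID. The profile below is one example of the "us."-prefixed IDs; the set available depends on your account and region and can be listed with the Bedrock ListInferenceProfiles API.

import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.invoke_model(
    # Note the "us." prefix: a system-defined inference profile that routes
    # the request across US regions during demand spikes.
    modelId="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Summarize this call in one sentence."}],
    }),
)
print(json.loads(response["body"].read())["content"][0]["text"])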

References: AWS blog here.

Authors:

Satish Banka, Amit Suresh Karnik
