Frankbot - ML framework for auto-responding to customer support queries
Outline of the talk
● Introduction to Freshdesk
● Motivation and Objectives
● Datasets for model training
● Modeling Methodology
○ Offline training
○ Online processing
○ Onboarding a customer account
○ Periodic model refresh
○ Teach the bot
● Metrics and business impact
○ Understanding the metrics
○ Challenges and learnings
Introduction to Freshdesk
Freshdesk is a multi-channel, cloud-based customer support product that enables businesses to
● Streamline all customer conversations in one place - these are conversations between the business and its end customers
● Automate repetitive work and make support agents more efficient
● Enable support agents to collaborate with other teams to resolve issues faster
● Freshdesk tickets are a record of customer conversations across channels (phone, chat, e-mail, social, etc.)
○ A typical conversation includes customer queries and agent responses
○ Frequently recurring customer queries are called T1 tickets
● Freshdesk currently has ~150,000 customers from across the world
Some statistics from companies using Freshdesk
● Average proportion of T1 tickets - 80%
● Average proportion of tickets with answers in the knowledge base - 60%
● Average proportion of tickets with answers in the ticket conversation - 70%
Motivation and Objectives
● To build a machine-learning-based bot that can do the following
○ Intercept and auto-resolve T1 tickets which are frequently recurring in the support helpdesk
○ Leverage content from the business’ Knowledge base to answer T1 queries
○ Reduce time spent by support agents on T1 tickets, thereby enhancing their overall
productivity levels
○ Identify historical tickets that are similar to a new ticket - agents can resolve tickets faster by
looking up information contained in those similar tickets
● Enable support agents to understand the different types of questions raised by customers
● Help support agents create FAQs, which can in turn enhance the bot’s self-service potential
● Enable support agents to train the bot further by mapping customer queries to expected responses
Frankbot in production
Datasets for model training
● Source - Freshdesk data pertaining to customer (business) accounts
○ Includes tickets, Knowledge base articles, and FAQs
○ Includes tickets from different channels such as e-mail, portal (raised on website),
chat, social and phone
● Account coverage - all active, paid accounts with at least 100 tickets in the
last 3 months
● Training strategy
○ One model per account, trained end-to-end
○ Embeddings trained at industry level, models at account level
Note: Tickets from email, portal-direct, chat and phone channels account for close to 95% of
the ticket volume
Modeling Methodology
FAQ Answerbot
● Data
○ Train - historical ticket data + Knowledge base, excluding the test tickets
○ Test - tickets from the last 10 days (no overlap with train)
○ Candidate responses - articles/FAQs from the Knowledge base
● Preprocessing
○ Email cleaning - signature cleaning, cleaning forwarded emails, removal of code constructs, non-ASCII characters, salutations, and text below the signature
○ Primary preprocessing - Unicode normalization, lowercasing, punctuation removal, stop-word removal, and stemming
○ Secondary preprocessing - bigram processing
● L1 layer - ensemble of LSA and W2V vector space embeddings
● L1 similarity metric - cosine similarity
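A minimal sketch of the L1 layer: LSA (TF-IDF followed by truncated SVD) and averaged Word2Vec vectors, ensembled via cosine similarity. The toy corpus, vector sizes, and equal ensemble weights are illustrative assumptions; the talk uses industry-level pre-trained W2V embeddings rather than a model fit on the candidates themselves.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity
from gensim.models import Word2Vec

# Toy candidate responses, already preprocessed as described above
corpus = ["reset your password from the profile page",
          "refund policy for damaged or delayed items",
          "invite agents from the admin settings page"]

# LSA: TF-IDF followed by truncated SVD
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(corpus)
lsa = TruncatedSVD(n_components=2).fit(X)   # tiny dimension for the toy corpus
lsa_vecs = lsa.transform(X)

# W2V: average of word vectors (industry-level pre-trained in production)
tokens = [doc.split() for doc in corpus]
w2v = Word2Vec(tokens, vector_size=50, min_count=1, epochs=50)

def avg_vec(words):
    vecs = [w2v.wv[w] for w in words if w in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.wv.vector_size)

w2v_vecs = np.vstack([avg_vec(t) for t in tokens])

def l1_scores(query, w_lsa=0.5, w_w2v=0.5):
    """Ensemble cosine similarity of a query against all candidates."""
    q_lsa = lsa.transform(tfidf.transform([query]))
    q_w2v = avg_vec(query.split()).reshape(1, -1)
    return (w_lsa * cosine_similarity(q_lsa, lsa_vecs)[0]
            + w_w2v * cosine_similarity(q_w2v, w2v_vecs)[0])

print(l1_scores("how do i reset my password"))
```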
Modeling Methodology
FAQ Answerbot
● L2 features
1. % word match between the query and candidate responses
2. % word match between words with similar part-of-speech tags
3. Word Mover's Distance
4. Ordered bigram and trigram counts
● L2 model - RandomForest / XGBoost
● Thresholds - based on L1 and L2 scores (with override levels)
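A hedged sketch of the L2 layer: two of the four listed features (word match and ordered bigram overlap) computed over (query, response) pairs and fed to a RandomForest. The POS-match and Word Mover's Distance features are omitted for brevity (gensim's KeyedVectors.wmdistance is one way to compute the latter); the labeled pairs here are toy placeholders for the third-party labels described later.

```python
from sklearn.ensemble import RandomForestClassifier

def pct_word_match(query, response):
    """Feature 1: % of query words that also appear in the response."""
    q, r = set(query.split()), set(response.split())
    return len(q & r) / max(len(q), 1)

def ordered_bigram_match(query, response):
    """Feature 4 (simplified): count of shared ordered bigrams."""
    qt, rt = query.split(), response.split()
    return len(set(zip(qt, qt[1:])) & set(zip(rt, rt[1:])))

# Toy labeled (query, candidate response, relevant?) pairs
pairs = [("reset my password", "steps to reset my password", 1),
         ("reset my password", "refund policy for damaged items", 0),
         ("refund for damaged items", "refund policy for damaged items", 1),
         ("refund for damaged items", "steps to reset my password", 0)]

X = [[pct_word_match(q, c), ordered_bigram_match(q, c)] for q, c, _ in pairs]
y = [label for _, _, label in pairs]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict_proba(X)[:, 1])  # relevance probability per pair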
Offline Model Training
Pipeline (reconstructed from the flow diagram):
1. Inputs - train data, candidate responses (n), and test data (m)
2. Preprocessing - email cleaning, primary & secondary preprocessing
3. L1 (Embedding) layer training - yields candidate response vectors (n) and test vectors (m)
4. Pick the top k responses per test query based on L1 scores (m*k pairs)
5. Feature creation, then preprocessing - missing value imputation, outlier treatment, scaling
6. L2 (Classification) layer training - the scored queries are split into a train portion (t) and a held-out portion (m-t); the trained model outputs a relevance probability vector ((m-t)*k)
7. Pick the top 3 responses per query based on probability ((m-t)*3), followed by evaluation
8. Artifact persistence - write lookup/word vectors/idf to Redis, write the classification model object to S3, and write the L1 & L2 thresholds used for gating and ranking
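A sketch of the artifact-persistence step at the end of offline training, assuming a local Redis instance and an S3 bucket; the account id, key names, bucket name, threshold values, and the stand-in artifacts are all hypothetical.

```python
import json
import pickle
import numpy as np
import boto3
import redis

r = redis.Redis(host="localhost", port=6379)
s3 = boto3.client("s3")

account_id = "acct_123"        # hypothetical account id
idf = np.random.rand(5000)     # stand-in for the trained idf vector
l2_model = {"stub": True}      # stand-in for the trained L2 model object

# Small per-account lookup artifacts (idf, word-vector lookups, thresholds)
# go to Redis for low-latency reads at serving time
r.set(f"{account_id}:idf", pickle.dumps(idf))
r.set(f"{account_id}:thresholds", json.dumps({"l1": 0.35, "l2": 0.60}))

# The larger classification model object goes to S3
s3.put_object(Bucket="frankbot-models",          # hypothetical bucket
              Key=f"{account_id}/l2_model.pkl",
              Body=pickle.dumps(l2_model))
```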
Online Processing
Pipeline (reconstructed from the flow diagram):
1. Input query
2. Preprocessing - email cleaning, primary & secondary preprocessing
3. L1 (Embedding) layer transformation - yields the query vector, compared against the candidate response vectors (n)
4. Pick the top k candidates based on similarity (1*k)
5. Feature creation, then preprocessing - missing value imputation, outlier treatment, scaling
6. L2 (Classification) layer prediction - relevance probability vector (1*k)
7. Pick the top 3 based on probability (1*3)
8. Artifact retrieval - read lookup/word vectors/idf from Redis, read the classification model object from S3, and read the L1 & L2 thresholds for gating and ranking
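A compact sketch of the online path: embed the query, shortlist the top k candidates by L1 similarity, score the shortlist with the L2 model, and return the top 3. The embed and l2_features callables and the loaded model and vectors are assumed to come from the artifacts read above; names and signatures are illustrative.

```python
import numpy as np

def answer_query(query, embed, l2_features, l2_model,
                 cand_vecs, cand_ids, k=10):
    """Return the top-3 candidate responses with relevance probabilities."""
    q_vec = embed(query)                           # L1 transformation
    sims = cand_vecs @ q_vec / (
        np.linalg.norm(cand_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-9)
    top_k = np.argsort(sims)[::-1][:k]             # 1*k shortlist by L1 score
    feats = np.array([l2_features(query, cand_ids[i]) for i in top_k])
    probs = l2_model.predict_proba(feats)[:, 1]    # 1*k relevance vector
    best = np.argsort(probs)[::-1][:3]             # top 3 by probability
    return [(cand_ids[top_k[i]], float(probs[i])) for i in best]
```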
Onboarding a customer account
● Onboarding a new customer account involves extracting tickets and articles from the data
lake and training the L1 model (LSA)
● Onboarding also involves choosing the right pre-trained word embedding corresponding
to the account’s industry
○ Example industries: Retail, Financial services, SaaS, Healthcare, Education
● An ensemble of LSA and W2V embeddings is used to generate an L1 score for each
(query, response) pair
● A downstream classification (L2) model is trained to generate model confidence scores
for each (query, {response}) tuple
○ If the account does not have enough data of its own, an industry-level L2
model is used instead
● Thresholding, i.e. deciding whether or not to answer a given query, is based on both
the L1 and L2 scores, as sketched below
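A minimal sketch of the gating rule, assuming a simple conjunctive check; the threshold values are illustrative, and the override levels mentioned earlier are not modeled here.

```python
def should_answer(best_l1, best_l2, l1_thresh=0.35, l2_thresh=0.60):
    """Gate: respond only when both layers clear their thresholds."""
    return best_l1 >= l1_thresh and best_l2 >= l2_thresh

print(should_answer(0.42, 0.71))  # True  -> bot responds
print(should_answer(0.42, 0.40))  # False -> ticket goes to an agent
```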
Periodic model refresh
● Model refresh is key to ensuring that the models stay up to date and relevant over time
● Refresh happens once a week, or as soon as an account accumulates a sizeable number of
new queries or Knowledge base updates
● It involves the following steps
○ Retraining the LSA model after including the newly accumulated data
○ Incremental training of word vectors with new data (see the sketch after this list)
○ Retraining the L2 (classification) model on recent data
■ The L2 model is trained by manually labeling whether the responses from the L1 layer are
relevant or not (1/0)
■ A third-party company is engaged to label these responses
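A hedged sketch of the incremental word-vector update, using gensim's vocabulary-update mode; the corpora are toy stand-ins for the account's accumulated tickets.

```python
from gensim.models import Word2Vec

old_corpus = [["reset", "password"], ["refund", "policy"]]
model = Word2Vec(old_corpus, vector_size=50, min_count=1, epochs=10)

# New tickets accumulated since the last refresh
new_corpus = [["enable", "two", "factor", "authentication"]]
model.build_vocab(new_corpus, update=True)   # extend the vocabulary
model.train(new_corpus, total_examples=len(new_corpus), epochs=10)
```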
Teach the bot
● Teach the bot is a feature that allows customer support agents to explicitly train the bot by
ingesting Q → A mappings
● When the Answerbot fails to respond to a query (Q), the agent can point the bot to the expected
response (A) which should have been returned
● If a suitable response (A) does not exist in the Knowledge base, it can be created on-the-fly
● This expected response (A) is consumed and mapped to be close to the query vector (Q) in the
L1 vector space
○ This ensures that article A would show up for future queries that are similar to Q
○ The same feature is re-purposed to resolve incorrect bot responses as well
○ This feature also helps to improve the overall coverage levels of the Answerbot
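One plausible realization of Teach the bot (an assumption, not the confirmed implementation): append the failed query's L1 vector to the candidate matrix as an alias that points at the expected article, so future similar queries retrieve it.

```python
import numpy as np

cand_vecs = np.random.rand(3, 50)        # existing L1 candidate vectors (toy)
cand_ids = ["art_1", "art_2", "art_3"]   # corresponding article ids

def teach(query_vec, article_id, cand_vecs, cand_ids):
    """Map a query vector to an article so similar queries retrieve it."""
    cand_vecs = np.vstack([cand_vecs, query_vec])  # Q joins the L1 space ...
    cand_ids = cand_ids + [article_id]             # ... pointing at article A
    return cand_vecs, cand_ids

q_vec = np.random.rand(50)               # vector of the unanswered query (Q)
cand_vecs, cand_ids = teach(q_vec, "art_2", cand_vecs, cand_ids)
```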
Metrics and business impact
Month | # Active Clients | # Requests | # Responded | # Helpful | # No Feedback | % Deflection
May’18 | 97 | 10,805 | 6,075 | 1,657 | 1,868 | 15.34%
Jun’18 | 151 | 22,195 | 12,969 | 2,550 | 5,981 | 11.49%
Jul’18 | 182 | 30,376 | 19,330 | 3,792 | 5,669 | 12.48%
Aug’18 | 242 | 50,049 | 29,948 | 5,940 | 7,839 | 11.87%
Sep’18 | 347 | 63,587 | 38,064 | 8,308 | 10,112 | 13.07%
Oct’18 | 457 | 101,493 | 56,390 | 16,589 | 33,360 | 16.34%
Nov’18 | 478 | 130,687 | 78,902 | 25,680 | 46,555 | 19.65%
Dec’18 | 480 | 137,517 | 82,366 | 23,713 | 52,772 | 17.24%
● CSAT* - 79% with bots and 72% without bots
● Average First Response Time (overall) - 13 hrs with bots and 19 hrs without bots
*CSAT - Customer Satisfaction Score
Understanding the Metrics
● # Active clients - number of business accounts exposing the bot to their end customers in their
support portal
● # Requests - number of requests that the bot gets
● # Responded - number of requests responded/answered by the bot
● # Helpful - number of requests where the bot responses were helpful
○ Alongside every bot response, a “Was this helpful?” message is also shown and the user’s
feedback is solicited. This helps in tracking helpful responses.
● # No Feedback - number of bot responses for which there was no feedback from users
● % Deflection - ratio of # Helpful to # Requests
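For example, in Dec’18 the bot received 137,517 requests and 23,713 responses were marked helpful, so % Deflection = 23,713 / 137,517 ≈ 17.24%, matching the table above.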
Challenges and learnings
Challenges:
● Developing a preprocessing mechanism that can extract only the salient components from
messy emails
● Handling the complexity of storing and retrieving vectors of floats (idfs, SVD components, word
vectors) for every account
● Serving predictions at low latency
● Handling Kafka streams for updating content in real time (via Spark Streaming)
● Choosing the right tools to proactively monitor and find bugs in the codebase
Lessons Learnt:
● Start with a simple model and add incremental improvements over a period of time
● Involve data engineers at the very beginning to create pipelines for data; front-end engineers for
making changes to the UI
● Define success metrics and inform stakeholders about what a reasonable target is
Thank You
Appendix
Why are some suggestions not helpful to the user?
● Query could relate to a new topic for which there may not be enough FAQs or articles
● Query could relate to an existing topic but may contain keywords that are not in the vocabulary
- this may result in low L1 and L2 confidence scores that do not satisfy the thresholds
● Query may be related to a particular action - Example: “Can you connect me to an agent?” -
this is a task for a task-completion bot with intent detection capabilities
● Query may not have a question or issue - Example: “I have an open ticket 3335924”
● Query may be ambiguous or unclear - Example: “discussion”
