Frankbot - ML framework for auto-responding to customer support queries
Outline of the talk
● Introduction to Freshdesk
● Motivation and Objectives
● Datasets for model training
● Modeling Methodology
○ Offline training
○ Online processing
○ Onboarding a customer account
○ Periodic model refresh
○ Teach the bot
● Metrics and business impact
○ Understanding the metrics
○ Challenges and learnings
Introduction to Freshdesk
Freshdesk is a multi-channel, cloud-based customer support product that enables businesses to
● Streamline all customer conversations in one place - these are conversations between the business and its end customers
● Automate repetitive work and make support agents more efficient
● Enable support agents to collaborate with other teams to resolve issues faster
● Freshdesk tickets are a record of customer conversations across channels (phone, chat, e-mail, social, etc.)
○ A typical conversation includes customer queries and agent responses
○ Frequently recurring customer queries are called T1 tickets
● Freshdesk currently has ~150,000 customers from across the world
Some statistics from companies using Freshdesk
● Average proportion of T1 tickets - 80%
● Average proportion of tickets with answers in the knowledge base - 60%
● Average proportion of tickets with answers in the ticket conversation - 70%
Motivation and Objectives
● To build a machine-learning-based bot that can do the following
○ Intercept and auto-resolve T1 tickets which are frequently recurring in the support helpdesk
○ Leverage content from the business’ Knowledge base to answer T1 queries
○ Reduce time spent by support agents on T1 tickets, thereby enhancing their overall
productivity levels
○ Identify historical tickets that are similar to a new ticket - agents can resolve tickets faster by
looking up information contained in those similar tickets
● Enable support agents to understand the different types of questions raised by customers
● Help support agents create FAQs, which can in turn enhance the bot’s self-service potential
● Enable support agents to train the bot further by mapping customer queries to expected responses
Frankbot in production
Datasets for model training
● Source - Freshdesk data pertaining to customer (business) accounts
○ Includes tickets, Knowledge base articles, and FAQs
○ Includes tickets from different channels such as e-mail, portal (raised on website),
chat, social and phone
● Account coverage - all active, paid accounts with at least 100 tickets in the
last 3 months
● Training strategy
○ One model per account, trained end-to-end
○ Embeddings trained at industry level, models at account level
Note: Tickets from email, portal-direct, chat and phone channels account for close to 95% of
the ticket volume
Modeling Methodology
FAQ Answerbot
● Data
○ Train - historical ticket data + Knowledge base, excluding the test tickets
○ Test - tickets from the last 10 days (no overlap with train)
○ Candidate responses - articles/FAQs from the Knowledge base
● Preprocessing
○ Email cleaning - signature cleaning, cleaning forwarded emails, removal of code constructs, non-ASCII characters, salutations, and text below the signature
○ Primary preprocessing - Unicode normalization, lowercasing, punctuation removal, stop-word removal, and stemming
○ Secondary preprocessing - bigram processing
● L1 layer - ensemble of LSA and W2V vector space embeddings
● L1 similarity metric - cosine similarity
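A minimal sketch of the L1 layer: LSA (TF-IDF followed by truncated SVD) and averaged Word2Vec vectors, ensembled via cosine similarity. The toy corpus, vector sizes, and equal ensemble weights are illustrative assumptions; the talk uses industry-level pre-trained W2V embeddings rather than a model fit on the candidates themselves.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity
from gensim.models import Word2Vec

# Toy candidate responses, already preprocessed as described above
corpus = ["reset your password from the profile page",
          "refund policy for damaged or delayed items",
          "invite agents from the admin settings page"]

# LSA: TF-IDF followed by truncated SVD
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(corpus)
lsa = TruncatedSVD(n_components=2).fit(X)   # tiny dimension for the toy corpus
lsa_vecs = lsa.transform(X)

# W2V: average of word vectors (industry-level pre-trained in production)
tokens = [doc.split() for doc in corpus]
w2v = Word2Vec(tokens, vector_size=50, min_count=1, epochs=50)

def avg_vec(words):
    vecs = [w2v.wv[w] for w in words if w in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.wv.vector_size)

w2v_vecs = np.vstack([avg_vec(t) for t in tokens])

def l1_scores(query, w_lsa=0.5, w_w2v=0.5):
    """Ensemble cosine similarity of a query against all candidates."""
    q_lsa = lsa.transform(tfidf.transform([query]))
    q_w2v = avg_vec(query.split()).reshape(1, -1)
    return (w_lsa * cosine_similarity(q_lsa, lsa_vecs)[0]
            + w_w2v * cosine_similarity(q_w2v, w2v_vecs)[0])

print(l1_scores("how do i reset my password"))
```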
Modeling Methodology
FAQ Answerbot
● L2 features
1. % word match between the query and candidate responses
2. % word match between words with similar part-of-speech tags
3. Word Mover's Distance
4. Ordered bigram and trigram counts
● L2 model - RandomForest / XGBoost
● Thresholds - based on L1 and L2 scores (with override levels)
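A hedged sketch of the L2 layer: two of the four listed features (word match and ordered bigram overlap) computed over (query, response) pairs and fed to a RandomForest. The POS-match and Word Mover's Distance features are omitted for brevity (gensim's KeyedVectors.wmdistance is one way to compute the latter); the labeled pairs here are toy placeholders for the third-party labels described later.

```python
from sklearn.ensemble import RandomForestClassifier

def pct_word_match(query, response):
    """Feature 1: % of query words that also appear in the response."""
    q, r = set(query.split()), set(response.split())
    return len(q & r) / max(len(q), 1)

def ordered_bigram_match(query, response):
    """Feature 4 (simplified): count of shared ordered bigrams."""
    qt, rt = query.split(), response.split()
    return len(set(zip(qt, qt[1:])) & set(zip(rt, rt[1:])))

# Toy labeled (query, candidate response, relevant?) pairs
pairs = [("reset my password", "steps to reset my password", 1),
         ("reset my password", "refund policy for damaged items", 0),
         ("refund for damaged items", "refund policy for damaged items", 1),
         ("refund for damaged items", "steps to reset my password", 0)]

X = [[pct_word_match(q, c), ordered_bigram_match(q, c)] for q, c, _ in pairs]
y = [label for _, _, label in pairs]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict_proba(X)[:, 1])  # relevance probability per pair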
Offline Model Training
Pipeline (reconstructed from the flow diagram):
1. Inputs - train data, candidate responses (n), and test data (m)
2. Preprocessing - email cleaning, primary & secondary preprocessing
3. L1 (Embedding) layer training - yields candidate response vectors (n) and test vectors (m)
4. Pick the top k responses per test query based on L1 scores (m*k pairs)
5. Feature creation, then preprocessing - missing value imputation, outlier treatment, scaling
6. L2 (Classification) layer training - the scored queries are split into a train portion (t) and a held-out portion (m-t); the trained model outputs a relevance probability vector ((m-t)*k)
7. Pick the top 3 responses per query based on probability ((m-t)*3), followed by evaluation
8. Artifact persistence - write lookup/word vectors/idf to Redis, write the classification model object to S3, and write the L1 & L2 thresholds used for gating and ranking
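A sketch of the artifact-persistence step at the end of offline training, assuming a local Redis instance and an S3 bucket; the account id, key names, bucket name, threshold values, and the stand-in artifacts are all hypothetical.

```python
import json
import pickle
import numpy as np
import boto3
import redis

r = redis.Redis(host="localhost", port=6379)
s3 = boto3.client("s3")

account_id = "acct_123"        # hypothetical account id
idf = np.random.rand(5000)     # stand-in for the trained idf vector
l2_model = {"stub": True}      # stand-in for the trained L2 model object

# Small per-account lookup artifacts (idf, word-vector lookups, thresholds)
# go to Redis for low-latency reads at serving time
r.set(f"{account_id}:idf", pickle.dumps(idf))
r.set(f"{account_id}:thresholds", json.dumps({"l1": 0.35, "l2": 0.60}))

# The larger classification model object goes to S3
s3.put_object(Bucket="frankbot-models",          # hypothetical bucket
              Key=f"{account_id}/l2_model.pkl",
              Body=pickle.dumps(l2_model))
```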
Online Processing
Pipeline (reconstructed from the flow diagram):
1. Input query
2. Preprocessing - email cleaning, primary & secondary preprocessing
3. L1 (Embedding) layer transformation - yields the query vector, compared against the candidate response vectors (n)
4. Pick the top k candidates based on similarity (1*k)
5. Feature creation, then preprocessing - missing value imputation, outlier treatment, scaling
6. L2 (Classification) layer prediction - relevance probability vector (1*k)
7. Pick the top 3 based on probability (1*3)
8. Artifact retrieval - read lookup/word vectors/idf from Redis, read the classification model object from S3, and read the L1 & L2 thresholds for gating and ranking
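A compact sketch of the online path: embed the query, shortlist the top k candidates by L1 similarity, score the shortlist with the L2 model, and return the top 3. The embed and l2_features callables and the loaded model and vectors are assumed to come from the artifacts read above; names and signatures are illustrative.

```python
import numpy as np

def answer_query(query, embed, l2_features, l2_model,
                 cand_vecs, cand_ids, k=10):
    """Return the top-3 candidate responses with relevance probabilities."""
    q_vec = embed(query)                           # L1 transformation
    sims = cand_vecs @ q_vec / (
        np.linalg.norm(cand_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-9)
    top_k = np.argsort(sims)[::-1][:k]             # 1*k shortlist by L1 score
    feats = np.array([l2_features(query, cand_ids[i]) for i in top_k])
    probs = l2_model.predict_proba(feats)[:, 1]    # 1*k relevance vector
    best = np.argsort(probs)[::-1][:3]             # top 3 by probability
    return [(cand_ids[top_k[i]], float(probs[i])) for i in best]
```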
Onboarding a customer account
● Onboarding a new customer account involves extracting tickets and articles from the data
lake and training the L1 model (LSA)
● Onboarding also involves choosing the right pre-trained word embedding corresponding
to the account’s industry
○ Example industries: Retail, Financial services, SaaS, Healthcare, Education
● An ensemble of LSA and W2V embeddings is used to generate an L1 score for each
(query, response) pair
● A downstream classification (L2) model is trained to generate model confidence scores
for each (query, {response}) tuple
○ If the account does not have enough data of its own, an industry-level L2
model is used instead
● Thresholding, i.e. deciding whether or not to answer a given query, is based on both
the L1 and L2 scores, as sketched below
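A minimal sketch of the gating rule, assuming a simple conjunctive check; the threshold values are illustrative, and the override levels mentioned earlier are not modeled here.

```python
def should_answer(best_l1, best_l2, l1_thresh=0.35, l2_thresh=0.60):
    """Gate: respond only when both layers clear their thresholds."""
    return best_l1 >= l1_thresh and best_l2 >= l2_thresh

print(should_answer(0.42, 0.71))  # True  -> bot responds
print(should_answer(0.42, 0.40))  # False -> ticket goes to an agent
```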
Periodic model refresh
● Model refresh is key to ensuring that the models stay up to date and relevant over time
● Refresh happens once a week, or as soon as an account accumulates a sizeable number of
new queries or Knowledge base updates
● It involves the following steps
○ Retraining the LSA model after including the newly accumulated data
○ Incremental training of word vectors with new data (see the sketch after this list)
○ Retraining the L2 (classification) model on recent data
■ The L2 model is trained by manually labeling whether the responses from the L1 layer are
relevant or not (1/0)
■ A third-party company is engaged to label these responses
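A hedged sketch of the incremental word-vector update, using gensim's vocabulary-update mode; the corpora are toy stand-ins for the account's accumulated tickets.

```python
from gensim.models import Word2Vec

old_corpus = [["reset", "password"], ["refund", "policy"]]
model = Word2Vec(old_corpus, vector_size=50, min_count=1, epochs=10)

# New tickets accumulated since the last refresh
new_corpus = [["enable", "two", "factor", "authentication"]]
model.build_vocab(new_corpus, update=True)   # extend the vocabulary
model.train(new_corpus, total_examples=len(new_corpus), epochs=10)
```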
Teach the bot
● Teach the bot is a feature that allows customer support agents to explicitly train the bot by
ingesting Q → A mappings
● When the Answerbot fails to respond to a query (Q), the agent can point the bot to the expected
response (A) which should have been returned
● If a suitable response (A) does not exist in the Knowledge base, it can be created on-the-fly
● This expected response (A) is consumed and mapped to be close to the query vector (Q) in the
L1 vector space
○ This ensures that article A would show up for future queries that are similar to Q
○ The same feature is re-purposed to resolve incorrect bot responses as well
○ This feature also helps to improve the overall coverage levels of the Answerbot
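One plausible realization of Teach the bot (an assumption, not the confirmed implementation): append the failed query's L1 vector to the candidate matrix as an alias that points at the expected article, so future similar queries retrieve it.

```python
import numpy as np

cand_vecs = np.random.rand(3, 50)        # existing L1 candidate vectors (toy)
cand_ids = ["art_1", "art_2", "art_3"]   # corresponding article ids

def teach(query_vec, article_id, cand_vecs, cand_ids):
    """Map a query vector to an article so similar queries retrieve it."""
    cand_vecs = np.vstack([cand_vecs, query_vec])  # Q joins the L1 space ...
    cand_ids = cand_ids + [article_id]             # ... pointing at article A
    return cand_vecs, cand_ids

q_vec = np.random.rand(50)               # vector of the unanswered query (Q)
cand_vecs, cand_ids = teach(q_vec, "art_2", cand_vecs, cand_ids)
```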
Metrics and business impact
Month | # Active Clients | # Requests | # Responded | # Helpful | # No Feedback | % Deflection
May’18 | 97 | 10,805 | 6,075 | 1,657 | 1,868 | 15.34%
Jun’18 | 151 | 22,195 | 12,969 | 2,550 | 5,981 | 11.49%
Jul’18 | 182 | 30,376 | 19,330 | 3,792 | 5,669 | 12.48%
Aug’18 | 242 | 50,049 | 29,948 | 5,940 | 7,839 | 11.87%
Sep’18 | 347 | 63,587 | 38,064 | 8,308 | 10,112 | 13.07%
Oct’18 | 457 | 101,493 | 56,390 | 16,589 | 33,360 | 16.34%
Nov’18 | 478 | 130,687 | 78,902 | 25,680 | 46,555 | 19.65%
Dec’18 | 480 | 137,517 | 82,366 | 23,713 | 52,772 | 17.24%
● CSAT* - 79% with bots and 72% without bots
● Average First Response Time (overall) - 13 hrs with bots and 19 hrs without bots
*CSAT - Customer Satisfaction Score
Understanding the Metrics
● # Active clients - number of business accounts exposing the bot to their end customers in their
support portal
● # Requests - number of requests that the bot gets
● # Responded - number of requests responded/answered by the bot
● # Helpful - number of requests where the bot responses were helpful
○ Alongside every bot response, a “Was this helpful?” message is also shown and the user’s
feedback is solicited. This helps in tracking helpful responses.
● # No Feedback - number of bot responses for which there was no feedback from users
● % Deflection - ratio of # Helpful to # Requests
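For example, in Dec’18 the bot received 137,517 requests and 23,713 responses were marked helpful, so % Deflection = 23,713 / 137,517 ≈ 17.24%, matching the table above.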
Challenges and learnings
Challenges:
● Developing a preprocessing mechanism that can extract only the salient components from
messy emails
● Handling the complexity of storing and retrieving vectors of floats (idfs, SVD components, word
vectors) for every account
● Serving predictions at low latency
● Handling Kafka streams for updating content in real time (via Spark Streaming)
● Choosing the right tools to proactively monitor and find bugs in the codebase
Lessons Learnt:
● Start with a simple model and add incremental improvements over a period of time
● Involve data engineers at the very beginning to create pipelines for data; front-end engineers for
making changes to the UI
● Define success metrics and inform stakeholders about what a reasonable target is
Thank You
Appendix
Why are some suggestions not helpful to the user?
● Query could relate to a new topic for which there may not be enough FAQs or articles
● Query could relate to an existing topic but may contain keywords that are not in the vocabulary
- this may result in low L1 and L2 confidence scores that do not satisfy the thresholds
● Query may be related to a particular action - Example: “Can you connect me to an agent?” -
this is a task for a task-completion bot with intent detection capabilities
● Query may not have a question or issue - Example: “I have an open ticket 3335924”
● Query may be ambiguous or unclear - Example: “discussion”
