Natural Language
Processing At Scale
For optimizing business success and Customer
Experience
ML Ops Aspects
MLOps: Production and Engineering - Bay Area, March 2021
Andrei Lopatenko, VP Engineering, Zillow
What’s the focus of this talk
How to implement and use Natural Language Processing within your organization
at scale for large business impact, with low development and infrastructure
costs
How to solve many different business problems with NLP and improve customer
experience
How to build NLP development processes and ops serving your business
NLP at Scale
Building NLP systems ‘at scale’
At scale means both
1. For multiple business tasks: building systems for wide adoption within the company, for very
heterogeneous tasks related to processing natural language and processing online requests of
your customers/users, documents, etc.
2. For high load: in the number of users' requests per day/second, the number of documents to be
processed per second, the number of documents to be processed in one batch, etc.
Doing it the right way has high ROI
I would like to advocate that building company-wide, deep-impact NLP systems is relatively ‘easy’ now
vs 5-10 years ago: it’s doable within a relatively short period of time, with small investments and low
maintenance costs, but with big business and customer experience impact
Why am I talking about it
I have been applying NLP at Google, Apple, WalmartLabs, eBay, and Zillow since 2006.
Core contributor to core ranking of Google Search (2006), co-founder of Apple Maps Search (2010), core
contributor to App Store Search and Walmart search, led the Walmart (2014) and eBay Search Science teams and
engineering of the Recruit Holdings AI Lab, leading Zillow Search and Conversational AI (2019-now). Startups:
Ozlo (NLP, Conversational AI startup, acquired by Facebook in 2017)
In every organization I worked for, NLP was one of the key technologies driving business and customer
experience
In 2021, due to the abundance of NLP tools from development to serving, building big-impact NLP systems is
more accessible than even several years ago
I’d like to share my 15-year experience of how to build NLP systems at scale for customer and business
gains
Motivating examples: NLP use cases. Case 1
Customer facing online systems: Search
Example: Web, Maps, Real Estate , eCommerce, Apps, and any other big search engines
https://blog.google/products/search/search-language-understanding-bert/
BERT - new NLP models radically changed search for 10% of queries (reported in 2019)
But ‘old’ NLP techniques such as synonym expansion, term weighting, shallow parsing, phrase
chunking, query classification, and many others have been driving the majority of the search
experience online since the early 2000s
This is applicable to any search engine (ecommerce, apps, films, real estate): NLP radically
improves the quality of search results, leading to improvements in customer experience and
revenue through purchases
Motivating examples: NLP use cases. Case 2
Customer facing online systems: Recommendations
Example from Zillow. Embeddings representing information about properties,
extracted from full text, help with online recommendations and other downstream
applications, such as similar-homes recommendations.
https://www.zillow.com/tech/improve-quality-listing-text/
Motivating examples: NLP use cases. Case 3
Business facing Question Answering / online
Example: Bloomberg Trade Order Management Solution
https://www.bloomberg.com/professional/blog/bloomberg-adds-new-nlp-capabilities-to-toms/
Questions such as “Who are our top 5 accounts in the tech sector?”
Natural Language based Question Answering over unstructured text information (documents: find the
paragraph with the answer to “what’s our return policy”) and structured information in databases (“how many
umbrellas we sold last week”) (Natural Language Interface to Databases, NLIDB)
Motivating examples: NLP use cases. Case 4
Analysis of conversations / (near) real-time streaming
Fascinating example, a school project: Prioritization of Emergency Dispatch Calls. 3 high school participants built a
system to analyze emergency phone calls and assess their priority
https://medium.com/ai4allorg/using-natural-language-processing-to-prioritize-emergency-dispatch-calls-ab830a72de98
A recent example: the European company Corti deploys a real-time system to analyze calls and detect cardiac arrests.
It’s more voice analysis (an area close to NLP) than NLP, but it typically uses similar MLOps and systems as
NLP systems
https://www.theverge.com/2018/4/25/17278994/ai-cardiac-arrest-corti-emergency-call-response
Understanding phone calls, transcribing them, assessing quality of service, needs of the customer, and performance of
customer support/business to get customer insights, assess the quality of business agents and conversations, and
extract global insights
Motivating examples: NLP use cases. Case 5
Customer support dialog systems (Chatbots)
Example: Amazon customer support chatbot
https://lifehacker.com/use-a-chatbot-for-faster-amazon-returns-1843927743
When reporting a problem to Amazon, the Amazon chatbot solves many customer problems
Very fast and efficient, with reduced human-workforce costs
Motivating examples: NLP use cases. Case 6
Item understanding
Example: Amazon or Walmart marketplaces
Getting a large stream of unstructured data from various providers
Converting it into structured data (rather than embeddings as in Case 2),
understanding items for both customer and business applications (extraction of
item attributes from merchant descriptions, analysis of reviews): multiple
business and consumer facing downstream applications, from online user
experience to business analytics
https://dl.acm.org/doi/abs/10.1145/3183713.3196926
Motivating Examples: summary
NLP is already used to improve customer experience and business in many very different
types of businesses and across different customer experiences and business applications.
It has high ROI if applied correctly (right applications, right technologies, right people)
There are very different ways to apply NLP: online, streaming, batch processing; they
frequently require different types of systems - the task is to build MLOps for all of them
These 6 use cases are a mostly random list: there are too many other NLP use cases
(online autocomplete, offline classification by category, ...) where it radically improves
customer experience and brings big business gains
NLP: impact on business
Most of the media hype about NLP is about chatbots
In 2020, most of the business impact of NLP is in other areas (though conversational
systems are useful too)
Better search, recommendation, and personalization based on NLP -> billions of
dollars
Document understanding, better classification, information extraction -> hundreds
of millions of dollars
NLP serving scenarios
Online scenarios: customer facing applications with critical latency and throughput,
such as search query understanding; up to 100s of models, 50 ms latency, 10,000+
qps; also business facing (question answering)
Streaming scenarios: documents, items; processing relatively large texts (10K
symbols), 100 million per day
Batch scenarios: documents, users (process a billion documents to extract data;
the latency requirement might be to process a batch within a day or 4 hours, etc.,
depending on business needs)
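The online numbers above (50 ms latency, 10,000+ qps) translate directly into a capacity plan. A rough sizing sketch using Little's law; the per-node concurrency and headroom figures here are hypothetical, not from the talk:

```python
import math

# Rough capacity sizing for an online NLP service (hypothetical numbers).
# Little's law: requests in flight = arrival rate * latency.

def replicas_needed(qps: float, latency_s: float,
                    concurrency_per_node: int, headroom: float = 0.6) -> int:
    """Nodes needed so each node runs at `headroom` of its max concurrency."""
    in_flight = qps * latency_s                # concurrent requests in the system
    usable = concurrency_per_node * headroom   # keep slack for traffic spikes
    return math.ceil(in_flight / usable)

# 10,000 qps at 50 ms, a node handling 32 concurrent requests:
print(replicas_needed(10_000, 0.050, 32))  # 27
```

The same arithmetic, run per model in a 100-model query understanding stack, is what makes the batching and co-location decisions below non-trivial.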
NLP systems
But applying NLP is not just about training models; it’s about building systems
Systems which will
1. Serve models in production (various serving scenarios)
2. Train models (science workbenches)
3. Annotate data
4. Deploy models from lab to production
5. Test, validate, and monitor models (performance, accuracy, compliance, fairness; both models
and end-to-end systems)
6. Integrate model serving with the instream of production data
7. Integrate with outstreams: consumer and business facing applications
NLP Systems
The majority of big business revenue and customer experience gains are not from the
most recent, best NLP models (‘the best science’)
But from ‘the best engineering’: high performance, reliable, robust, scalable
systems, which are easy to integrate with multiple business and consumer
applications, monitorable, and debuggable
The focus is on reliability, robustness, performance, operational excellence,
development engineering quality, and openness for collaboration (across functions:
data engineering, NLP scientists, DS scientists, application developers, etc.)
Once the system works and brings value, state-of-the-art models (accuracy, no bias,
fairness, performance) become the focus
NLP systems
Scientist workbench: access to data sets (from large corpora - web or search logs -
to ‘small’ ones), annotation tools, data processing and data management, metric tools,
model training, tuning, model management (sharing, storing, retrieving)
Deployment tools: model validation, deployment into various environments
(integrated with CI/CD), model management
Inference: model workflows, monitoring, alerting, online validation, performance
measurement/‘observability’, hardware allocation/scaling
Integration with instreams and outstreams
NLP systems: 3 high-level ways
Cloud native: build the system using standard cloud components
From scratch: write your own from scratch
Hybrid: using big open source or cloud blocks for certain tasks (there are plenty of
those now), custom-built systems for other tasks
NLP systems: Cloud native
Multiple ways to build NLP systems:
high level cloud NLP and ML services - Amazon Comprehend, SageMaker,
SageMaker Ground Truth, Transcribe (for speech-to-text), Text Analytics, Lex,
Textract
Pros: very fast to develop a prototype and to make working systems, low
development costs, easy to integrate with other systems on the same cloud
(Redshift on Amazon, etc.), low operational cost for a managed solution
Cons: high cloud/compute costs, low flexibility in the types of models to develop
and limited opportunities to develop high-accuracy models, performance is not optimized,
harder integration with non-cloud systems
NLP systems: Cloud native
Advantage: fast MLOps pipeline development
Plenty of tools: S3 for models and artifacts, CloudFormation, AWS CodePipeline
and CodeBuild (with Git), ECR Container Registry, SageMaker, AWS Batch, API
Gateway, SageMaker Pipelines plus the NLP services (Comprehend) - very fast to build
and prototype NLP systems
Another advantage: reasonably easy to adapt to your environment - Terraform
instead of CloudFormation, your serving infrastructure instead of AWS
Easy to build multi-environment deployment scenarios
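To illustrate how few moving parts a managed deployment has, here is a minimal, hypothetical CloudFormation sketch of a SageMaker endpoint; the role ARN, container image, and model artifact path are placeholders, not real resources from the talk:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  NlpModel:
    Type: AWS::SageMaker::Model
    Properties:
      ExecutionRoleArn: arn:aws:iam::123456789012:role/SageMakerRole        # placeholder
      PrimaryContainer:
        Image: 123456789012.dkr.ecr.us-west-2.amazonaws.com/nlp-serving:v1  # placeholder
        ModelDataUrl: s3://my-bucket/models/ner-model.tar.gz                # placeholder
  NlpEndpointConfig:
    Type: AWS::SageMaker::EndpointConfig
    Properties:
      ProductionVariants:
        - ModelName: !GetAtt NlpModel.ModelName
          VariantName: primary
          InitialInstanceCount: 2
          InstanceType: ml.c5.xlarge
          InitialVariantWeight: 1.0
  NlpEndpoint:
    Type: AWS::SageMaker::Endpoint
    Properties:
      EndpointConfigName: !GetAtt NlpEndpointConfig.EndpointConfigName
```

Swapping this template for Terraform, or the endpoint for your own serving infrastructure, changes only this layer - which is the "easy to adapt" point above.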
NLP Systems. Built from scratch
Built from scratch or based on (rewritten if needed) open source
Abundance of open source: TorchServe/TF Serving, Hugging Face
Transformers, the AllenNLP lab environment, spaCy (and plenty of other NLP libraries),
Doccano (annotation)
Pros: more opportunities to optimize for model accuracy and system performance,
customization for your company’s needs, owning the software
Cons: longer prototype and production development times, high operational
support costs
NLP Systems Built from scratch
Might be necessary. Example: your NLP models as part of a query understanding
stack - 100s of models, GB+ dictionaries, complicated dependencies, specialized
hardware (e.g., flash storage required on many nodes), latency and
throughput critical.
There is no good available software to serve this scenario. Nevertheless, part of
this stack can be based on open source (training, model sharing, annotation,
analysis of experiments, monitoring)
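The core pattern in that scenario - many models with large dictionaries, too many to keep resident at once - can be sketched as a lazy-loading registry with an LRU cap. A minimal sketch; the loader and capacity are hypothetical stand-ins for real model loading:

```python
from collections import OrderedDict
from typing import Any, Callable

class ModelRegistry:
    """Loads models lazily on first use; evicts the least-recently-used
    model when capacity (a stand-in for memory budget) is exceeded."""

    def __init__(self, loader: Callable[[str], Any], capacity: int):
        self._loader = loader        # e.g. reads a model + dictionaries from disk
        self._capacity = capacity    # max models resident at once
        self._cache: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, name: str) -> Any:
        if name in self._cache:
            self._cache.move_to_end(name)        # mark as recently used
            return self._cache[name]
        model = self._loader(name)
        self._cache[name] = model
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)      # evict LRU model
        return model

# Toy usage: the "loader" records which models actually hit disk.
loads = []
registry = ModelRegistry(loader=lambda n: loads.append(n) or n.upper(), capacity=2)
registry.get("spell"); registry.get("ner"); registry.get("spell"); registry.get("intent")
print(loads)  # ['spell', 'ner', 'intent'] - the second "spell" came from cache
```

A production version would evict by bytes rather than model count and pin latency-critical models, but the access pattern is the same.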
NLP Systems. Mix of cloud and custom built
Mix of cloud and custom built software:
Cloud solution for serving: SageMaker, AWS Batch, Elastic Inference - different
scenarios
Or Kubeflow on AWS
Pros: quite rapid development and deployment
Cons: cloud costs are higher than in the build-from-scratch scenario but lower than in
the cloud native scenario; development costs are cheaper than build-from-scratch but
higher than cloud native
NLP Systems. Mix of cloud and custom built
Multiple deployment scenarios (managed Kubernetes vs your own)
Requires support to build custom extensions (Kubeflow operators for your serving
frameworks; not all native Kubeflow operators are good - some require work to
improve them)
Many high level tools are available: Kubeflow, Cortex (from Cortex Labs),
Hydrosphere (managing, monitoring models), Seldon (serving), Neptune
(experiment management), MLflow (experiment, model, and data tracking,
deployment, model registry), Comet (experiment tracking, comparison)
And low level tools: Istio, Kubernetes, Prometheus
NLP Systems. Mix of cloud and custom built
Kubeflow example: streaming and non-latency/qps-critical online serving
Kubeflow Pipelines (end to end orchestration)
TF Serving, Seldon (serving)
Jupyter, Katib, ModelDB, TFMA (TF Model Analysis), TF Transform (training,
workbench)
PyTorch, TensorFlow, MXNet
NLP systems development and adoption timeline
NLP is relatively new for many businesses; there is a lot of excitement and a lot of uncertainty
in expectations
To prove value, one has to iterate very fast: build NLP systems and models rapidly, integrate
them with business systems and environments fast, with minimum development (human +
software) costs - show the value from the business and customer points of view.
Build a cloud native system fast - show the value to the business - and as it scales by the number of
consumers, lines of business, data, and other loads -> move to other architectures if needed to
improve performance and costs. It is important to build a good, evolvable design from the beginning
(this is true for any system; evolvability is as important as scalability, etc.)
NLP systems - tradeoff
Tradeoffs because of different methods: classical vs deep neural
1. Inference: 10% model accuracy vs 90% latency difference (gain in customer
experience/conversion due to quality vs loss in customer experience due to
latency and op costs)
2. Training: e.g., 1 billion documents, results needed in 4 hours - training time
● The design must support running very different solutions inside
● The organizational structure must support making such decisions
● Analytics/ROI assessment must provide proper data as input to make
such decisions
Many other tradeoffs
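The first tradeoff can be made concrete with a back-of-the-envelope model. All the coefficients below are hypothetical illustrations, not measurements from the talk: the deep model pays for its accuracy only if the conversion gain outweighs the latency penalty.

```python
# Back-of-the-envelope accuracy-vs-latency tradeoff (hypothetical numbers).

def net_conversion_lift(accuracy_gain_pp: float,
                        conv_per_accuracy_pp: float,
                        extra_latency_ms: float,
                        conv_loss_per_100ms: float) -> float:
    """Net conversion change, in percentage points, from switching models."""
    gain = accuracy_gain_pp * conv_per_accuracy_pp
    loss = (extra_latency_ms / 100.0) * conv_loss_per_100ms
    return round(gain - loss, 6)

# +10pp model accuracy, each pp worth +0.05pp conversion;
# +45ms latency, each 100ms costing 1pp conversion:
print(net_conversion_lift(10, 0.05, 45, 1.0))  # 0.05 -> barely worth it
```

The point of the slide is organizational: someone has to measure those coefficients (attribution analysis) and someone has to be empowered to act on the result, including choosing the classical model.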
Scalability by design
When building, it is important to design systems to be scalable in multiple dimensions: it’s hard to overestimate
future demands
1. The number of human languages and domain-area languages
2. The load (qps for online systems, the number of messages per second for streaming, the number of
documents in a batch and the number of batches for batch systems)
3. The number of different models and the number of different types of models (extraction,
classification, correction, text prediction, text generation, etc.)
4. The number of developers and scientists simultaneously deploying new models, new types
of data, new integrations, etc.
5. The number of metrics by which the system and the models are monitored
6. The amounts of data in training and serving
7. The number of use cases and the number of deployments (data centers, regions, nodes)
8. etc.
8. etc
NLP at Scale
Important factor: typically there are very different serving scenarios, from, for
example, online search - dozens/hundreds of models with multi-gigabyte ‘dictionaries’,
some in parallel, some sequential, 50 ms latency, 10^4+ qps - to
streaming - a billion documents per day - to batch processing. No single system will
serve all inference cases; it is necessary to build multiple systems
But training, verification, and testing scenarios are more unifiable, and it’s possible to
build one scientist workbench / lab environment. It’s beneficial to build one to share
data and models across the organization
Ops
Continuous retraining of models when needed
Support for frequent deployment of models as models are improved and new
models are deployed. Integration of the NLP scientist workbench with the production
environment. Validation of models
Scalability: how the system scales as traffic, stream, or batch size increases, document
sizes increase, the number of models run in parallel grows, or other load parameters
change
Monitoring for performance, incidents, exceptions, and the quality of models and
end-to-end applications based on NLP
Ops
Monitoring - what may go wrong:
1. Model performance: model and end-to-end (overall, by segments: users,
categories, regions)
2. Global data changes (changes in global distributions caused by events or seasonal
shifts...)
3. Incoming data quality issues
4. System performance, uptime
5. Bias, compliance, fairness
6. Significant changes, outliers
Monitoring, alerting, logging
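Point 2, global data changes, is the failure mode that degrades models silently; a common guard is to compare the live input distribution against a training-time baseline. A minimal sketch using the population stability index; the bins and the 0.25 threshold are the usual rules of thumb, not figures from this talk:

```python
import math

def psi(expected: list, actual: list) -> float:
    """Population stability index between two binned distributions
    (each a list of bin proportions summing to ~1)."""
    eps = 1e-6  # avoid log(0) on empty bins
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))

# Query-length distribution at training time vs. today (hypothetical bins):
baseline = [0.30, 0.40, 0.20, 0.10]
today    = [0.10, 0.30, 0.30, 0.30]
print(psi(baseline, today) > 0.25)  # True -> rule of thumb: alert on shift
```

Wired into the alerting pipeline per model input feature, this is one concrete instance of the "monitoring, alerting, logging" line above.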
NLP libraries
(the separation is conditional; many of them are in both categories)
‘Old’ good technologies: hidden Markov models, conditional random fields, SVMs for classification, PCFGs and Dirichlet processes,
and software: Stanford CoreNLP, CRFsuite, CRF++, OpenNLP, MeTA, Sempre, Mallet (still useful in some scenarios) - tradeoffs are
in the next slides
‘New’ technologies: spaCy, Gensim, Hugging Face Transformers (invaluable by now), fastText, AllenNLP (lab environment),
PyTorch-NLP, Flair, DeText, many others
A lot of academic open source code which is adaptable to industrial environments (see Papers with Code, NLP section)
High level libraries helping to build end to end solutions for some domains: Rasa (dialog systems)
Do not hesitate to get inside the open source: the Stanford library’s performance was improved 10x by a proper multithreading
implementation, and it makes a big difference when you need to process a stream of large documents at 40+ million tokens per hour
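The multithreading remark generalizes: when per-document work is I/O-bound or handled by a library that releases the GIL (or by calls to an external NLP server), fanning the stream over a thread pool is often the cheap 10x. A minimal stdlib sketch; `analyze` is a hypothetical stand-in for the real per-document call:

```python
from concurrent.futures import ThreadPoolExecutor

def analyze(doc: str) -> int:
    """Stand-in for a per-document NLP call (e.g. an HTTP request to a
    CoreNLP server); here it just counts whitespace tokens."""
    return len(doc.split())

def process_stream(docs: list, workers: int = 8) -> list:
    """Fan the document stream over a thread pool; map preserves order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze, docs))

docs = ["the quick brown fox", "hello world", "one"]
print(process_stream(docs))  # [4, 2, 1]
```

For CPU-bound pure-Python work a process pool (or a library with native threading) is the right variant; the orchestration pattern is identical.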
Team
To build, support, and use the system successfully:
Strong engineering, science, and product management are required
A modern NLP stack based on deep neural architectures -> BERT and others
Deep understanding of cloud ML infrastructure if you are on the cloud (for example,
AWS ML infrastructure)
Generic software engineering - building systems rather than just models
Engineering culture, Ops
Data Training sets
Many NLP models are re-usable for many tasks
Your company operates in a certain domain - such as eCommerce, real estate,
medical, or transportation - with its particular language. Models and knowledge
that learned the particularities of the domain language for one case might be
re-usable for other cases in the same domain (via various techniques). Model
discovery and re-sharing simplify adoption of NLP across multiple lines of
business
Training and testing data re-sharing accelerates model development and
NLP adoption
NLP Training sets and Metrics
Training sets are important as they train your models for something important/beneficial, and
metrics are important if they contribute to measurement of the final impact
What are the classification tasks which will benefit your business (improving conversion or purchase
rate for search, better routing of phone calls or customer support tickets)? What are the extraction
tasks which will benefit your business (what knowledge graph do you need for better search or
recommendation, which entities are important for browsing by your business agents, etc.)? Not
‘what’s the perplexity of the language model’ but ‘how many symbols customers type in autocomplete’
or ‘what percentage of spelling errors is solved’ - what will improve the end to end quality and
performance
The focus is on end to end performance rather than classical NLP-level metrics only; mature
your systems by developing them to impact end to end quality and performance
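The autocomplete example ("how many symbols customers type") can be measured directly from logs. A minimal sketch of such an end-to-end metric; the session format here is a hypothetical simplification:

```python
def keystrokes_saved(sessions: list) -> float:
    """Fraction of keystrokes saved across autocomplete sessions.
    Each session is (final_query, chars_typed_before_accepting)."""
    typed = sum(chars for _, chars in sessions)
    total = sum(len(query) for query, _ in sessions)
    return 1.0 - typed / total

# Users accepted "zillow homes" after 3 chars and "nlp at scale" after 7:
sessions = [("zillow homes", 3), ("nlp at scale", 7)]
print(round(keystrokes_saved(sessions), 3))  # 0.583
```

Unlike perplexity, this number moves only when the suggestion the user actually wanted ranks high early, which is why it is the better target for the model.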
Continuous improvement circle
Almost none of the real world NLP tasks can have a final ‘perfect’ solution <- this needs a lot of
leadership support to promote this vision and align with the business; incremental gains in system
performance and model accuracy mean gains for the business (but you need to build the system,
measurement, and attribution framework to execute well on it)
Each NLP model can be improved in accuracy, perplexity, etc., but what is really important is the impact
on the end to end system - conversion, revenue per session, document processing time, etc.
Each NLP system can be improved from the performance, scalability, cloud costs, etc. points of view.
Improvement of NLP models and NLP systems has high ROI if done correctly, but doing it
correctly requires a lot of work: end to end analysis of systems rather than just model evaluation,
attribution analysis, etc. In big business, building such an environment and building an organization to
improve NLP pays back
ROI assessment. Expenses
Expenses: salaries + software costs + compute/storage costs + data annotation costs
A small team of several good experts can create an NLP system, integrate it with the business
within your company, and prove its value. You do not need more than 5 people to solve
serious tasks
Software costs: most business cases can be solved using open source software -
Hugging Face Transformers, TorchServe or TF Serving, etc.; the whole infrastructure for
training and serving (and other tasks, such as annotation) can be built using open source
Compute/storage costs: it depends. AWS Comprehend etc. - more expensive, less flexible,
but fast prototyping. GPU machines are needed in many cases
Data annotation: with transfer learning you do not need huge data sets.
ROI assessment. Returns
For some tasks, such as search/recommendation functions directly facing the
consumer, the return is easily computed by running online controlled experiments.
For some tasks, such as business facing functions (e.g., document classification for
faster processing, question answering for agents), the return is harder to
compute, since one needs to run the new business operation for a period of time to
measure impact
For some tasks - replacing humans for information extraction, or question answering for
consumer/customer support - the return is computed from the number of people
replaced.
Key: build solutions rapidly, to experiment and find the maximum returns.
Conclusion
NLP systems bring significant gains to business and customer experience
Building them is a relatively easy task. There are multiple open source libraries and
multiple cloud solutions; there are multiple alternatives for how to build an NLP system
for your company.
The task of building and using NLP typically has high ROI if approached correctly

More Related Content

PDF
Building multi billion ( dollars, users, documents ) search engines on open ...
PDF
AI in Multi Billion Search Engines. Career building in AI / Search. What make...
PDF
AI in Search Engines
PDF
Deep learning for e-commerce: current status and future prospects
PDF
Search Product Manager: Software PM vs. Enterprise PM or What does that * PM do?
PDF
How Artificial Intelligence & Machine Learning Are Transforming Modern Marketing
PDF
Haystack- Learning to rank in an hourly job market
PPTX
Improving Search in Workday Products using Natural Language Processing
Building multi billion ( dollars, users, documents ) search engines on open ...
AI in Multi Billion Search Engines. Career building in AI / Search. What make...
AI in Search Engines
Deep learning for e-commerce: current status and future prospects
Search Product Manager: Software PM vs. Enterprise PM or What does that * PM do?
How Artificial Intelligence & Machine Learning Are Transforming Modern Marketing
Haystack- Learning to rank in an hourly job market
Improving Search in Workday Products using Natural Language Processing

What's hot (20)

PDF
How Artificial Intelligence & Machine Learning Are Transforming Modern Market...
PDF
The Power of Declarative Analytics
PPTX
Interleaving, Evaluation to Self-learning Search @904Labs
PDF
Human in the Loop AI for Building Knowledge Bases
PDF
Text Analytics
PDF
Enterprise Search in the Big Data Era: Recent Developments and Open Challenges
PPTX
Taming the Wild West of NLP
PPTX
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
PPTX
Toolboxes for data scientists
PPT
SystemT: Declarative Information Extraction
PDF
Transparent Machine Learning for Information Extraction: State-of-the-Art and...
PDF
Activate 2018 Closing Remarks: The Future of Search & AI - Trey Grainger
DOCX
Nikhil CV
PDF
The Machine Learning Workflow with Azure
PDF
Real-time big data analytics based on product recommendations case study
PPTX
[RakutenTechConf2013][C-4_3] Our Goals and Activities at Rakuten Institute o...
PPT
Automatic suggestion of query-rewrite rules for enterprise search
PDF
FrugalML: Using ML APIs More Accurately and Cheaply
PDF
Data Analytics and Artificial Intelligence in the era of Digital Transformation
PDF
Enterprise Search – How Relevant Is Relevance?
How Artificial Intelligence & Machine Learning Are Transforming Modern Market...
The Power of Declarative Analytics
Interleaving, Evaluation to Self-learning Search @904Labs
Human in the Loop AI for Building Knowledge Bases
Text Analytics
Enterprise Search in the Big Data Era: Recent Developments and Open Challenges
Taming the Wild West of NLP
Real-time Recommendations for Retail: Architecture, Algorithms, and Design
Toolboxes for data scientists
SystemT: Declarative Information Extraction
Transparent Machine Learning for Information Extraction: State-of-the-Art and...
Activate 2018 Closing Remarks: The Future of Search & AI - Trey Grainger
Nikhil CV
The Machine Learning Workflow with Azure
Real-time big data analytics based on product recommendations case study
[RakutenTechConf2013][C-4_3] Our Goals and Activities at Rakuten Institute o...
Automatic suggestion of query-rewrite rules for enterprise search
FrugalML: Using ML APIs More Accurately and Cheaply
Data Analytics and Artificial Intelligence in the era of Digital Transformation
Enterprise Search – How Relevant Is Relevance?
Ad

Similar to Natural Language Processing at Scale (20)

PPT
Stefan Geissler kairntech - SDC Nice Apr 2019
PDF
my model genuines.
PDF
Natural Language Processing Use Cases for Business Optimization
PDF
Using the power of OpenAI with your own data: what's possible and how to start?
PDF
The Future of Natural Language Processing (NLP) in Customer Service
PDF
DataScientist Job : Between Myths and Reality.pdf
PDF
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
PDF
Unlocking Value from Unstructured Data
PDF
Machine Learning, Faster
PPT
Future directives in erp, erp and internet, critical success and failure factors
PPTX
Consulting
PPTX
Tom van Ees - Academic and Commercial software Development
PPTX
Analytics what to look for sustaining your growing business-
PPTX
AI for Customer Service: How to Improve Contact Center Efficiency with Machin...
PDF
Nova era računarstva
PPTX
PDF
Technovision
PPTX
Wdc tech talk cooper hackathon 2015
 
PDF
Cognitive Computing - A Primer
PPTX
Manna engr 245 lean launch pad stanford 2020
Stefan Geissler kairntech - SDC Nice Apr 2019
my model genuines.
Natural Language Processing Use Cases for Business Optimization
Using the power of OpenAI with your own data: what's possible and how to start?
The Future of Natural Language Processing (NLP) in Customer Service
DataScientist Job : Between Myths and Reality.pdf
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
Unlocking Value from Unstructured Data
Machine Learning, Faster
Future directives in erp, erp and internet, critical success and failure factors
Consulting
Tom van Ees - Academic and Commercial software Development
Analytics what to look for sustaining your growing business-
AI for Customer Service: How to Improve Contact Center Efficiency with Machin...
Nova era računarstva
Technovision
Wdc tech talk cooper hackathon 2015
 
Cognitive Computing - A Primer
Manna engr 245 lean launch pad stanford 2020
Ad

Recently uploaded (20)

PDF
Clinical guidelines as a resource for EBP(1).pdf
PPTX
Supervised vs unsupervised machine learning algorithms
PPTX
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
PPTX
Introduction to Basics of Ethical Hacking and Penetration Testing -Unit No. 1...
PPTX
iec ppt-1 pptx icmr ppt on rehabilitation.pptx
PDF
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
PPTX
Computer network topology notes for revision
PPTX
05. PRACTICAL GUIDE TO MICROSOFT EXCEL.pptx
PPTX
Major-Components-ofNKJNNKNKNKNKronment.pptx
PDF
BF and FI - Blockchain, fintech and Financial Innovation Lesson 2.pdf
PPTX
STUDY DESIGN details- Lt Col Maksud (21).pptx
PDF
Lecture1 pattern recognition............
PPTX
DISORDERS OF THE LIVER, GALLBLADDER AND PANCREASE (1).pptx
PPTX
Database Infoormation System (DBIS).pptx
PDF
Foundation of Data Science unit number two notes
PPTX
CEE 2 REPORT G7.pptxbdbshjdgsgjgsjfiuhsd
PPTX
Acceptance and paychological effects of mandatory extra coach I classes.pptx
PDF
TRAFFIC-MANAGEMENT-AND-ACCIDENT-INVESTIGATION-WITH-DRIVING-PDF-FILE.pdf
PPTX
IB Computer Science - Internal Assessment.pptx
PDF
Introduction to Business Data Analytics.
Clinical guidelines as a resource for EBP(1).pdf
Supervised vs unsupervised machine learning algorithms
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
Introduction to Basics of Ethical Hacking and Penetration Testing -Unit No. 1...
iec ppt-1 pptx icmr ppt on rehabilitation.pptx
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
Computer network topology notes for revision
05. PRACTICAL GUIDE TO MICROSOFT EXCEL.pptx
Major-Components-ofNKJNNKNKNKNKronment.pptx
BF and FI - Blockchain, fintech and Financial Innovation Lesson 2.pdf
STUDY DESIGN details- Lt Col Maksud (21).pptx
Lecture1 pattern recognition............
DISORDERS OF THE LIVER, GALLBLADDER AND PANCREASE (1).pptx
Database Infoormation System (DBIS).pptx
Foundation of Data Science unit number two notes
CEE 2 REPORT G7.pptxbdbshjdgsgjgsjfiuhsd
Acceptance and paychological effects of mandatory extra coach I classes.pptx
TRAFFIC-MANAGEMENT-AND-ACCIDENT-INVESTIGATION-WITH-DRIVING-PDF-FILE.pdf
IB Computer Science - Internal Assessment.pptx
Introduction to Business Data Analytics.

Natural Language Processing at Scale

  • 1. Natural Language Processing At Scale For optimizing business success and Customer Experience ML Ops Aspects MLOps: Production and Engineering - Bay Area, March 2021 Andrei Lopatenko, VP Engineering, Zillow
  • 2. What’s the focus of this talk How to implement and use Natural Language Processing in with your organization at scale for a large business impact, with low development and infrastructure costs How to solve many different business problems with NLP, improve customer experience, How build NLP development processes, ops serving your business
  • 3. NLP at Scale Building NLP systems ‘at scale’ At scale means both 1. for multiple business tasks, building systems for a wide adoption within the company for very heterogeneous tasks related to processing of natural languages and processing online requests of your customers/users, documents, etc 2. For high load in the number of users’ requests per day/second, the number of documents to be processed per second, the number of documents to be processed in one batch etc Doing it right way has the high ROIs I would like to advocate that building company wide and deep impact NLP systems is relatively ‘easy’ now vs 5-10 years ago: it’s doable within relatively short period of times, with small investments, low maintenance costs but big business and customer experience impact
  • 4. Why I am talking about it I have been applying NLP in Google, Apple, WalmartLab, eBay, Zillow since 2006. Core contributor to core ranking Google Search 2006, co-founder Apple Maps Search (2010), and core contributor AppStore Search, Walmart search, led Walmart (2014) and eBay Search Science teams, engineering of Recruit Holding AI Lab, leading Zillow Search and Conversational AI (2019-now). Startups: Ozlo (NLP, Conversational AI startup, acquired by Facebook in 2017), In every organization I worked for, NLP was one of key technologies driving business and customer experience In 2021, due to abundance of NLP tools from development to serving, building big impact NLP systems is more available than even several years ago I’d like to share my 15-year experience of how to build NLP systems at scale for customer and business gains
  • 5. Motivating examples: NLP use cases. Case 1 Customer facing online systems: Search Example: Web, Maps, Real Estate , eCommerce, Apps, and any other big search engines https://blog.google/products/search/search-language-understanding-bert/ BERT - new NLP models radically changes search for 10% of traffic (reported in 2018) But ‘old’ NLP techniques such as synonym expansion, term weighting, shallow parsing, phrase chunking, query classification and many other have been driving majority of search experience online since early 200X This is applicable to any search engine (ecommerce, apps, films, real estate), NLP radically improve quality of search results leading to improvement of customer experience and revenues through purchases
  • 6. Motivating examples: NLP use cases. Case 2 Customer-facing online systems: Recommendations. Example from Zillow: embeddings representing information about properties, extracted from full text, help with online recommendations and other downstream applications, such as similar-homes recommendations. https://www.zillow.com/tech/improve-quality-listing-text/
  • 7. Motivating examples: NLP use cases. Case 3 Business-facing Question Answering / online Example: Bloomberg Trade Order Management Solutions https://www.bloomberg.com/professional/blog/bloomberg-adds-new-nlp-capabilities-to-toms/ Questions such as “Who are our top 5 accounts in the tech sector?” Natural-language question answering over unstructured text (documents: find the paragraph answering “what’s our return policy”) and over structured information in databases (“how many umbrellas did we sell last week”): Natural Language Interfaces to Databases (NLIDB)
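To make the NLIDB idea concrete, here is a deliberately tiny rule-based sketch that maps a couple of question shapes to parameterized SQL. Production NLIDB systems use semantic parsing rather than regexes, and the table and column names below are invented purely for illustration.

```python
import re

# Hypothetical question-shape -> SQL templates (names are made up).
TEMPLATES = [
    (re.compile(r"how many (\w+) did we sell last week", re.I),
     "SELECT COUNT(*) FROM sales WHERE item = '{0}' AND week = CURRENT_WEEK"),
    (re.compile(r"who are our top (\d+) accounts in the (\w+) sector", re.I),
     "SELECT account FROM accounts WHERE sector = '{1}' "
     "ORDER BY revenue DESC LIMIT {0}"),
]

def question_to_sql(question):
    """Return SQL for the first matching template, else None."""
    for pattern, sql in TEMPLATES:
        m = pattern.search(question)
        if m:
            return sql.format(*m.groups())
    return None

sql = question_to_sql("Who are our top 5 accounts in the tech sector?")
```

The interesting engineering problems start exactly where this sketch stops: paraphrase robustness, schema grounding, and disambiguation, which is why semantic-parsing models replaced templates in serious deployments.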
  • 8. Motivating examples: NLP use cases. Case 4 Analysis of conversations / (near) real-time streaming A fascinating example, a school project: Prioritization of Emergency Dispatch Calls, where 3 high-school participants built a system to analyze emergency phone calls and assess their priority https://medium.com/ai4allorg/using-natural-language-processing-to-prioritize-emergency-dispatch-calls-ab830a72de98 A recent example: the European company Corti deploys a real-time system to analyze calls and detect cardiac arrests; it is more voice analysis (an area close to NLP) than NLP proper, but it typically uses MLOps and systems similar to those of NLP systems https://www.theverge.com/2018/4/25/17278994/ai-cardiac-arrest-corti-emergency-call-response Understanding phone calls, transcribing them, assessing quality of service, customer needs, and the performance of customer support/business: get customer insights, assess the quality of business agents and of conversations, and extract global insights
  • 9. Motivating examples: NLP use cases. Case 5 Customer-support dialog systems (chatbots) Example: Amazon customer-support chatbot https://lifehacker.com/use-a-chatbot-for-faster-amazon-returns-1843927743 Reporting a problem to Amazon: the Amazon chatbot solves many customer problems. Very fast and efficient, with reduced spending on the human workforce
  • 10. Motivating examples: NLP use cases. Case 6 Item understanding Example: Amazon or Walmart marketplaces. Getting a large stream of unstructured data from various providers and converting it into structured data (rather than embeddings as in example 2): understanding items for both customer and business applications (extraction of item attributes from merchant descriptions, analysis of reviews), with multiple business- and consumer-facing downstream applications from online user experience to business analytics https://dl.acm.org/doi/abs/10.1145/3183713.3196926
  • 11. Motivating examples: summary NLP is already used to improve customer experience and business results in many very different types of businesses, across different customer experiences and business applications. It has high ROI if applied correctly (right applications, right technologies, right people). There are very different ways to apply NLP: online, streaming, batch processing; they frequently require different types of systems, and the task is to build MLOps for all of them. The 6 use cases are a mostly random list: there are many other NLP use cases (online autocomplete, offline classification by category, ...) where NLP radically improves customer experience and brings big business gains
  • 12. NLP: impact on business Most of the media hype about NLP is about chatbots. In 2020, most of the business impact of NLP is in other areas (though conversational systems are useful too). Better search, recommendation, and personalization based on NLP -> billions of dollars. Document understanding, better classification, information extraction -> hundreds of millions of dollars
  • 13. NLP serving scenarios Online scenarios: customer-facing applications with critical latency and throughput, such as search query understanding (up to 100s of models, 50 ms latency, 10,000+ qps) and business-facing question answering. Streaming scenarios: documents and items, processing relatively large texts (10K symbols), 100 million per day. Batch scenarios: documents, users (process a billion documents to extract data; the latency requirement might be to process a batch within a day, or 4 hours, etc., depending on business needs)
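The online numbers above (50 ms, 10,000+ qps) translate directly into capacity planning. A back-of-envelope sketch using Little's law (in-flight requests = qps × latency): divide by per-replica concurrency and keep utilization headroom. The concurrency and headroom figures below are illustrative assumptions, not recommendations.

```python
import math

def replicas_needed(target_qps: float, latency_ms: float,
                    concurrency_per_replica: int = 1,
                    headroom: float = 0.6) -> int:
    """By Little's law, in-flight requests = qps * latency (seconds).
    Divide by per-replica concurrency and run replicas at ~60%
    utilization to absorb spikes. All defaults are placeholders."""
    in_flight = target_qps * (latency_ms / 1000.0)
    return math.ceil(in_flight / (concurrency_per_replica * headroom))

# 10,000 qps within a 50 ms budget, 8 concurrent requests per replica:
n = replicas_needed(10_000, 50, concurrency_per_replica=8)
```

The same arithmetic, run with a day-long deadline instead of a 50 ms one, is what makes batch scenarios so much cheaper per document than online ones.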
  • 14. NLP systems But applying NLP is not just about training models; it is about building systems. Systems which will:
    1. Serve models in production (various serving scenarios)
    2. Train models (science workbenches)
    3. Annotate data
    4. Deploy models from lab to production
    5. Test, validate, and monitor models (performance, accuracy, compliance, fairness; both models and end-to-end systems)
    6. Integrate model serving with the instream of production data
    7. Integrate with outstreams: consumer- and business-facing applications
  • 15. NLP Systems The majority of big business-revenue and customer-experience gains come not from the most recent, best NLP models (‘the best science’) but from ‘the best engineering’: high-performance, reliable, robust, scalable systems that integrate with multiple business and consumer applications and are monitorable and debuggable. The focus is on reliability, robustness, performance, operational excellence, development and engineering quality, and openness to collaboration across functions (data engineering, NLP scientists, data scientists, application developers, etc.). Once the system works and brings value, state-of-the-art models (accuracy, absence of bias, fairness, performance) become the focus
  • 16. NLP systems Scientist workbench: access to datasets (from large corpora such as the web or search logs down to ‘small’ ones), annotation tools, data processing and data management, metric tools, model training and tuning, model management (sharing, storing, retrieving). Deployment tools: model validation, deployment into various environments (integrated with CI/CD), model management. Inference: model workflows, monitoring, alerting, online validation, performance measurement / ‘observability’, hardware allocation / scaling. Integration with instream and outstream
  • 17. NLP systems: high level 3 ways Cloud native: build the system from standard cloud components. From scratch: write your own from scratch. Hybrid: use big open-source or cloud building blocks for certain tasks (there are plenty of those now) and custom-built systems for others
  • 18. NLP systems: Cloud native Multiple ways to build NLP systems from high-level cloud NLP and ML services: Amazon Comprehend, SageMaker, SageMaker Ground Truth, Transcribe (for speech-to-text), Text Analytics, Lex, Textract. Pros: very fast to develop a prototype and to build working systems, low development costs, easy to integrate with other systems on the same cloud (e.g., Redshift on Amazon), low-cost operations for a managed solution. Cons: high cloud/compute costs, low flexibility in the types of models you can develop and in opportunities to develop high-accuracy models, performance is not optimized, harder integration with non-cloud systems
  • 19. NLP systems: Cloud native Advantage: fast MLOps pipeline development. Plenty of tools: S3 for models and artifacts, CloudFormation, AWS CodePipeline and CodeBuild (with Git), ECR container registry, SageMaker, AWS Batch, API Gateway, SageMaker Pipelines, plus NLP services (Comprehend): very fast to build and prototype NLP systems. Another advantage: reasonably easy to adapt to your environment (Terraform instead of CloudFormation, your serving infrastructure instead of AWS). Easy to build multi-environment deployment scenarios
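As a rough picture of how the tools above compose, here is an illustrative (not deployable) sketch of a cloud-native model pipeline; every resource name, path, and parameter is invented, and a real setup would express this as CloudFormation/Terraform plus CodePipeline definitions.

```yaml
# Hypothetical train -> validate -> deploy pipeline (names invented):
Stages:
  - Name: Train
    Service: SageMaker                 # training job reading from S3
    Input: s3://example-bucket/datasets/query-intent/
    Output: s3://example-bucket/models/query-intent/
  - Name: Validate
    Service: CodeBuild                 # offline metrics on a holdout set
    FailureAction: stop-pipeline       # block bad models from shipping
  - Name: Deploy
    Service: SageMaker                 # endpoint update behind API Gateway
    Approval: manual                   # human gate before production
```

The design point worth copying even off-cloud: validation sits between training and deployment as a hard gate, so no model reaches serving without passing offline metrics.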
  • 20. NLP Systems: Built from scratch Built from scratch, or based on (rewritten if needed) open source. Abundance of open source: TorchServe / TF Serving, Hugging Face Transformers, AllenNLP (lab environment), spaCy (and plenty of other NLP libraries), Doccano (annotation). Pros: more opportunities to optimize model accuracy and system performance, customization for your company’s needs, owning the software. Cons: longer prototype and production development times, high operational support costs
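To show how small the serving core of a built-from-scratch system can start, here is a standard-library-only sketch of a JSON prediction endpoint. The `classify` function is a keyword-rule stand-in for real model inference; a production deployment would use TorchServe or TF Serving with batching, metrics, and health checks rather than this handler.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def classify(text: str) -> str:
    """Placeholder 'model': a trivial rule instead of a neural network."""
    return "question" if text.strip().endswith("?") else "statement"

class PredictHandler(BaseHTTPRequestHandler):
    """POST {"text": "..."} -> {"label": "..."} as JSON."""
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        text = json.loads(body)["text"]
        payload = json.dumps({"label": classify(text)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# To serve (blocks the process), uncomment:
# HTTPServer(("0.0.0.0", 8080), PredictHandler).serve_forever()
```

Everything the slide lists as the real cost of this path — model hot-swapping, monitoring, scaling — is exactly what this sketch lacks and what the open-source servers provide.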
  • 21. NLP Systems: Built from scratch Might be necessary. Example: your NLP models as part of a query-understanding stack: 100s of models, GB+ dictionaries, complicated dependencies, specialized hardware (e.g., flash-drive storage) required on many nodes, latency- and throughput-critical. There is no good off-the-shelf software to serve this scenario. Nevertheless, parts of this stack can be based on open source (training, model sharing, annotation, analysis of experiments, monitoring)
  • 22. NLP Systems: Mix of cloud and custom built Mix of cloud and custom-built software. A cloud solution for serving: SageMaker, AWS Batch, Elastic Inference (different scenarios), or Kubeflow on AWS. Pros: quite rapid development and deployment. Cons: cloud costs are higher than in the built-from-scratch scenario but lower than in the cloud-native scenario; development costs are cheaper than built-from-scratch but higher than cloud-native
  • 23. NLP Systems: Mix of cloud and custom built Multiple deployment scenarios (managed Kubernetes vs your own). Requires support to build custom extensions (Kubeflow operators for your serving frameworks; not all native Kubeflow operators are good, and some require work to improve). Many high-level tools are available: Kubeflow, Cortex (from Cortex Labs), Hydrosphere (managing and monitoring models), Seldon (serving), Neptune (experiment management), MLflow (experiment, model, and data tracking, deployment, model registry), Comet (experiment tracking and comparison). And low-level tools: Istio, Kubernetes, Prometheus
  • 24. NLP Systems: Mix of cloud and custom built Kubeflow example (streaming and non-latency/qps-critical online): Kubeflow Pipelines (end-to-end orchestration); TF Serving, Seldon (serving); Jupyter, Katib, ModelDB, TFMA (TensorFlow Model Analysis), TF Transform (training, workbench); PyTorch, TensorFlow, MXNet
  • 25. NLP systems: development and adoption timeline NLP is relatively new for many businesses; there is a lot of excitement and a lot of uncertainty in expectations. To prove value, one has to iterate very fast: build NLP systems and models rapidly and integrate them with business systems and environments fast, with minimum development (human + software) costs, to show the value from the business and customer points of view. Build a cloud-native system fast to show value to the business, and as it scales (in the number of consumers, lines of business, data, and other loads), move to other architectures if needed to improve performance and costs. It is important to build a good, evolvable design from the beginning (true for any system: evolvability is as important as scalability)
  • 26. NLP systems: tradeoffs Tradeoffs between different methods: classical vs deep neural. 1. Inference: a 10% model-accuracy difference vs a 90% latency difference (gain in customer experience/conversion due to quality vs loss in customer experience due to latency and op costs). 2. Training: e.g., 1 billion documents with results needed in 4 hours: training time matters. ● The design must support running very different solutions inside ● The organizational structure must support making such decisions ● Analytics/ROI assessment must provide proper data as input for making such decisions. Many other tradeoffs
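The inference tradeoff above can be framed as simple expected-value arithmetic: conversion gain from accuracy minus conversion loss from added latency. Every coefficient below is an invented placeholder; the point is the decision structure, and in practice both coefficients come from your own A/B experiments.

```python
def expected_value(accuracy: float, latency_ms: float,
                   value_per_accuracy_point: float = 1000.0,
                   cost_per_ms: float = 15.0) -> float:
    """Toy objective: revenue-equivalent value of a model choice.
    Coefficients are illustrative, not measured."""
    return accuracy * 100 * value_per_accuracy_point - latency_ms * cost_per_ms

classical = expected_value(accuracy=0.80, latency_ms=5)   # e.g. CRF-style
deep = expected_value(accuracy=0.88, latency_ms=45)       # e.g. transformer
better = "deep" if deep > classical else "classical"
```

Note how sensitive the answer is to `cost_per_ms`: with a latency-tolerant batch workload that coefficient drops toward zero and the accurate model always wins, which is the same point the slide makes about different serving scenarios needing different solutions.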
  • 27. Scalability by design When building, it is important to design systems to be scalable in multiple dimensions; it is hard to overestimate future demands:
    1. The number of human languages and domain-area languages
    2. The load (qps for online systems, messages per second for streaming, the number of documents per batch and the number of batches for batch systems)
    3. The number of different models and of different types of models (extraction, classification, correction, text prediction, text generation, etc.)
    4. The number of developers and scientists working simultaneously, deploying new models, new types of data, new integrations, etc.
    5. The number of metrics by which the system and the models are monitored
    6. The amounts of data in training and serving
    7. The number of use cases and the number of deployments (data centers, regions, nodes)
    8. etc.
  • 28. NLP at Scale Important factor: typically there are very different serving scenarios, from, for example, online search (dozens/hundreds of models with multi-gigabyte ‘dictionaries’, some running in parallel, some sequentially, 50 ms latency, 10^4+ qps) to streaming (a billion documents per day) to batch processing. No one system will serve all inference cases; it is necessary to build multiple systems. But training, verification, and testing scenarios are more unifiable, and it is possible to build one scientist workbench / lab environment. It is beneficial to build one, to share data and models across the organization
  • 29. Ops Continuous retraining of models when needed. Support for frequent deployment of models as models are improved and new models are rolled out. Integration of the NLP scientist workbench with the production environment. Validation of models. Scalability: how the system scales as traffic, stream, or batch size increases, document size increases, the number of models run in parallel grows, and other load parameters change. Monitoring for performance, incidents, exceptions, and the quality of models and of end-to-end applications based on NLP
  • 30. Ops Monitoring: what may go wrong:
    1. Model performance: model-level and end-to-end (overall and by segment: users, categories, regions)
    2. Global data changes (changes in global distributions caused by events or seasonal shifts)
    3. Incoming data-quality issues
    4. System performance, uptime
    5. Bias, compliance, fairness
    6. Significant changes, outliers
    Monitoring, alerting, logging
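Point 2 above (global distribution changes) is commonly checked with the Population Stability Index. A minimal stdlib sketch, comparing a model's training-time class distribution against today's traffic; the 0.2 alert threshold is a common rule of thumb, not a universal constant, and the counts below are made up.

```python
import math

def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two count histograms.
    ~0 means the distributions match; larger means more drift."""
    e_total, a_total = sum(expected), sum(actual)
    score = 0.0
    for e, a in zip(expected, actual):
        e_pct = max(e / e_total, eps)   # eps guards empty buckets
        a_pct = max(a / a_total, eps)
        score += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return score

baseline = [500, 300, 200]   # training-time class counts (toy)
today = [520, 290, 190]      # similar distribution -> low PSI
shifted = [200, 300, 500]    # seasonal shift -> high PSI

drift_ok = psi(baseline, today) < 0.2       # no alert
drift_alert = psi(baseline, shifted) >= 0.2  # fire an alert
```

In a real Ops setup this check runs on a schedule over logged inputs and feeds the alerting pipeline from the last line of the slide.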
  • 31. NLP libraries (the separation is conditional; many of them belong in both categories) ‘Old’ good technologies: hidden Markov models, conditional random fields, SVMs for classification, PCFGs, Dirichlet processes; and software: Stanford CoreNLP, CRFsuite, CRF++, OpenNLP, MeTA, SEMPRE, MALLET (still useful in some scenarios; tradeoffs are on the next slides). ‘New’ technologies: spaCy, Gensim, Hugging Face Transformers (invaluable by now), fastText, AllenNLP (lab environment), PyTorch-NLP, Flair, DeText, many others. A lot of academic open-source code is adaptable to industrial environments (see Papers with Code, NLP section). High-level libraries help build end-to-end solutions for some domains: Rasa (dialog systems). Do not hesitate to get inside the open source: Stanford library performance was improved 10x by a proper multithreading implementation, and that makes a big difference when you need to process a stream of large documents at 40+ million tokens per hour
  • 32. Team To build, support, and use the system successfully, strong engineering, science, and product management are required. A modern NLP stack based on deep neural architectures (BERT and others). Deep understanding of cloud ML infrastructure if you are on the cloud (e.g., AWS ML infrastructure). Generic software engineering: building systems rather than just models. Engineering culture, Ops
  • 33. Data Training sets Many NLP models are re-usable across tasks. Your company operates in a certain domain, such as eCommerce, real estate, medicine, or transportation, with its own particular language. Models and knowledge that have learned the particularities of the domain language for one case may be re-usable for other cases in the same domain (via various techniques). Model discovery and sharing simplify the adoption of NLP across multiple lines of business. Sharing training and testing data accelerates model development and NLP adoption
  • 34. NLP Training sets and Metrics Training sets matter when they train your models for something important and beneficial, and metrics matter when they contribute to measuring the final impact. Ask which classification tasks will benefit your business (improving the conversion or purchase rate of search, better routing of phone calls or customer-support tickets) and which extraction tasks will benefit it (what knowledge graph you need for better search or recommendation, which entities matter for browsing by your business agents, etc.). Not ‘what is the perplexity of the language model’, but ‘how many symbols does the customer type before autocomplete completes the query’, or ‘what percentage of spelling errors is solved’: what will improve end-to-end quality and performance. The focus is on end-to-end performance rather than classical NLP-level metrics only; mature your systems by developing them to impact end-to-end quality and performance
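The autocomplete example above has a directly computable end-to-end metric: keystrokes saved. A minimal sketch over a toy session log; the log format and every query in it are invented for illustration, and a production version would compute this over real search logs, segmented by locale and surface.

```python
def keystrokes_saved(sessions):
    """sessions: list of (full_query, chars_typed_before_accepting).
    Returns the fraction of typing the customer avoided."""
    typed = sum(chars for _, chars in sessions)
    total = sum(len(query) for query, _ in sessions)
    return 1 - typed / total

# Toy log: (final query, keystrokes the user actually typed).
logs = [("3 bedroom house in seattle", 8),
        ("condo with parking", 18),      # suggestion never accepted
        ("homes for sale near me", 6)]
savings = keystrokes_saved(logs)         # fraction of keystrokes avoided
```

This is the kind of number that moves a business review, where ‘perplexity dropped by 0.3’ does not, even though the perplexity improvement may be what caused it.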
  • 35. Continuous improvement circle Almost no real-world NLP task has a final ‘perfect’ solution <- this needs a lot of leadership support to promote this vision and align with the business: incremental gains in system performance and model accuracy mean gains for the business (but you need to build the systems, measurement, and attribution framework to execute well on it). Each NLP model can be improved in accuracy, perplexity, etc., but what really matters is the impact on the end-to-end system: conversion, revenue per session, document processing time, etc. Each NLP system can be improved in performance, scalability, cloud costs, and so on. Improvement of NLP models and NLP systems has high ROI if done correctly, but doing it correctly requires a lot of work: end-to-end analysis of systems rather than just model evaluation, attribution analysis, etc. In a big business, building such an environment and an organization to improve NLP pays back
  • 36. ROI assessment. Expenses Expenses: salaries + software costs + compute/storage costs + data-annotation costs. A small team of good experts can create an NLP system, integrate it with the business within your company, and prove its value; you do not need more than 5 people to solve serious tasks. Software costs: most business cases can be solved using open-source software (Hugging Face Transformers, TorchServe or TF Serving, etc.); the whole infrastructure for training and serving (and other tasks, such as annotation) can be built using open source. Compute/storage costs: it depends. AWS Comprehend etc. are more expensive and less flexible, but enable fast prototyping; GPU machines are needed in many cases. Data annotation: with transfer learning you do not need huge datasets
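The expense side above is simple enough to put into a back-of-envelope formula. Every number below is an invented placeholder to be replaced with your organization's actual figures; the structure (people dominate, software is near zero with open source) is the point.

```python
def annual_nlp_cost(engineers=5, salary=200_000,
                    cloud_monthly=15_000, annotation=50_000,
                    software=0):
    """Toy yearly cost model for an NLP effort. Defaults are
    placeholders: 5-person team, open-source software (zero license
    cost), modest cloud and annotation budgets."""
    return (engineers * salary          # salaries dominate
            + 12 * cloud_monthly        # compute/storage
            + annotation                # data labeling
            + software)                 # ~0 with open source

cost = annual_nlp_cost()
```

Plugging in managed-service pricing instead of the open-source `software=0` line is exactly the cloud-native vs built-from-scratch tradeoff from slides 18-22, now in dollars.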
  • 37. ROI assessment. Returns For some tasks, such as search/recommendation functions directly facing the consumer, the return is easily computed by running online controlled experiments. For some tasks, such as business-facing functions (e.g., document classification for faster processing, question answering for agents), the return is harder to compute, since one needs to run the new business operation for a period of time to measure impact. For some tasks, such as replacing humans in information extraction or question answering for consumers/customer support, the return is computed from the number of human roles replaced. Key: build solutions rapidly, to experiment and find the maximum returns
  • 38. Conclusion NLP systems bring significant gains to business and customer experience. Building them is a relatively easy task: there are multiple open-source libraries, multiple cloud solutions, and multiple alternatives for how to build an NLP system for your company. The task of building and using NLP typically has high ROI if approached correctly