Deep Learning and Modern Natural Language Processing
Zachary Brown, Lead Data Scientist, S&P Global
Outline
• Neural Methods for Natural Language Processing
• Shapes of Natural Language Processing Tasks
• Perceptron Text Classification
• Local vs. Global Text Representations
• Contextual Representations and Sequence Modeling
Neural Methods for Natural Language Processing
• Natural Language Processing (NLP) has moved largely to neural methods in recent years

Traditional NLP
• Builds on years of research into language representation
• Theoretical foundations can lead to model rigidity
• Tasks often rely on manually generated and curated dictionaries and thesauruses
• Built upon local word representations

Neural NLP
• Few to no assumptions need to be made
• Active area of research, with most of the tooling open source
• Able to learn global and contextualized word representations
• Purpose-built model architectures
Shapes of Natural Language Processing Tasks
• A general task in natural language processing often takes the same form: input text is vectorized and passed through a model to produce a target output
• For binary classification (e.g. relevance), our target is a single number, often interpreted as a probability
• For multi-class classification (e.g. type of text), our target is a set of probabilities, one for each of the output classes
• For sequential classification (e.g. language modeling, NER, POS tagging), the target is a probability for each class for each element in the input
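To make these output shapes concrete, here is a minimal sketch in PyTorch (the deck itself names no framework, and the batch size, sequence length, and class counts below are made up for illustration):

```python
import torch

batch_size, seq_len, n_classes, n_tags = 8, 12, 5, 9

# Binary classification: one probability per example
binary_out = torch.sigmoid(torch.randn(batch_size, 1))                          # shape (8, 1)

# Multi-class classification: one probability per class, per example
multiclass_out = torch.softmax(torch.randn(batch_size, n_classes), dim=-1)      # shape (8, 5)

# Sequential classification (LM / NER / POS): one distribution per input element
sequence_out = torch.softmax(torch.randn(batch_size, seq_len, n_tags), dim=-1)  # shape (8, 12, 9)

print(binary_out.shape, multiclass_out.shape, sequence_out.shape)
```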
• In a traditional machine learning pipeline, the vectorization (feature engineering) step is often very time consuming, commonly 80-90% of the effort
• A relatively small proportion of the time, roughly 10-20%, is spent on the actual modeling
• Neural networks allow us to develop purpose-built architectures that learn the appropriate vectorization for a task, so the effort goes into building and training the model end to end
Perceptron Text Classification
• To introduce the shape of information as it flows through a neural network, we'll first look at a network that only handles classification
• For the vectorization, we'll assume that we've converted our text into a vector using a count-based method like tf-idf
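As a concrete illustration, a count-based vectorization like this can be produced with scikit-learn's TfidfVectorizer; the toy corpus below is invented for the example and is not from the talk:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "the dog chased the ball",
    "I read a good book on vacation",
    "horror movies are not comedies",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)          # sparse matrix, shape (n_docs, vocab_size)

print(X.shape)
print(vectorizer.get_feature_names_out()[:5])  # first few vocabulary terms
```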
• A perceptron is one of the simplest neural network architectures, and is a good fit for this task
• [Diagram: input → hidden (linear) → activation → output]
• The hidden layer represents the weights that will be optimized by the deep learning framework
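A minimal sketch of such a perceptron in PyTorch, assuming PyTorch as the deep learning framework and an illustrative vocabulary size: one linear "hidden" layer holds the weights to be optimized, followed by a sigmoid activation that produces the output probability.

```python
import torch
import torch.nn as nn

class PerceptronClassifier(nn.Module):
    def __init__(self, vocab_size: int):
        super().__init__()
        self.hidden = nn.Linear(vocab_size, 1)    # the weights that get optimized

    def forward(self, x):
        return torch.sigmoid(self.hidden(x))       # activation -> output probability

model = PerceptronClassifier(vocab_size=10_000)
tfidf_batch = torch.rand(4, 10_000)                # stand-in for a batch of tf-idf vectors
probs = model(tfidf_batch)                         # shape (4, 1)
```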
• If we want to change our task to multiclass classification, we can simply change the size of our hidden layer (plus minor modifications, such as the output activation and the loss)
• The result of this is that we now have a matrix of weights to optimize
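The multiclass variant only changes the size of that linear layer, so its weights become a matrix, and swaps the sigmoid for a softmax; again a sketch with made-up dimensions.

```python
import torch
import torch.nn as nn

n_classes = 5
multiclass_head = nn.Linear(10_000, n_classes)    # weight matrix of shape (5, 10_000)

logits = multiclass_head(torch.rand(4, 10_000))
probs = torch.softmax(logits, dim=-1)             # one probability per output class
# In training you would typically pass the raw logits to nn.CrossEntropyLoss.
```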
Local vs. Global Text Representations
• Let's look back to the problem of creating a vector representation for our text, like the tf-idf vector above
• Further, let's only consider the task of how we'd represent single words or tokens, such as "dog", as vectors
• Traditional approaches to word representations treat each word as a unique entity
• Modern approaches move to dense vectors of a fixed dimensionality
• There are a variety of frameworks available that allow for computing these vectors in an unsupervised way
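word2vec, as implemented in the gensim library, is one widely used option; a minimal sketch, assuming gensim 4.x is installed and using a toy corpus in place of real training data.

```python
from gensim.models import Word2Vec

# Toy corpus: in practice this would be a large collection of tokenized sentences
sentences = [
    ["the", "dog", "chased", "the", "ball"],
    ["the", "cat", "chased", "the", "dog"],
    ["i", "read", "a", "good", "book"],
]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=20)

vec = model.wv["dog"]                        # dense 50-dimensional vector for "dog"
print(model.wv.most_similar("dog", topn=3))  # nearest neighbours in the learned space
```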
Contextual Representations and Sequence Modeling
• Global word representations are a fantastic starting point for many problems in NLP, but consider the following sentences:
  "I'm going to book our vacation then relax and read a good book"
  "I don't really hate horror movies, but I hate comedies"
• Context matters: the same word can play a very different role depending on the words around it ("book" as a verb and then a noun, and a "hate" that is softened by "don't really")
• For modeling tasks where word ordering and context matter, sequential models are often used
• Recurrent neural networks (RNNs) are a type of neural network architecture that naturally handles modeling sequential data
• This type of network generates a new output vector for each input in a sequence, and also feeds that same information forward to the next step
• By feeding the information forward, each subsequent output vector has contextual information encoded from the preceding words
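A minimal PyTorch sketch of this behaviour using nn.RNN over a batch of already-embedded tokens (all dimensions are illustrative): the module returns one output vector per input position, plus the state it carries forward after the last token.

```python
import torch
import torch.nn as nn

batch_size, seq_len, emb_dim, hidden_dim = 2, 7, 32, 64

rnn = nn.RNN(input_size=emb_dim, hidden_size=hidden_dim, batch_first=True)

embedded = torch.randn(batch_size, seq_len, emb_dim)  # stand-in for embedded tokens
outputs, h_n = rnn(embedded)

print(outputs.shape)  # (2, 7, 64): one contextual vector per token
print(h_n.shape)      # (1, 2, 64): the state fed forward, after the last token
```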
• This type of architecture can be used to build language models, where the task is to predict the next word in the sequence
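In a language model the per-token output vector is mapped to a distribution over the vocabulary and scored against the following word; a sketch of that head, reusing the illustrative dimensions above and assuming the usual shift-by-one target setup.

```python
import torch
import torch.nn as nn

vocab_size, hidden_dim = 10_000, 64
lm_head = nn.Linear(hidden_dim, vocab_size)

# `outputs` stands in for the per-token vectors from the recurrent layer above
outputs = torch.randn(2, 7, hidden_dim)
logits = lm_head(outputs)                          # (2, 7, 10_000)

# Each position predicts the next token, so targets are the token ids shifted left
targets = torch.randint(0, vocab_size, (2, 7))
loss = nn.CrossEntropyLoss()(logits[:, :-1].reshape(-1, vocab_size),
                             targets[:, 1:].reshape(-1))
```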
• It can also be used for problems like named entity recognition, where each word in the sequence receives a label (e.g. a tag sequence like animal, o, o, o, animal)
• By taking the final vector in the sequence, you can perform tasks like sentiment classification (e.g. predicting "positive" or "negative" for the whole input)
• For all of these different types of tasks, a network similar to the perceptron can be placed at the end to carry out the final classification of each word, or the classification of the whole sequence; see the sketch below
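Putting the pieces together, a sketch (vocabulary size, dimensions, and label counts are all made up) of one recurrent encoder with a perceptron-like linear head applied in both modes: to every token's output vector for tagging tasks like NER, and to the final vector alone for whole-sequence tasks like sentiment.

```python
import torch
import torch.nn as nn

class RecurrentClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, emb_dim=32, hidden_dim=64,
                 n_tags=5, n_sentiments=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.RNN(emb_dim, hidden_dim, batch_first=True)
        self.token_head = nn.Linear(hidden_dim, n_tags)           # e.g. NER tags
        self.sequence_head = nn.Linear(hidden_dim, n_sentiments)  # e.g. sentiment

    def forward(self, token_ids):
        outputs, _ = self.rnn(self.embed(token_ids))
        token_logits = self.token_head(outputs)                # one prediction per word
        sequence_logits = self.sequence_head(outputs[:, -1])   # final vector only
        return token_logits, sequence_logits

model = RecurrentClassifier()
token_ids = torch.randint(0, 10_000, (2, 7))
token_logits, sequence_logits = model(token_ids)
print(token_logits.shape, sequence_logits.shape)  # (2, 7, 5) and (2, 2)
```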
• In a similar manner, these individual elements can be combined in a variety of ways to tackle very complex tasks
Thank you.
