BY
VEENA .S.KUMAR
Natural Language Processing
(NLP)
Contents
• What Is NLP?
• Why NLP?
• Basic Terms In NLP
• Approaches To NLP
• NLTK
• Setting Up NLP Environment
• Components Of NLP
• Levels In NLP
• Stages In NLP
• Some Applications Of NLP
What Is NLP?
NLP lies at the intersection of Artificial Intelligence and Computational Linguistics.
• It is the automatic manipulation of speech or text
• Goal: to accomplish human-like language processing
• The field of NLP involves making computers perform useful tasks with the
natural languages humans use. The input and output of an NLP system can be:
  o Speech
  o Written text
Why NLP?
• Computers lack knowledge
• Large volumes of textual data: there are at least 30 trillion pages, and 70-80% of
the data is unstructured, i.e. raw text
• Structuring a highly unstructured data source
• Text data: websites, tweets, blogs, etc.
• Audio data: speech
• Applications for processing large amounts of data require NLP expertise
Basic Terms In NLP
Tokenization
It is the task of chopping a string of characters into pieces, called tokens, perhaps at
the same time throwing away certain characters, such as punctuation.
Input: Friends, Romans, Countrymen, lend me your ears;
Output: Friends Romans Countrymen lend me your ears
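The tokenization example above can be reproduced with a minimal sketch. It assumes a token is any run of letters, digits or apostrophes; the helper name tokenize is illustrative and is not part of NLTK, whose own tokenizers handle many more cases.

```python
import re

def tokenize(text):
    """Split text into word tokens, discarding punctuation and whitespace."""
    # Assumption: a token is a run of letters, digits or apostrophes.
    return re.findall(r"[A-Za-z0-9']+", text)

tokens = tokenize("Friends, Romans, Countrymen, lend me your ears;")
print(tokens)
```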
Stemming
Stemming is the process of eliminating affixes (suffixes, prefixes, infixes, circumfixes)
from a word in order to obtain a word stem.
running → run
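As a rough illustration of suffix stripping, the sketch below removes a few common English suffixes and undoubles a trailing consonant. The suffix list and the length threshold are assumptions chosen for this example; real stemmers such as the Porter stemmer use a much larger rule set.

```python
# Hypothetical, deliberately tiny suffix list; checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")

def stem(word):
    """Strip one known suffix, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # Undouble a final doubled consonant: "runn" -> "run".
            if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiouls":
                word = word[:-1]
            break
    return word

for w in ("running", "jumps", "walked"):
    print(w, "->", stem(w))
```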
Lemmatization
Lemmatization is related to stemming, differing in that lemmatization is able to capture
canonical forms based on a word's lemma.
Better → good
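Unlike suffix stripping, lemmatization needs lexical knowledge, which can be sketched as a dictionary lookup. The table below is a hypothetical stand-in for a real lexical resource such as WordNet, and words missing from it are returned unchanged.

```python
# Hypothetical lemma table; a real lemmatizer consults a lexical database.
LEMMAS = {"better": "good", "best": "good", "ran": "run", "mice": "mouse"}

def lemmatize(word):
    """Return the dictionary form of a word, or the lowercased word if unknown."""
    word = word.lower()
    return LEMMAS.get(word, word)

print(lemmatize("Better"))
```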
Corpus
A corpus is a collection of texts. Corpora may also consist of themed texts
(historical, Biblical, etc.). Corpora are generally used for statistical linguistic
analysis and hypothesis testing.
Stop Words
Stop words are those words which are filtered out before further processing of text.
Example: The quick brown fox jumps over the lazy dog. (Common words such as "the" and "over" are typical stop words.)
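Stop-word filtering can be sketched as a set lookup. The stop list here is a small hypothetical sample made up for this example; practical lists, such as the one shipped with NLTK, are much longer.

```python
# Hypothetical stop list for illustration only.
STOP_WORDS = {"the", "a", "an", "over", "of", "and", "to", "in"}

def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop list (case-insensitive)."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

tokens = "The quick brown fox jumps over the lazy dog".split()
print(remove_stop_words(tokens))
```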
Parts-of-speech (POS) Tagging
POS tagging consists of assigning a category tag to the tokenized parts of a
sentence. The most common form of POS tagging identifies words as nouns,
verbs, adjectives, etc.
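A minimal sketch of POS tagging, assuming a hand-written lexicon and a default tag for unknown words. The lexicon, the tag names and the function tag_tokens are all illustrative; trained taggers choose tags from context rather than from a fixed table.

```python
# Hypothetical word-to-tag lexicon covering only the example sentence.
LEXICON = {
    "the": "DET", "quick": "ADJ", "brown": "ADJ", "lazy": "ADJ",
    "fox": "NOUN", "dog": "NOUN", "jumps": "VERB", "over": "ADP",
}

def tag_tokens(tokens, default="NOUN"):
    """Pair each token with its lexicon tag, falling back to a default tag."""
    return [(t, LEXICON.get(t.lower(), default)) for t in tokens]

print(tag_tokens("The quick brown fox jumps over the lazy dog".split()))
```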
Approaches To NLP
Symbolic
• Explicit depiction of facts about language through well-understood schemes
and algorithms
• Deep Analysis of linguistic phenomena
Statistical
• Uses mathematical techniques and large text corpora without
incorporating world knowledge
• Output produced by each state has a definitive probability
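One common instance of the statistical approach is a bigram model, which estimates the probability of a word given the previous word purely from corpus counts. The sketch below is an assumption about what such a model could look like; the toy corpus is invented for illustration.

```python
from collections import Counter

def bigram_probabilities(tokens):
    """Maximum-likelihood estimate of P(next word | current word)."""
    # Count each word as a bigram start (the last token starts no bigram).
    starts = Counter(tokens[:-1])
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {pair: count / starts[pair[0]] for pair, count in bigrams.items()}

tokens = "the dog saw the cat and the dog ran".split()  # toy corpus
probs = bigram_probabilities(tokens)
print(probs[("the", "dog")])  # "the" is followed by "dog" in 2 of 3 cases
```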
Connectionist
• Combines statistical learning with various representation theories
• Allows transformation, inference and logic formulae manipulation
• Less Constrained Architecture
NLTK
• Natural Language Toolkit (NLTK) was originally created in 2001 as part of a
computational linguistics course in the Department of Computer and Information
Science at the University of Pennsylvania.
• The Natural Language Toolkit (NLTK) defines a basic infrastructure that can be used to build
NLP programs in Python. It provides:
o Basic classes for representing data relevant to natural language processing.
o Standard interfaces for performing tasks, such as tokenization, tagging, and parsing.
o Standard implementations for each task, which can be combined to solve complex problems.
NLTK was designed with four primary goals in mind:
• Simplicity
• Consistency
• Modularity
• Extensibility
Setting Up NLP Environment
Open Anaconda Prompt
Install pip if it is missing: run in terminal python -m ensurepip --upgrade (easy_install is deprecated)
Install NLTK: run in terminal pip install -U nltk
Open Spyder
Run in the console:
1) import nltk
2) nltk.download()
Press Enter
After pressing Enter, the NLTK downloader dialogue box appears on the screen
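To confirm that the installation worked, a small check like the following can be run in the console. It is only a sketch: the helper name nltk_version is made up for this example, and it reports the version rather than downloading any data.

```python
import importlib.util

def nltk_version():
    """Return the installed NLTK version string, or None if NLTK is missing."""
    if importlib.util.find_spec("nltk") is None:
        return None
    import nltk
    return nltk.__version__

version = nltk_version()
print("NLTK is not installed" if version is None else f"NLTK {version} is installed")
```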
Components Of NLP
NLP has two components:
Natural Language Understanding (NLU)
• Understanding involves the following tasks:
  o Mapping the given natural-language input into useful representations.
  o Analyzing different aspects of the language.
Natural Language Generation (NLG)
• It is the process of producing meaningful phrases and sentences in natural
language from some internal representation. It involves:
  o Text planning: retrieving the relevant content from the knowledge base.
  o Sentence planning: choosing the required words, forming meaningful
phrases, and setting the tone of the sentence.
  o Text realization: mapping the sentence plan into sentence structure.
NLU is harder than NLG.
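The three NLG steps (text planning, sentence planning, text realization) can be sketched as a small template-based pipeline. The knowledge base, the function names and the weather domain are all hypothetical and chosen only to show how the stages hand data to each other.

```python
# Hypothetical knowledge base for illustration.
KNOWLEDGE_BASE = {"city": "Paris", "condition": "sunny", "temperature_c": 21, "humidity": 40}

def plan_text(kb):
    """Text planning: retrieve only the relevant content from the knowledge base."""
    return {key: kb[key] for key in ("city", "condition", "temperature_c")}

def plan_sentence(content):
    """Sentence planning: choose words and form the phrases."""
    return {
        "subject": f"the weather in {content['city']}",
        "verb": "is",
        "complement": f"{content['condition']} with a temperature of "
                      f"{content['temperature_c']} degrees Celsius",
    }

def realize(plan):
    """Text realization: map the sentence plan onto a surface sentence."""
    sentence = f"{plan['subject']} {plan['verb']} {plan['complement']}."
    return sentence[0].upper() + sentence[1:]

print(realize(plan_sentence(plan_text(KNOWLEDGE_BASE))))
```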
Levels In NLP
• Phonology
• Morphology
• Lexical
• Syntactic
• Semantic
• Discourse
• Pragmatic
Stages In NLP
Input → Parsing → Translating → Generating
Parsing
• Phonology
• Morphological
• Lexical
Translating
• Syntactic
• Semantic
Generating
• Discourse
• Pragmatic
Some Applications Of NLP
Machine Translation
• Machine Translation (MT) is the task of automatically converting one natural
language into another, preserving the meaning of the input text, and producing
fluent text in the output language.
• The human translation process may be described as:
  o Decoding the meaning of the source text
  o Re-encoding this meaning in the target language.
• The challenge: how to program a computer that will "understand" a text as a person
does, and that will "create" a new text in the target language that
sounds as if it has been written by a person?
• In practice, MT provides a general, though imperfect, approximation of the
original text, getting the "gist" of it (a process called "gisting").
This is sufficient for many purposes, including making best use of
the finite and expensive time of a human translator, which is reserved for those cases in
which total accuracy is indispensable.
Information Retrieval
• The process of accessing and retrieving the most appropriate information from text
based on a particular query using context-based indexing or metadata.
• Put simply, information retrieval addresses the problem of finding those documents
whose content matches a user's request from among a large collection of documents.
User i/p Indian
PM
Doc1Indian PM
Doc2Pakistan
PM
Doc3American
President
Brings document
relating to Indian
PM
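The retrieval example above can be sketched by ranking documents on how many query terms they share. Term overlap is an assumption made for simplicity; real retrieval systems use indexing and weighting schemes, and the function names here are illustrative.

```python
def score(query, document):
    """Number of distinct query terms that also occur in the document."""
    return len(set(query.lower().split()) & set(document.lower().split()))

def retrieve(query, documents):
    """Return names of matching documents, best match first."""
    ranked = sorted(documents.items(), key=lambda item: score(query, item[1]), reverse=True)
    return [name for name, text in ranked if score(query, text) > 0]

documents = {"Doc1": "Indian PM", "Doc2": "Pakistan PM", "Doc3": "American President"}
print(retrieve("Indian PM", documents))
```

Doc1 matches both query terms and ranks first; Doc2 shares only "PM", and Doc3 shares nothing and is dropped.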
Sentiment Analysis
o The process of evaluating and determining the sentiment captured in a selection of
text.
o Sentiment is defined as feeling or emotion.
o This sentiment can be simply:
• positive (happy)
• negative (sad or angry)
• neutral
• or a precise measurement along a scale, with neutral in the middle, and positive and
negative increasing in either direction.
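Both views of sentiment, a simple label and a score along a scale, can be sketched with a word-list approach. The two word lists are hypothetical and far too small for real use; the score is positive hits minus negative hits, divided by the number of words.

```python
import re

# Hypothetical sentiment lexicons for illustration only.
POSITIVE = {"good", "great", "happy", "love", "excellent"}
NEGATIVE = {"bad", "sad", "angry", "hate", "terrible"}

def sentiment_score(text):
    """Score in [-1, 1]: 0 is neutral, sign gives polarity."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive_hits = sum(w in POSITIVE for w in words)
    negative_hits = sum(w in NEGATIVE for w in words)
    return (positive_hits - negative_hits) / len(words)

def sentiment_label(text):
    """Collapse the score into positive, negative or neutral."""
    score = sentiment_score(text)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment_label("I love this great phone"), sentiment_score("I love this great phone"))
```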
Information Extraction
• Information extraction (IE) is the task of automatically extracting structured
information from unstructured and/or semi-structured machine-readable documents.
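As a small illustration of turning free text into structured fields, the sketch below pulls a year and a university name out of a sentence with regular expressions. The patterns and the field names are assumptions for this example; practical IE systems rely on trained named-entity recognizers rather than hand-written patterns.

```python
import re

def extract_facts(text):
    """Extract a four-digit year and a 'University of X' name, if present."""
    year = re.search(r"\b(?:19|20)\d{2}\b", text)
    organization = re.search(r"University of [A-Z][a-z]+", text)
    return {
        "year": int(year.group()) if year else None,
        "organization": organization.group() if organization else None,
    }

text = "NLTK was originally created in 2001 at the University of Pennsylvania."
print(extract_facts(text))
```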
Question Answering
• ELIZA, the first chatbot, was developed by Joseph Weizenbaum
http://psych.fullerton.edu/mbirnbaum/psych101/Eliza.htm
• Question-answering systems are intelligent systems that provide responses to
the questions asked by the user, based on facts or rules stored in a
knowledge base.
• The accuracy of a question-answering system therefore depends
on the rules or facts stored in the knowledge base.
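A knowledge-base-driven question answerer can be sketched with a few question patterns mapped onto stored facts. The facts come from this deck; the patterns, relation names and the answer function are hypothetical. The sketch also shows the dependence noted above: a question outside the stored facts cannot be answered.

```python
import re

# Facts taken from the slides; relation names are made up for this sketch.
FACTS = {("developer", "eliza"): "Joseph Weizenbaum", ("created", "nltk"): "2001"}

# Hypothetical question patterns, each mapped to a relation in FACTS.
PATTERNS = [
    (re.compile(r"who developed (\w+)\??", re.IGNORECASE), "developer"),
    (re.compile(r"when was (\w+) created\??", re.IGNORECASE), "created"),
]

def answer(question):
    """Match the question against known patterns and look up the fact."""
    for pattern, relation in PATTERNS:
        match = pattern.fullmatch(question.strip())
        if match:
            return FACTS.get((relation, match.group(1).lower()), "I don't know.")
    return "I don't know."

print(answer("Who developed ELIZA?"))
```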
To Conclude with
• While NLP is a relatively recent area of research and application, as compared to other
information technology approaches, there have been sufficient successes to date that
suggest that NLP-based information access technologies will continue to be a major area
of research and development in information systems now and far into the future.
ANY QUESTIONS???
THANK YOU!!
