4. Some Buzz-Words
NLP – Natural Language Processing
CL – Computational Linguistics
SP – Speech Processing
HLT – Human Language Technology
NLE – Natural Language Engineering
SNLP – Statistical Natural Language Processing
Other Areas:
Speech Generation, Text Generation, Speech Understanding,
Information Retrieval, Dialogue Processing, Inference,
Spelling Correction, Grammar Correction,
Text Summarization, Text Categorization
5. Brief History of NLP
1940s – 1950s: Foundations
Development of formal language theory
(Chomsky, Backus, Naur, Kleene)
Probabilities and information theory (Shannon)
1957 – 1970s:
Use of formal grammars as basis for natural language processing
(Chomsky, Kaplan)
Use of logic and logic-based programming
(Minsky, Winograd, Colmerauer, Kay)
1970s – 1983:
Probabilistic methods for early speech recognition (Jelinek, Mercer)
Discourse modeling (Grosz, Sidner, Hobbs)
1983 – 1993:
Finite state models (morphology) (Kaplan, Kay)
1993 – present:
Strong integration of different techniques, different areas.
6. Natural Language Processing
Natural Language Processing, also known as NLP or
computational linguistics, is a subfield of Artificial
Intelligence (AI), Machine Learning (ML), and linguistics.
A branch of AI, it helps computers understand,
manipulate, and interpret human language.
For several decades now, humans have been communicating
with machines through code and programming languages,
which, in binary form, consist of millions of zeros and ones.
According to Gartner, by 2025 nearly 60% of analytical queries
will be generated through speech, Natural Language Processing
(NLP), or voice, or will be generated automatically.
7. Natural Language Processing
To define it simply, Natural Language is the
natural way in which humans communicate with
each other. Today, we have made computers
understand this natural language.
For example, with voice commands such as
“Alexa, what’s the news today” or “Ok Google,
play me my favorite track,” communicating with
machines has become easier.
Similarly, Siri, Apple’s personal voice
assistant, can be asked, “What is the cheapest flight
to Saudi Arabia tomorrow?”
8. Why Should You Care?
Three trends
An enormous amount of knowledge is now
available in machine readable form as
natural language text
Conversational agents are becoming an
important form of human-computer
communication
Much of human-human communication is
now mediated by computers
9. Forms of Natural Language
The input/output of an NLP system can be:
written text
speech
We will mostly be concerned with written text (not
speech).
To process written text, we need:
lexical, syntactic, semantic knowledge
about the language
discourse information, real world
knowledge
To process spoken language, we need
everything required to process written text, plus
the challenges of speech recognition and
speech synthesis.
12. Why Is NL Understanding Hard?
Natural language is extremely rich in form and
structure, and very ambiguous.
How do we represent meaning?
Which structures map to which meaning
structures?
One input can mean many different things. Ambiguity
can be at different levels.
Lexical (word level) ambiguity -- different
meanings of words
Syntactic ambiguity -- different ways to parse
the sentence
Interpreting partial information -- how to
interpret pronouns
Contextual information -- context of the
sentence may affect the meaning of that sentence.
Many inputs can mean the same thing.
Interaction among components of the input is not
clear.
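Syntactic ambiguity can be made concrete by counting how many parse trees a grammar assigns to one sentence. The sketch below is only an illustration: the grammar, the lexicon, and the function name count_parses are hypothetical and not part of these slides. It uses a CKY-style chart over a toy grammar in Chomsky normal form.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Each rule is (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the PP modifies the verb phrase
    ("NP", "NP", "PP"),   # the PP modifies the noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]

# Toy lexicon mapping each word to its possible categories (also hypothetical).
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(tokens, start="S"):
    """Count the distinct parse trees rooted in `start` that cover all tokens."""
    n = len(tokens)
    # chart[i][j][X] = number of trees with root X spanning tokens[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]

    # Width-1 spans come straight from the lexicon.
    for i, token in enumerate(tokens):
        for category in LEXICON.get(token, []):
            chart[i][i + 1][category] += 1

    # Wider spans combine two adjacent, already-filled narrower spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[i][j][parent] += chart[i][k][left] * chart[k][j][right]

    return chart[0][n][start]


if __name__ == "__main__":
    sentence = "I saw the man with the telescope".split()
    print(count_parses(sentence))
```

Under this toy grammar the sentence receives two parses: one where "with the telescope" attaches to the verb phrase (the seeing was done with a telescope) and one where it attaches to the noun phrase (the man has the telescope).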
13. Knowledge of Language
Phonology:
concerns how words are related to the sounds
that realize them.
Morphology:
concerns how words are constructed from more
basic meaning units called morphemes.
A morpheme is the primitive unit of meaning in a
language.
Syntax:
concerns how words can be put together to form correct
sentences and determines what structural role
each word plays in the sentence and what
phrases are subparts of other phrases.
Semantics:
concerns what words mean and how these meanings
combine in sentences to form sentence meanings.
14. Knowledge of Language
Pragmatics:
concerns how sentences are used in
different situations and how use affects the
interpretation of the sentence.
Discourse:
concerns how the immediately preceding
sentences affect the interpretation of the
next sentence. For example, interpreting
pronouns and interpreting the temporal
aspects of the information.
World Knowledge:
includes general knowledge about the
world, including what each language user must
know about the other’s beliefs and goals.
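As a small illustration of the morphology level described above, the sketch below splits a word into a stem plus prefixes and suffixes. Everything in it is an assumption made for the example: the affix lists, the stem set, and the function name segment are hypothetical, and the approach ignores spelling changes (such as happy/happiness) that a real morphological analyzer would have to handle.

```python
# Hypothetical affix and stem inventories; a real analyzer would use a full lexicon.
PREFIXES = ["un", "re"]
SUFFIXES = ["ness", "ly", "ing", "ed", "s"]
STEMS = {"friend", "walk", "play", "kind"}


def segment(word):
    """Return a list of morphemes for `word`, or None if no analysis is found."""
    if word in STEMS:
        return [word]

    # Try to peel one prefix off the front and analyze the remainder.
    for prefix in PREFIXES:
        if word.startswith(prefix):
            rest = segment(word[len(prefix):])
            if rest is not None:
                return [prefix + "-"] + rest

    # Otherwise try to peel one suffix off the end and analyze the remainder.
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            rest = segment(word[:-len(suffix)])
            if rest is not None:
                return rest + ["-" + suffix]

    return None


if __name__ == "__main__":
    for word in ["unkindness", "replayed", "walks", "friendly"]:
        print(word, segment(word))
```

Each recursive call works on a strictly shorter string, so the search always terminates; words outside the toy inventories simply get no analysis.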
15. NLP - an Inter-Disciplinary Field
NLP borrows techniques and insights from several
disciplines, namely:
Linguistics: How do words form phrases and sentences?
What constrains the possible meanings of a sentence?
Computational Linguistics: How is the structure of
sentences identified? How can knowledge and
reasoning be modeled?
Computer Science: Algorithms for automata and parsers.
Engineering: Stochastic techniques for ambiguity
resolution.
Psychology: What linguistic constructions are easy or
difficult for people to learn to use?
Philosophy: What is meaning, and how do words and
sentences acquire it?
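One very simple instance of a stochastic technique for ambiguity resolution is to pick, for each ambiguous word, the part-of-speech tag it carried most often in some tagged data. The sketch below assumes a tiny hand-made corpus; the data, the tag names, and the function names train, most_likely_tag, and tag_probability are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Tiny hand-made tagged corpus (hypothetical data, for illustration only).
TAGGED_CORPUS = [
    ("the", "DET"), ("book", "NOUN"), ("is", "VERB"), ("on", "ADP"),
    ("the", "DET"), ("table", "NOUN"),
    ("please", "INTJ"), ("book", "VERB"), ("a", "DET"), ("flight", "NOUN"),
    ("she", "PRON"), ("read", "VERB"), ("the", "DET"), ("book", "NOUN"),
]


def train(corpus):
    """Count how often each word appears with each tag."""
    counts = defaultdict(Counter)
    for word, tag in corpus:
        counts[word][tag] += 1
    return counts


def most_likely_tag(counts, word, default="NOUN"):
    """Resolve the ambiguity by choosing the most frequent tag for the word."""
    if word not in counts:
        return default  # fallback for unseen words (an arbitrary assumption)
    return counts[word].most_common(1)[0][0]


def tag_probability(counts, word, tag):
    """Relative-frequency estimate of P(tag | word)."""
    word_counts = counts.get(word, Counter())
    total = sum(word_counts.values())
    return word_counts[tag] / total if total else 0.0


if __name__ == "__main__":
    model = train(TAGGED_CORPUS)
    print(most_likely_tag(model, "book"))
    print(tag_probability(model, "book", "VERB"))
```

This baseline ignores context entirely, so it would tag "book" the same way in "book a flight" and "read the book"; practical systems condition on neighboring words or tags as well.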
17. Summary
The field of Natural Language Processing (NLP) has significantly transformed the way
humans interact with machines, enabling more intuitive and efficient communication.
NLP encompasses a wide range of techniques and methodologies to understand, interpret, and
generate human language.
From basic tasks like tokenization and part-of-speech tagging to advanced applications like
sentiment analysis and machine translation, the impact of NLP is evident across various
domains. As the technology continues to evolve, driven by advancements in machine learning
and artificial intelligence, the potential for NLP to enhance human-computer interaction and
solve complex language-related challenges remains immense.
Understanding the core concepts and applications of Natural Language Processing is crucial
for anyone looking to leverage its capabilities in the modern digital landscape.