Argumentation Framework

Analyzing Arguments during a Debate using Natural Language Processing in Python - I
ABHINAV GUPTA

How a debate may proceed
“This new movie ‘Superman vs. Batman’ is so cool! The winner has got to be Superman, with his mighty Kryptonian abilities and people’s support. What do you think?!”
“I agree! Superman is definitely more capable than Batman.”
“What are you saying? Batman is so much more technologically advanced!”
“Both Batman and Superman are powerful in their own ways. It will be a draw.”
“Ben Affleck is so HOT!”

What will we discuss?
• Basic Natural Language Processing (NLP) techniques
• Implementation of NLP in Python NLTK
• Stepwise workflow for processing arguments in a debate to:
  • Determine polarity of an argument
  • Determine quality of argument and score it
  • Determine the winner of debate
• A complete debating framework built from various Python modules

Why Natural Language Toolkit (NLTK)?
• Platform for implementing Natural Language Processing through Python programs
• Huge database of corpora and lexical resources with an easy interface
• Built-in libraries of several text processing algorithms
• Open Source!

Starting with the Basics
“I do not feel very good about Monday mornings.”
Tokenization: [‘I’, ‘do’, ‘not’, ‘feel’, ‘very’, ‘good’, ‘about’, ‘Monday’, ‘mornings’]
Parts of Speech Tagging:
‘I’ – Personal Pronoun
‘do’ – Verb
‘not’ – Adverb
‘feel’ – Verb
‘very’ – Adverb
‘good’ – Adjective
‘about’ – Preposition
‘Monday’ – Proper Noun
‘mornings’ – Plural Noun

Tokens: [‘I’, ‘do’, ‘not’, ‘feel’, ‘very’, ‘good’, ‘about’, ‘Monday’, ‘mornings’]
Removal of Stop Words: [‘I’, ‘feel’, ‘good’, ‘Monday’, ‘mornings’]
Stemmed Words: [‘I’, ‘feel’, ‘good’, ‘Monday’, ‘morn’]
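
The tokenize, filter and stem steps above can be sketched with the parts of NLTK that need no downloaded data. This is only an illustration under stated assumptions: the four-word STOP_WORDS set is a hypothetical stand-in chosen to reproduce the slide's example (NLTK's own English list, nltk.corpus.stopwords, is much longer and needs a corpus download), and the regular-expression tokenizer stands in for nltk.word_tokenize, which also needs downloaded models. Part-of-speech tagging with nltk.pos_tag is left out for the same reason.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer

# Hypothetical mini stop-word list, chosen only to mirror the slide's example.
STOP_WORDS = {"do", "not", "very", "about"}

sentence = "I do not feel very good about Monday mornings."

# 1. Tokenization: keep runs of word characters, dropping punctuation.
tokens = RegexpTokenizer(r"\w+").tokenize(sentence)

# 2. Stop-word removal (case-insensitive lookup in the assumed list).
content_tokens = [t for t in tokens if t.lower() not in STOP_WORDS]

# 3. Stemming: NLTK's Porter stemmer lowercases its input by default,
#    so the stems come back in lower case.
stemmer = PorterStemmer()
stems = [stemmer.stem(t) for t in content_tokens]

print(tokens)
print(content_tokens)
print(stems)
```

The last stem is ‘morn’, matching the slide; the other stems may differ in letter case from the slide because of the stemmer's lowercasing.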

What do we look for in an argument?
What is the stance taken by the debater in this argument?
Has the debater changed stance from the previous arguments?
Is the argument related to the debate or irrelevant?
Is the argument good enough?

Analysis of an Argument
• SEMANTIC SIMILARITY: Is the argument related to the debate?
• SENTIMENT ANALYSIS: What is the polarity of the argument?
• SCORING: Is the argument good enough?
• BACKTRACK: Has the debater changed stance?

Semantic Similarity
• Semantic Distance between words in context is the distance between their underlying senses or lexical concepts.
  • d(festival, celebration) < d(school, circus)
• Semantic Similarity is how close the lexical concepts of two units of language (words, sentences, paragraphs) are.
  • d(Mangoes and bananas are fruits, Mangoes are sweeter than bananas) < d(Raj has a job at the hospital, Hospitals have a huge staff of doctors and nurses)
• Lexical databases like WordNet group English words into sets of synonyms, each expressing a distinct concept, and are used for calculating semantic similarity

WordNet-based Similarity
A network of word senses such as WordNet forms the basis of several distance formulae for calculating semantic similarity
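
One of the simplest network-based measures is path similarity, 1 / (1 + length of the shortest path between two concepts), which is how NLTK's WordNet interface defines path_similarity. The sketch below applies it to a tiny hand-made taxonomy; the PARENT table is hypothetical and is not WordNet's real hierarchy, it only illustrates why d(festival, celebration) < d(school, circus).

```python
# Hypothetical toy taxonomy (child -> parent); NOT the real WordNet hierarchy.
PARENT = {
    "festival": "celebration",
    "celebration": "event",
    "event": "entity",
    "school": "institution",
    "institution": "organization",
    "circus": "company",
    "company": "organization",
    "organization": "entity",
}


def ancestors(word):
    """Return the chain from `word` up to the root, starting with `word`."""
    chain = [word]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def path_distance(a, b):
    """Number of edges on the shortest path between a and b in the tree."""
    chain_b = {node: steps for steps, node in enumerate(ancestors(b))}
    for steps_a, node in enumerate(ancestors(a)):
        if node in chain_b:  # first common ancestor reached from a
            return steps_a + chain_b[node]
    raise ValueError("no common ancestor")


def path_similarity(a, b):
    """Path-based similarity: 1 / (1 + shortest path length)."""
    return 1 / (1 + path_distance(a, b))


print(path_distance("festival", "celebration"), path_similarity("festival", "celebration"))
print(path_distance("school", "circus"), path_similarity("school", "circus"))
```

In this toy tree, festival and celebration are one edge apart (similarity 0.5), while school and circus are four edges apart (similarity 0.2).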

Similarity between Sentences
Sentence 1: A new NASA initiative will help lead the search for signs of life beyond our solar system
Sentence 2: The Nexus for Exoplanet System Science, or NExSS, will take a multidisciplinary approach to the hunt for alien life

Joint Word Set       Sentence 1   Sentence 2
new                  1            0
NASA                 1            0
Initiative           1            0
Help                 1            0
Lead                 1            0
Search               1            0
Signs                1            0
Life                 1            1
Beyond               1            0
Solar                1            0
System               1            1
Nexus                0            1
Exoplanet            0            1
Alien                0            1
Science              0            1
NExSS                0            1
Take                 0            1
multidisciplinary    0            1
Approach             0            1
Hunt                 0            1
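
The joint word set and the two binary vectors can be built in a few lines of plain Python. This is a minimal sketch: the STOP_WORDS set is an assumption containing just the function words that the slide leaves out of the joint word set, and words are compared in lower case.

```python
import re

# Assumed stop words: the function words missing from the slide's joint word set.
STOP_WORDS = {"a", "the", "will", "for", "of", "our", "or", "to"}

SENTENCE_1 = ("A new NASA initiative will help lead the search "
              "for signs of life beyond our solar system")
SENTENCE_2 = ("The Nexus for Exoplanet System Science, or NExSS, will take "
              "a multidisciplinary approach to the hunt for alien life")


def content_words(sentence):
    """Lower-case the sentence, split it into words and drop stop words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def joint_word_set(words_1, words_2):
    """Distinct words of both sentences, in order of first appearance."""
    return list(dict.fromkeys(words_1 + words_2))


def binary_vector(words, joint):
    """1 where the joint word occurs in the sentence, 0 otherwise."""
    present = set(words)
    return [1 if w in present else 0 for w in joint]


words_1 = content_words(SENTENCE_1)
words_2 = content_words(SENTENCE_2)
joint = joint_word_set(words_1, words_2)
vector_1 = binary_vector(words_1, joint)
vector_2 = binary_vector(words_2, joint)

print(len(joint))
print(vector_1)
print(vector_2)
```

The joint set has 20 words. The order of the words contributed by the second sentence differs slightly from the slide, but the vectors carry the same information: only ‘life’ and ‘system’ are shared.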

Similarity between Sentences
• The simplest similarity score is the cosine similarity between the two vectors: sim(s1, s2) = (s1 · s2) / (|s1| |s2|)
• More sophisticated formulae identify similar pairs of words and assign decimal values depending on the semantic distance. For example, in our word set,
  • sim(Search, Hunt) = 0.8
  • sim(Solar, Exoplanet) = 0.4
• Sometimes the order in which the words appear in the sentence also makes a difference, so Order Similarity is also considered.
  • India defeated Pakistan
  • Pakistan defeated India
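
A short sketch of both ideas follows. The cosine score is computed on the two binary vectors from the NASA example. The order_similarity function is an assumption: it uses one formulation found in the literature, 1 - |r1 - r2| / |r1 + r2|, where r1 and r2 hold the position of each joint word in the respective sentence, and it is simplified here by using 0 for absent words. It is not necessarily the formula the talk has in mind.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def order_vector(words, joint):
    """1-based position of each joint word in the sentence (0 if absent)."""
    return [words.index(w) + 1 if w in words else 0 for w in joint]


def order_similarity(words_1, words_2):
    """Assumed formulation: 1 - |r1 - r2| / |r1 + r2| over word positions."""
    joint = list(dict.fromkeys(words_1 + words_2))
    r1 = order_vector(words_1, joint)
    r2 = order_vector(words_2, joint)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1 - diff / total


# Binary vectors of the two NASA sentences over the 20-word joint set.
s1 = [1] * 11 + [0] * 9
s2 = [0] * 7 + [1, 0, 0] + [1] * 10
nasa_score = cosine_similarity(s1, s2)  # 2 shared words / (sqrt(11) * sqrt(11))
print(round(nasa_score, 3))

# Same bag of words, different order.
a = "india defeated pakistan".split()
b = "pakistan defeated india".split()
joint = list(dict.fromkeys(a + b))
bag_a = [1 if w in a else 0 for w in joint]
bag_b = [1 if w in b else 0 for w in joint]
bag_score = cosine_similarity(bag_a, bag_b)
order_score = order_similarity(a, b)
print(round(bag_score, 3), round(order_score, 3))
```

The bag-of-words cosine cannot tell the two India/Pakistan sentences apart (score 1.0), whereas the order score drops to about 0.59, which is why a word-order term is combined with the word-overlap score.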

Sentiment Analysis
• Sentiment Analysis (or opinion mining) is the process of detecting the contextual polarity of text
• NLP techniques, statistics and machine learning are used to identify the sentiment content of a text
• It finds application in movie reviews, blogs, customer feedback, Twitter and other microblogging sources
• A popular classifier for Sentiment Analysis is the Naïve Bayes Classifier, available as a module in NLTK and in TextBlob, a Python library for processing textual data
Sentiment Analysis using Naïve Bayes
[Flowchart with the nodes: Training Corpus, Polarity Lexicon, Naïve Bayes Classifier, Test Data, Neutral? (Yes), Positive/Negative]
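
To make the classifier concrete, here is a minimal pure-Python sketch of a multinomial Naïve Bayes classifier with add-one smoothing. Everything in it is illustrative: the four training sentences and the POLARITY_LEXICON are made up, and treating a text with no lexicon words as neutral is only one possible reading of the "Neutral?" step. In practice NLTK's nltk.NaiveBayesClassifier would replace the hand-written training and scoring code.

```python
import math
from collections import Counter, defaultdict

# Made-up training corpus: (tokens, label) pairs.
TRAINING_CORPUS = [
    ("good great movie".split(), "positive"),
    ("love this cool movie".split(), "positive"),
    ("bad boring movie".split(), "negative"),
    ("hate this awful movie".split(), "negative"),
]

# Hypothetical polarity lexicon used only for the neutrality check.
POLARITY_LEXICON = {"good", "great", "love", "cool", "bad", "boring", "hate", "awful"}


def train(corpus):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in corpus:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def classify(words, model):
    """Return 'neutral' if no lexicon word occurs, else the most probable label."""
    if not any(w in POLARITY_LEXICON for w in words):
        return "neutral"
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # log P(label) + sum of log P(word | label), with add-one smoothing
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            if w in vocabulary:  # words never seen in training are ignored
                score += math.log(
                    (word_counts[label][w] + 1) / (total_words + len(vocabulary))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_CORPUS)
print(classify("superman is cool and great".split(), model))
print(classify("that fight was boring and awful".split(), model))
print(classify("this movie".split(), model))
```

With this toy data the three calls give positive, negative and neutral respectively; a real system would need a far larger training corpus and lexicon.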

Thank You!
Please keep watching this space for Part II
