Detecting negative words

Adel Rahimi
Adel.Rahimi@mehr.sharif.edu
Reza Takhshid
Reza.takhshid95@student.sharfi.edu
Sharif University of Technology
Spring 2017

Sentiment datasets for other languages
• AFINN by Finn Årup Nielsen
AFINN is a list of English words rated for valence with an integer
between minus five (negative) and plus five (positive). The words were
manually labeled by Finn Årup Nielsen in 2009-2011. The file
is tab-separated.
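
Since the file is tab-separated, a valence list of this kind can be read with a few lines of Python. The sketch below is only an illustration: it assumes one `word<TAB>score` pair per line, the function name `load_valence` is hypothetical, and the sample rows are made-up placeholders, not entries copied from AFINN.

```python
import csv
import io

# Made-up placeholder rows in "word<TAB>score" form (not real AFINN entries).
SAMPLE = "good\t3\nbad\t-3\nokay\t1\n"


def load_valence(lines):
    """Read tab-separated word/score pairs into a dict of word -> int."""
    scores = {}
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed lines
        word, score = row
        scores[word] = int(score)
    return scores


scores = load_valence(io.StringIO(SAMPLE))

# Words with a score below zero form a simple list of negative words.
negative_words = {word for word, score in scores.items() if score < 0}
print(sorted(negative_words))
```

For a real file, `load_valence(open(path, encoding="utf-8"))` would work the same way, with `path` pointing at the downloaded list.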

Sentiment datasets for other languages
• Opinion Lexicon by Hu and Liu
A list of English positive and negative opinion words or sentiment words
(around 6800 words)

Sentiment datasets for other languages
• NRC Word-Emotion Association Lexicon by Saif Mohammad
and Peter Turney
The NRC Emotion Lexicon is a list of English words and their associations with
eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and
disgust) and two sentiments (negative and positive). The annotations were
made manually through crowdsourcing.

Datasets
• Refined Persian Polarity corpus
• Dehdarbehbahani, I., Shakery, A., & Faili, H. (2014). Semi-supervised word
polarity identification in resource-lean languages. Neural Networks, 58, 50-59.
• Corpus of exceptions
• [working dataset]

Datasets
• The corpus of exceptions was extracted from the Flexicon
database
• ناشتا (fasting, on an empty stomach)
• نارنگی (tangerine)
• نارج
• بیدمشک (musk willow)
• لاروبی (dredging)

Datasets
• The exceptions list also needs to be refined
• غیر اجتماعی (antisocial)
• غیر اصولی (unprincipled)
• ناشنوایی (deafness)
• پادگان (garrison)

Datasets
• List of validly affixed words

How does the algorithm work?
[Flowchart: INPUT; checks: Negatives list, Exception list, Affix searching; outcomes: Negative, Not Negative]
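
The flowchart labels on this slide (negatives list, exception list, affix searching) suggest a three-stage lookup. The sketch below is one possible reading, not the authors' implementation: the order of the checks, the function name `is_negative`, the prefix set, and the tiny word lists are all assumptions made for illustration. The exception entries are taken from the example slides; the single entry in the negatives list (بد, "bad") is a placeholder.

```python
# Hypothetical prefix set, chosen to match the example words on the slides.
NEGATIVE_PREFIXES = ("نا", "بی", "غیر", "ضد")


def is_negative(word, negatives, exceptions, prefixes=NEGATIVE_PREFIXES):
    """Classify a word as negative (True) or not negative (False).

    Assumed order of checks:
    1. A word found in the negatives list is negative.
    2. A word found in the exception list is not negative,
       even if it begins with a negative-looking prefix.
    3. Otherwise the word is negative only if it starts with
       one of the negative prefixes.
    """
    word = word.strip()
    if word in negatives:
        return True
    if word in exceptions:
        return False
    return word.startswith(tuple(prefixes))


# Placeholder negatives list and a few exceptions from the slides.
negatives = {"بد"}
exceptions = {"ناشتا", "نارنگی", "بیدمشک", "ضدآب"}

for w in ["بد", "ناشتا", "ناشنوایی", "کتاب"]:
    label = "Negative" if is_negative(w, negatives, exceptions) else "Not Negative"
    print(w, label)
```

Keeping the exception lookup ahead of the affix search is what stops words such as ناشتا, which merely begin with the letters of a negative prefix, from being flagged; the quality of that list therefore bounds the accuracy of this kind of rule.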

Further development
• Creating a database of affixed but positive words
• بی پروا (fearless)
• ضدآب (waterproof)
• Using Elasticsearch as the database to make the process
faster
• Using a corpus instead of Flexicon
• Using statistical approaches to increase accuracy

Further Research
• The datasets are still not reliable enough. They need further
work so that the accuracy of the algorithm will be
higher.
