Aspect Extraction Using
Conditional Random Fields
Yuliya Rubtsova
Sergey Koshelnikov
SentiRuEval-2015
Aspect-based sentiment
analysis task
A. Extract explicit aspects
from the offered review
B. Extract all the aspects
from the offered review
C. Perform sentiment
analysis of the aspects
D. Categorize the aspect
terms by predefined
categories
E. Classify the sentiment
of the whole review by
aspect category
Major approaches to aspect extraction
Frequency of nouns and/or noun phrases (Hu and Liu, 2004)
Simultaneous extraction of both sentiment words (user
opinions) and aspects
Supervised machine learning (HMM: Jin et al., 2009; CRF:
Jakob and Gurevych, 2010)
Unsupervised machine learning or topic modeling (Titov and
McDonald, 2008; Brody and Elhadad, 2010)
Conditional Random Fields (CRF)
CRFs are a type of discriminative undirected probabilistic
graphical model. They are used to encode known
relationships between observations and to construct
consistent interpretations.
Conditional Random Fields (CRF)
Let $G = (V, E)$ be a graph such that $Y = (Y_v)_{v \in V}$, so that Y is indexed by the
vertices of G. Then (X, Y) is a conditional random field when the
random variables $Y_v$, conditioned on X, obey the Markov
property with respect to the graph:
$$P(Y_v \mid Y_{V \setminus \{v\}}, X) = P(Y_v \mid Y_{N(v)}, X),$$
where $N(v)$ is the set of neighbours of v in G.
Conditional Random Fields (CRF)
$$P(Y \mid X) = \frac{1}{Z(X)} \exp\Big( \sum_{c \in C} \lambda_c f_c(y_c, X) \Big),$$
where $Z(X)$ is the normalization factor,
$C$ is the set of all cliques of the graph,
$f_c$ are the feature functions,
$\lambda_c$ are the feature weights.
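As a small worked illustration of the formula above, the sketch below computes P(Y | X) for a tiny linear-chain CRF by enumerating every label sequence. The label subset, the feature keys and all weights in `WEIGHTS` are invented for illustration; they are not the features or weights of the presented system, and brute-force enumeration is only feasible for very short inputs.

```python
import math
from itertools import product

LABELS = ["O", "s-e", "c-e"]

# Hypothetical weights (lambda_c); each key identifies one indicator feature f_c.
WEIGHTS = {
    ("emit", "n", "s-e"): 1.5,     # a noun starts an explicit aspect term
    ("emit", "a", "O"): 1.0,       # an adjective is outside any term
    ("trans", "s-e", "c-e"): 0.8,  # a start label is followed by a continuation
    ("trans", "O", "c-e"): -2.0,   # a continuation directly after "O" is penalized
}


def score(pos_tags, labels):
    """Sum of lambda_c * f_c(y_c, X) over node cliques and edge cliques."""
    total = 0.0
    for i, (pos, label) in enumerate(zip(pos_tags, labels)):
        total += WEIGHTS.get(("emit", pos, label), 0.0)
        if i > 0:
            total += WEIGHTS.get(("trans", labels[i - 1], label), 0.0)
    return total


def probability(pos_tags, labels):
    """P(Y | X) = exp(score) / Z(X), with Z(X) summed over all label sequences."""
    z = sum(
        math.exp(score(pos_tags, candidate))
        for candidate in product(LABELS, repeat=len(pos_tags))
    )
    return math.exp(score(pos_tags, labels)) / z


def best_labels(pos_tags):
    """Most probable label sequence, found by exhaustive search."""
    return max(
        product(LABELS, repeat=len(pos_tags)),
        key=lambda candidate: score(pos_tags, candidate),
    )
```

Because Z(X) sums over all label sequences for the given input, the probabilities of all candidate sequences add up to one. A practical implementation would replace the enumeration with the forward algorithm and Viterbi decoding.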
CRF advantages
Relaxation of the independence
assumptions
CRFs avoid the label bias
problem
System description
Pre-processing
“s-e” – start of an explicit aspect term,
“c-e” – continuation of an explicit aspect term,
“s-i” – start of an implicit aspect term,
“c-i” – continuation of an implicit aspect term,
“s-f” – start of a sentiment fact term,
“c-f” – continuation of a sentiment fact term,
“O” – not an aspect term.
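The labelling scheme can be illustrated with a short sketch that converts annotated term spans into per-token labels and back. The span representation `(start, end, kind)` with an exclusive end index, and the function names, are assumptions made for this example; the slides do not describe the annotation format.

```python
def encode_labels(n_tokens, spans):
    """Turn (start, end, kind) spans into s-*/c-*/O labels.

    kind is "e" (explicit), "i" (implicit) or "f"; end is exclusive.
    """
    labels = ["O"] * n_tokens
    for start, end, kind in spans:
        labels[start] = f"s-{kind}"
        for i in range(start + 1, end):
            labels[i] = f"c-{kind}"
    return labels


def decode_labels(labels):
    """Recover (start, end, kind) spans from a label sequence."""
    spans = []
    current = None  # (start index, kind) of the term being read, if any
    # A trailing "O" sentinel closes a term that runs to the last token.
    for i, label in enumerate(list(labels) + ["O"]):
        prefix, _, kind = label.partition("-")
        continues = current is not None and prefix == "c" and kind == current[1]
        if current is not None and not continues:
            spans.append((current[0], i, current[1]))
            current = None
        if prefix == "s":
            current = (i, kind)
    return spans
```

For example, `encode_labels(5, [(1, 3, "e"), (4, 5, "i")])` gives `["O", "s-e", "c-e", "O", "s-i"]`. A continuation label that does not follow a matching start label is ignored by `decode_labels`.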
System description
Pre-processing
To extract syntactic features (e.g. POS,
lemma) we used TreeTagger for Russian
(Sharoff, 2008).
We also converted all capital letters
to lowercase.
System description
Features
Word
POS
Lemma
System description
example
Очень дружелюбное место, с порога встречают
симпатичные работники, тёплый, уютный интерьер и
зажигательная музыка
A very friendly place: pleasant staff greet you at the door, and there is a warm,
cozy interior and lively music
System description
example
w[0]=очень w[-1]=null w[1]=дружелюбное pos[0]=r O
w[0]=дружелюбное w[-1]=очень w[1]=место pos[0]=a O
w[0]=место w[-1]=дружелюбное w[1]=null pos[0]=n s-e
w[0]=с w[-1]=null w[1]=порога pos[0]=s O
w[0]=порога w[-1]=с w[1]=встречают pos[0]=n O
w[0]=встречают w[-1]=порога w[1]=симпатичные pos[0]=v s-e
w[0]=симпатичные w[-1]=встречают w[1]=работники pos[0]=a O
w[0]=работники w[-1]=симпатичные w[1]=тёплый pos[0]=n O
w[0]=тёплый w[-1]=работники w[1]=уютный pos[0]=a O
w[0]=уютный w[-1]=тёплый w[1]=интерьер pos[0]=a O
w[0]=интерьер w[-1]=уютный w[1]=и pos[0]=n s-e
w[0]=и w[-1]=интерьер w[1]=зажигательная pos[0]=c O
w[0]=зажигательная w[-1]=и w[1]=музыка pos[0]=a O
w[0]=музыка w[-1]=зажигательная w[1]=null pos[0]=n s-e
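The feature lines above can be produced with a few lines of code. The sketch below assumes each segment is already tokenized and tagged as `(word, pos, lemma)` triples (for instance from TreeTagger output); the function names and the `lemma[0]` key are assumptions, since the example lines on the slide show only the word and POS features.

```python
def token_features(tokens, i):
    """Features for token i of one segment; tokens is a list of (word, pos, lemma)."""
    word, pos, lemma = tokens[i]
    # Neighbouring words are "null" at the segment boundaries.
    prev_word = tokens[i - 1][0].lower() if i > 0 else "null"
    next_word = tokens[i + 1][0].lower() if i + 1 < len(tokens) else "null"
    return {
        "w[0]": word.lower(),
        "w[-1]": prev_word,
        "w[1]": next_word,
        "pos[0]": pos,
        "lemma[0]": lemma.lower(),  # assumed key name for the lemma feature
    }


def format_line(features, label):
    """Render one training line: space-separated key=value pairs, then the label."""
    pairs = " ".join(f"{key}={value}" for key, value in features.items())
    return f"{pairs} {label}"
```

Calling `format_line(token_features(tokens, i), label)` for every token of a segment yields one line per token in the same layout as the example, with the extra lemma column.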
System description
System 1: a single CRF trained with the labels listed above. We used
the s-e, c-e and O labels to extract explicit aspects for
Task A, and s-e, c-e, s-i, c-i, s-f, c-f and O to extract all aspects
for Task B.
System 2: a combination of the results of two CRFs: one CRF for
extracting explicit aspect terms and one CRF for extracting
implicit aspect terms and sentiment fact terms (the non-explicit ones).
Task A was performed using System 1; Task B was performed using both
systems.
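One possible way to combine the outputs of the two CRFs in System 2 is sketched below. The slides do not say how conflicting predictions were resolved, so the rule used here (the explicit-term CRF wins wherever it predicts a term) and the function name `merge_predictions` are assumptions for illustration only.

```python
def merge_predictions(explicit_labels, other_labels):
    """Merge per-token labels from two taggers over the same token sequence.

    explicit_labels: output of the CRF for explicit terms (s-e, c-e, O).
    other_labels: output of the CRF for implicit and sentiment fact terms.
    Assumed rule: keep the explicit label unless it is "O".
    """
    if len(explicit_labels) != len(other_labels):
        raise ValueError("both label sequences must cover the same tokens")
    return [
        explicit if explicit != "O" else other
        for explicit, other in zip(explicit_labels, other_labels)
    ]
```

With this rule, overlapping predictions can leave a continuation label without its start label, so a real system would probably need a repair step after merging.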
Results
F-measure
Terms were scored by exact matching and by partial matching.
Macro F1-measure here means calculating the
F1-measure for every review and averaging the
obtained values.
Micro F (partial matching): the intersection
between the gold standard term and the extracted term was
calculated.
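The two matching modes and the per-review averaging can be sketched as follows. This is an approximation for illustration, not the official SentiRuEval scorer: terms are assumed to be `(start, end)` token spans with an exclusive end, and partial matching is approximated by token overlap between the gold and predicted spans.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def exact_scores(gold, predicted):
    """Exact matching: a predicted span counts only if it equals a gold span."""
    gold, predicted = set(gold), set(predicted)
    hits = len(gold & predicted)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall, f1(precision, recall)


def partial_scores(gold, predicted):
    """Partial matching, approximated by the token overlap of the two span sets."""
    gold_tokens = {i for start, end in gold for i in range(start, end)}
    pred_tokens = {i for start, end in predicted for i in range(start, end)}
    hits = len(gold_tokens & pred_tokens)
    precision = hits / len(pred_tokens) if pred_tokens else 0.0
    recall = hits / len(gold_tokens) if gold_tokens else 0.0
    return precision, recall, f1(precision, recall)


def macro_f1(reviews, scorer):
    """Average the per-review F1; reviews is a list of (gold, predicted) pairs."""
    values = [scorer(gold, predicted)[2] for gold, predicted in reviews]
    return sum(values) / len(values) if values else 0.0
```

With gold spans `[(2, 3), (5, 7)]` and predicted spans `[(2, 3), (5, 6)]`, exact matching credits only the first term, while partial matching also credits the overlapping token of the second.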
Results
Task A restaurant domain in comparison to baseline
[Figure: bar charts of Precision, Recall and F-measure for exact matching and partial matching; series: baseline, Word+POS+lemma]
Results
Task A restaurant domain in comparison to the best results
[Figure: bar charts of Precision, Recall and F-measure for exact matching and partial matching; series: baseline, №1, №2, Word+POS+lemma]
Results
Task A car domain in comparison to baseline
[Figure: bar charts of Precision, Recall and F-measure for exact matching and partial matching; series: baseline, Word+POS+lemma]
Results
Task A car domain in comparison to the best results
[Figure: bar charts of Precision, Recall and F-measure for exact matching and partial matching; series: baseline, №1, №2, Word+POS+lemma]
Results
Task B restaurant domain in comparison to baseline
[Figure: bar charts of Precision, Recall and F-measure for exact matching and partial matching; series: baseline, №1, №2, System 1 (Word+POS+lemma), System 2 (Word+POS+lemma)]
Results
Task B restaurant domain in comparison to the best results
[Figure: bar charts of Precision, Recall and F-measure for exact matching and partial matching; series: baseline, №1, №2, System 1 (Word+POS+lemma), System 2 (Word+POS+lemma)]
Results
Task B car domain in comparison to baseline
[Figure: bar charts of Precision, Recall and F-measure for exact matching and partial matching; series: baseline, №1, №2, System 1 (Word+POS+lemma), System 2 (Word+POS+lemma)]
Results
Task B car domain in comparison to the best results
[Figure: bar charts of Precision, Recall and F-measure for exact matching and partial matching; series: baseline, №1, №2, System 1 (Word+POS+lemma), System 2 (Word+POS+lemma)]
Error Analysis
Not recognized
excessively recognized
Error distribution
                         Restaurants   Car
Not recognized           67.1%         63%
Excessively recognized   32.9%         37%
Error types
1. Technical errors
1.1 Special symbols:
Gold standard: Салат "цезарь" (Caesar salad)
System: Салат "цезарь
1.2 Lowercasing:
After lowercasing, the system cannot distinguish, e.g., “ТО” (technical maintenance in
the car domain) from “то” (the particle)
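The lowercasing error can be reproduced in a couple of lines. The sketch below shows that both tokens collapse to the same word feature once lowercased; the `is_upper` flag is a hypothetical extra feature, not something the presented system used, added only to show how the distinction could in principle be kept.

```python
def surface_features(word):
    """Word feature after lowercasing, plus a hypothetical all-caps flag."""
    return {
        "w[0]": word.lower(),
        # Assumed extra feature: True for all-uppercase tokens longer than one letter.
        "is_upper": word.isupper() and len(word) > 1,
    }


abbreviation = surface_features("ТО")  # technical maintenance (car domain)
particle = surface_features("то")      # the particle

# The lowercased word feature alone no longer separates the two tokens.
same_word_feature = abbreviation["w[0]"] == particle["w[0]"]
```

Here `same_word_feature` is true, while the `is_upper` values differ between the two tokens.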
Error types
2. Not recognized
2.1 Abbreviations
Рублей –> руб. –> р. (rubles -> rub. -> r.)
2.2 Lists
Овощи, салаты «Цезарь», лосось
(Vegetables, "Caesar" salads, salmon)
Error types
3. Partial recognition
3.1. Before the head word
“Добавляла вина” (poured more wine)
“Официант хамил” (The waiter was rude)
3.2. After the head word
“местечко в углу” (a place in the corner)
4. Excessively recognized
4.1 Named entities are not always handled well
Александр (Alexander)
Conclusion
• The performance of our systems was
comparable to the best results among SentiRuEval
participants.
• Implementing these systems demonstrated
that using lemmas as a CRF feature for the Russian
language improves the overall
F-measure.
• In future work we plan to add features based on
statistical methods to the CRF.
Thank you!
Yuliya Rubtsova
yu.rubtsova@gmail.com
study.mokoron.com
