Predicting Star Ratings based on Annotated Reviews of Mobile Apps
Talk at the 6th International Workshop on Advances in Semantic Information Retrieval (ASIR 2016)
Prof. Dr. Dagmar Monett, Hermann Stolte
Reviews and star ratings
Gdańsk, Poland, September 11 – 14, 2016
Example of reviews and star ratings of the
Evernote App, Google Play Store (07/2016)
Star ratings matter
15% would consider downloading an app with a 2-star rating
50% would consider downloading an app with a 3-star rating
96% would consider downloading an app with a 4-star rating
Source: Aptentive 2015 Consumer Study, The Mobile Marketer's Guide to App Store Ratings & Reviews
Our motivation
Some questions…
■ Could we (a program) teach users to rate apps consistently with the review they are writing for a mobile app?
■ I.e., could we (a program) suggest to users the most adequate star rating for a product, based on the semantic orientation of what they have already written in the review?
■ Would this improve users' engagement and satisfaction with the app?
Background
Review rating prediction
■ Also called sentiment rating prediction:
■ …a task that deals with inferring an author's implied numerical rating, i.e., with predicting a rating score from a given written review
■ E.g., recommender systems often suggest products based on the star ratings that other users previously gave to similar products
Suggested readings
Other related work
■ Analysing textual reviews and inferring sentiment polarity (positive/negative/neutral) (Pang et al., 2002; Liu, 2010)
■ Using not only textual semantics but also other information, e.g., about the author and/or the product (Tang et al., 2015; Li et al., 2011)
■ Considering phrase-level sentiment polarity (Qu et al., 2010)
■ Considering aspect-based opinion mining (Zhang et al., 2006; Ganu et al., 2013; Klinger & Cimiano, 2013; Sänger, 2015)
Our approach
■ We do not deal with aspect identification or sentiment classification
■ We assume these tasks have already been performed before the star ratings are predicted
■ We focus on predicting star ratings based solely on available annotated, fine-grained opinions
■ I.e., a complement to works like (Sänger, 2015), which extends (Klinger & Cimiano, 2013) and uses a German annotated corpus of mobile app reviews
The Data
SCARE Corpus
Mario Sänger, Ulf Leser, Steffen Kemmerer, Peter Adolphs, and Roman Klinger.
SCARE - The Sentiment Corpus of App Reviews with Fine-grained Annotations in
German. In Proceedings of the Tenth International Conference on Language
Resources and Evaluation (LREC'16), Portorož, Slovenia, May 2016. European
Language Resources Association (ELRA).
■ Fine-grained annotations for mobile application
reviews from the Google Play Store
■ 1,760 German application reviews with 2,487
aspects and 3,959 subjective phrases
■ SCARE corpus v.1.0.0 (annotations only)
■ Available at http://www.romanklinger.de/scare/
Analysing the Data
Polarity and star ratings
[Chart: 69.1% / 23.1%; thumbs-up-thumbs-down scheme (Liu, 2012)]
Avg. of labelled star ratings vs. avg. of subjective phrase polarity
Number of star ratings vs. number of subjective phrases
Predicting Star Ratings
Prediction process
We “played” with different models
Computational models
For example,
x0 = 1 (bias term)
x1: no. of subjective phrases with positive polarity
x2: no. of subjective phrases with negative polarity
x3: no. of subjective phrases with neutral polarity
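The regression model itself appears only as an image in the original slides. As a minimal sketch, assuming a linear multivariate regression over the four features above (the tiny dataset and the fitted coefficients below are purely illustrative, not the paper's):

```python
import numpy as np

def features(n_pos, n_neg, n_neu):
    """Feature vector [x0, x1, x2, x3] for one review: a bias term plus
    counts of positive/negative/neutral subjective phrases."""
    return np.array([1.0, n_pos, n_neg, n_neu])

def fit(X, y):
    """Least-squares fit of theta in y ~ X @ theta."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def predict(theta, x):
    """Predicted star rating, clipped to the 1-5 scale."""
    return float(np.clip(x @ theta, 1.0, 5.0))

# Illustrative dataset: (n_pos, n_neg, n_neu) -> labelled star rating
data = [((3, 0, 1), 5.0), ((1, 2, 0), 2.0), ((2, 1, 1), 4.0), ((0, 3, 0), 1.0)]
X = np.array([features(*counts) for counts, _ in data])
y = np.array([stars for _, stars in data])
theta = fit(X, y)
```

After fitting, reviews with more positive phrases are mapped to higher predicted star ratings.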
RSS: review rating score (Ganu et al., 2009, 2013)
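The RSS formula is likewise an image in the deck. A sketch under the common reading of Ganu et al.'s review rating score, i.e., the fraction of positive subjective phrases rescaled to the 1–5 star range (the paper's exact formulation may differ):

```python
def review_rating_score(n_pos, n_neg):
    """Review rating score in [1, 5]: the fraction of positive subjective
    phrases in a review, rescaled to the star range. Returns None for
    reviews with no sentiment orientation at all."""
    total = n_pos + n_neg
    if total == 0:
        return None  # no sentiment: filtered out in the experiments
    return 1.0 + 4.0 * n_pos / total
```

Under this reading, an all-positive review maps to 5 stars, an all-negative one to 1 star, and a balanced one to 3 stars.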
Experiments
(1) Assessing the importance of sentiment in the reviews:
■ Include neutral phrases (yes/no)?
■ Include reviews with no sentiment (yes/no)?
(2) Using other predictors
■ Each individual experiment is run 10,000 times
■ Monte Carlo cross-validation: 70% training data and 30% test data, split randomly on each iteration
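The evaluation scheme above can be sketched as follows (the function names and the toy training/evaluation callables are illustrative):

```python
import random

def monte_carlo_cv(dataset, train_fn, eval_fn, iterations=10_000, train_frac=0.7):
    """Monte Carlo cross-validation: on each iteration, draw a fresh random
    train/test split (70%/30% by default), train a model, and record its
    test error; the mean error over all iterations is returned."""
    errors = []
    for _ in range(iterations):
        data = dataset[:]
        random.shuffle(data)  # fresh random split on every iteration
        cut = int(len(data) * train_frac)
        model = train_fn(data[:cut])
        errors.append(eval_fn(model, data[cut:]))
    return sum(errors) / len(errors)
```

Unlike k-fold cross-validation, the splits are drawn independently, so the number of iterations (here 10,000) is not tied to the dataset size.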
Some results
“Best” model, exp. (1)
■ It considers only the average value of the polarities of a review, as a single feature
■ Plus:
■ filtering out both subjective phrases with neutral polarity and reviews with no sentiment orientation at all
■ No normalisation
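A sketch of that single-feature model, assuming phrase polarities are encoded as +1/-1 (neutral phrases and sentiment-free reviews are filtered out beforehand; the model coefficients here are illustrative, not the fitted ones):

```python
def avg_polarity(polarities):
    """Average of the phrase polarities (+1 positive, -1 negative) of one
    review; neutral phrases and sentiment-free reviews are filtered upstream."""
    if not polarities:
        raise ValueError("reviews with no sentiment are filtered out")
    return sum(polarities) / len(polarities)

def predict_stars(avg, intercept=3.0, slope=2.0):
    """One-feature linear model: star rating from average polarity,
    clipped to the 1-5 scale. Coefficients are illustrative only."""
    return min(5.0, max(1.0, intercept + slope * avg))
```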
Results
Conclusion
■ Textually-derived rating prediction can perform well even when only phrase-level sentiment polarity is available
■ Phrases with neutral sentiment can be filtered out of the corpus
■ Computing the overall sentiment of a review using the review rating score (Ganu et al., 2009, 2013) provides the best star rating predictions
Further work
■ To consider the aspects' relevance
■ aspect-oriented subjective phrases
■ To analyse the strength of the opinions (Wilson et al., 2004)
■ not only positive/negative/neutral sentiment
■ To explore models other than linear, multivariate regression ones
Sources
Related work:
- See the references list in our paper!
■ https://www.researchgate.net/publication/304244445_Predicting_Star_Ratings_based_on_Annotated_Reviews_of_Mobile_Apps

Contact:
dagmar@monettdiaz.com
monettdiaz