Predicting Star Ratings based on Annotated Reviews of Mobile Apps
Talk at the 6th International Workshop on Advances in Semantic Information Retrieval (ASIR 2016)
Prof. Dr. Dagmar Monett, Hermann Stolte
Reviews and star ratings
Gdańsk, Poland, September 11 – 14, 2016
Example of reviews and star ratings of the
Evernote App, Google Play Store (07/2016)
Star ratings matter
15% would consider downloading an app with a 2-star rating
50% would consider downloading an app with a 3-star rating
96% would consider downloading an app with a 4-star rating
Source: Aptentive 2015 Consumer Study, The Mobile Marketer's Guide to App Store Ratings & Reviews
Our motivation
Some questions…
■ Could we (a program) teach users to rate apps consistently with the review they are writing for a mobile app?
■ I.e., could we (a program) suggest to users the most adequate star rating for a product, based on the semantic orientation of what they have already written in the review?
■ Would this improve users' engagement and satisfaction with the app?
Background
Review rating prediction
■ Also called sentiment rating prediction:
■ …a task that deals with inferring an author's implied numerical rating, i.e., with predicting a rating score from a given written review
■ E.g., recommender systems often suggest products based on the star ratings that other users previously gave to similar products
Suggested readings
Other related work
■ Analysing textual reviews and inferring sentiment polarity (positive/negative/neutral) (Pang et al., 2002; Liu, 2010)
■ Using not only textual semantics but also other information, e.g., about the author and/or the product (Tang et al., 2015; Li et al., 2011)
■ Considering phrase-level sentiment polarity (Qu et al., 2010)
■ Considering aspect-based opinion mining (Zhang et al., 2006; Ganu et al., 2013; Klinger & Cimiano, 2013; Sänger, 2015)
Our approach
■ We do not deal with aspect identification or sentiment classification
■ We assume these tasks have already been performed before the star ratings are predicted
■ We focus on predicting star ratings based solely on available annotated, fine-grained opinions
■ I.e., a complement to works like (Sänger, 2015), which extends (Klinger & Cimiano, 2013) and uses a German annotated corpus of mobile app reviews
The Data
SCARE Corpus
Mario Sänger, Ulf Leser, Steffen Kemmerer, Peter Adolphs, and Roman Klinger.
SCARE - The Sentiment Corpus of App Reviews with Fine-grained Annotations in
German. In Proceedings of the Tenth International Conference on Language
Resources and Evaluation (LREC'16), Portorož, Slovenia, May 2016. European
Language Resources Association (ELRA).
■ Fine-grained annotations for mobile application
reviews from the Google Play Store
■ 1,760 German application reviews with 2,487
aspects and 3,959 subjective phrases
■ SCARE corpus v.1.0.0 (annotations only)
■ Available at http://www.romanklinger.de/scare/
Analysing the Data
Polarity and star ratings
[Chart: 69.1% / 23.1%; thumbs-up-thumbs-down scheme (Liu, 2012)]
Avg. of labelled star ratings vs. avg. of subjective phrase polarity
Number of star ratings vs. number of subjective phrases
Predicting Star Ratings
Prediction process
We “played” with different models
Computational models
For example,
x0 = 1 (bias term)
x1: no. of subjective phrases with positive polarity
x2: no. of subjective phrases with negative polarity
x3: no. of subjective phrases with neutral polarity
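The regression model itself appears only as an image in the original slides. As a minimal sketch, assuming a linear multivariate regression over the four features above (the tiny dataset and the fitted coefficients below are purely illustrative, not the paper's):

```python
import numpy as np

def features(n_pos, n_neg, n_neu):
    """Feature vector [x0, x1, x2, x3] for one review: a bias term plus
    counts of positive/negative/neutral subjective phrases."""
    return np.array([1.0, n_pos, n_neg, n_neu])

def fit(X, y):
    """Least-squares fit of theta in y ~ X @ theta."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def predict(theta, x):
    """Predicted star rating, clipped to the 1-5 scale."""
    return float(np.clip(x @ theta, 1.0, 5.0))

# Illustrative dataset: (n_pos, n_neg, n_neu) -> labelled star rating
data = [((3, 0, 1), 5.0), ((1, 2, 0), 2.0), ((2, 1, 1), 4.0), ((0, 3, 0), 1.0)]
X = np.array([features(*counts) for counts, _ in data])
y = np.array([stars for _, stars in data])
theta = fit(X, y)
```

After fitting, reviews with more positive phrases are mapped to higher predicted star ratings.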
RSS: review rating score (Ganu et al., 2009, 2013)
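The RSS formula is likewise an image in the deck. A sketch under the common reading of Ganu et al.'s review rating score, i.e., the fraction of positive subjective phrases rescaled to the 1–5 star range (the paper's exact formulation may differ):

```python
def review_rating_score(n_pos, n_neg):
    """Review rating score in [1, 5]: the fraction of positive subjective
    phrases in a review, rescaled to the star range. Returns None for
    reviews with no sentiment orientation at all."""
    total = n_pos + n_neg
    if total == 0:
        return None  # no sentiment: filtered out in the experiments
    return 1.0 + 4.0 * n_pos / total
```

Under this reading, an all-positive review maps to 5 stars, an all-negative one to 1 star, and a balanced one to 3 stars.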
Experiments
(1) Assessing the importance of sentiment in the reviews:
■ Include neutral phrases (yes/no)?
■ Include reviews with no sentiment (yes/no)?
(2) Using other predictors
■ Each individual experiment is run 10,000 times
■ Monte Carlo cross-validation: 70% training data and 30% test data, split randomly on each iteration
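The evaluation scheme above can be sketched as follows (the function names and the toy training/evaluation callables are illustrative):

```python
import random

def monte_carlo_cv(dataset, train_fn, eval_fn, iterations=10_000, train_frac=0.7):
    """Monte Carlo cross-validation: on each iteration, draw a fresh random
    train/test split (70%/30% by default), train a model, and record its
    test error; the mean error over all iterations is returned."""
    errors = []
    for _ in range(iterations):
        data = dataset[:]
        random.shuffle(data)  # fresh random split on every iteration
        cut = int(len(data) * train_frac)
        model = train_fn(data[:cut])
        errors.append(eval_fn(model, data[cut:]))
    return sum(errors) / len(errors)
```

Unlike k-fold cross-validation, the splits are drawn independently, so the number of iterations (here 10,000) is not tied to the dataset size.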
Some results
“Best” model, exp. (1)
■ It considers only the average value of the polarities of a review, as a single feature
■ Plus:
■ filtering out both subjective phrases with neutral polarity and reviews with no sentiment orientation at all
■ No normalisation
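A sketch of that single-feature model, assuming phrase polarities are encoded as +1/-1 (neutral phrases and sentiment-free reviews are filtered out beforehand; the model coefficients here are illustrative, not the fitted ones):

```python
def avg_polarity(polarities):
    """Average of the phrase polarities (+1 positive, -1 negative) of one
    review; neutral phrases and sentiment-free reviews are filtered upstream."""
    if not polarities:
        raise ValueError("reviews with no sentiment are filtered out")
    return sum(polarities) / len(polarities)

def predict_stars(avg, intercept=3.0, slope=2.0):
    """One-feature linear model: star rating from average polarity,
    clipped to the 1-5 scale. Coefficients are illustrative only."""
    return min(5.0, max(1.0, intercept + slope * avg))
```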
Results
Conclusion
■ Textually-derived rating prediction can perform well even when only phrase-level sentiment polarity is available
■ Phrases with neutral sentiment can be filtered out of the corpus
■ Computing the overall sentiment of a review using the review rating score (Ganu et al., 2009, 2013) provides the best star rating predictions
Further work
■ To consider the aspects' relevance
■ aspect-oriented subjective phrases
■ To analyse the strength of the opinions (Wilson et al., 2004)
■ not only positive/negative/neutral sentiment
■ To explore models other than linear, multivariate regression ones
Sources
Related work:
- See the references list in our paper!
■ https://www.researchgate.net/publication/304244445_Predicting_Star_Ratings_based_on_Annotated_Reviews_of_Mobile_Apps

Contact:
dagmar@monettdiaz.com
monettdiaz