Towards Discovering 
the Role of Emotions 
in Stack Overflow 
N. Novielli, F. Calefato, F. Lanubile 
University of Bari, Italy 
{nicole.novielli, fabio.calefato, filippo.lanubile}@uniba.it
A new way to access knowledge 
SSE@FSE 2014 2
How Do Programmers Ask 
and Answer Questions? 
• Which questions are answered well and which ones remain unanswered? 
  (Treude et al., ICSE’11), (Asaduzzaman et al., MSR’13) 
• Can we predict how long a question will remain unanswered? 
  (Asaduzzaman et al., MSR’13) 
• What are the main discussion topics? 
  (Barua et al., ’12), (Bajaj et al., MSR’14) 
• What are the main factors affecting reputation? 
  (Bosu et al., MSR’13)
Emotions in Social 
Computing and SSE 
• Sentiment Analysis on Yahoo! Answers (Kucuktunc et al., WSDM’12) 
  • Answers perceived as good have a more neutral sentiment than others 
• Do developers feel emotions? (Murgia et al., MSR’14) 
  • Apache Software Foundation issue tracker 
• Sentiment Analysis of commit comments in GitHub (Guzman et al., MSR’13) 
  • Correlation with day and time, programming language, team distribution 
Research Question 
Getting emotional while asking or answering questions in Stack Overflow: good or bad? 
• Impact on success of questions 
• Impact on perceived quality of answers 
• Correlation with reputation 
• Correlation with topics 
• … 
Preliminary study 
• RQ1: To what degree does the emotional style of a question affect the probability of success? 
  • A successful question has an accepted answer 
Dataset distribution 
• Successful: 4,196,125 questions 
  • Accepted answers (58%) 
• Unsuccessful: 3,013,677 questions 
  • No accepted answers (31%) 
  • No answers (11%) 
Building the Model 
Post Properties 
• Title Length 
• Post Length 
• Code Blocks 
• Day 
• Time 
• Topic 
• # Comments 
Social Factors 
• Question Score 
• Answer Score 
• # Accepted answers provided 
• # Answers accepted 
• # Badges 
Affective Factors 
• Sentiment Polarity 
  • Polarity of Question/Answer 
  • Polarity of Comments 
• Lexical Cues of Affective States 
  • Positive emotions lexicon 
  • Negative emotions lexicon 
  • Gratitude 
  • Politeness 
  • Attitude of doubt 
  • … 
Control Model
The Model 
Post Properties Social Factors Affective Factors 
Control Model 
Independent variables, logistic regression model 
Dependent variable: success of a question (Y/N)
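A minimal sketch of how a logistic regression model of this kind maps predictor values to a probability of success. The intercept and the feature values are hypothetical (the slides do not report an intercept); the two coefficients are taken from the preliminary results in this deck purely for illustration.

```python
import math

def success_probability(intercept, coefficients, features):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(coefficients[name] * value
                             for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-linear))

# Coefficients as reported in the preliminary results of this deck.
coefficients = {"code_blocks": 0.2549, "num_comments": -0.3659}
# Hypothetical: the slides do not report the model intercept.
HYPOTHETICAL_INTERCEPT = 0.0

with_code = success_probability(
    HYPOTHETICAL_INTERCEPT, coefficients, {"code_blocks": 1, "num_comments": 0})
without_code = success_probability(
    HYPOTHETICAL_INTERCEPT, coefficients, {"code_blocks": 0, "num_comments": 0})

print(f"with code block: {with_code:.3f}, without: {without_code:.3f}")
```

The absolute probabilities depend entirely on the assumed intercept; only the direction of the difference between the two cases follows from the reported coefficient.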
Post Properties - Metrics 
• Title and Post Length: # words 
  • Althoff et al., @ICWSM’14; Asaduzzaman et al., @MSR’13 
  • Used by SO moderators for automatic filtering 
• Code Blocks: yes/no 
  • Treude et al., @ICSE’11 
• Day: in {weekday, weekend} 
  • Bosu et al., @MSR’13 
• Time: in {morning, afternoon, evening, night} 
  • Bosu et al., @MSR’13 
• Topic: categorical, using LDA 
  • Asaduzzaman et al., @MSR’13; Bosu et al., @MSR’13 
  • Harper et al., @CHI’08 
  • Barua et al., Empirical Software Engineering 2014 
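The post-property metrics could be derived roughly as follows. This is a sketch under stated assumptions, not the authors' extraction pipeline: the function name is hypothetical, the body is assumed to be HTML with code inside `<code>` tags, and the time-of-day boundaries are arbitrary because the slides do not define them. The LDA topic is omitted.

```python
from datetime import datetime

def post_properties(title, body_html, created_at):
    """Derive post-property predictors for one question (illustrative only)."""
    hour = created_at.hour
    # Assumed boundaries; the slides only name the four categories.
    if 6 <= hour < 12:
        time_of_day = "morning"
    elif 12 <= hour < 18:
        time_of_day = "afternoon"
    elif 18 <= hour < 24:
        time_of_day = "evening"
    else:
        time_of_day = "night"
    return {
        "title_length": len(title.split()),          # number of words
        "post_length": len(body_html.split()),       # crude: counts markup tokens too
        "code_blocks": "<code>" in body_html,        # yes/no
        "day": "weekend" if created_at.weekday() >= 5 else "weekday",
        "time": time_of_day,
    }

example = post_properties(
    "How do I reverse a list?",
    "<p>I tried this:</p><pre><code>xs.reverse()</code></pre>",
    datetime(2014, 11, 16, 21, 30),  # a Sunday evening
)
print(example)
```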
Social Factors - Metrics 
• Assessing the reputation of the author of the question at the time it is posted 
  • High status correlated with success on Reddit.com (Althoff et al., ICWSM’14) 
  • Novices’ questions are more likely to be answered on Stack Overflow (Treude et al., ICSE’11) 
• Metrics to approximate the author’s reputation 
  • Question Score: upvotes - downvotes on questions 
  • Answer Score: upvotes - downvotes on answers 
  • # Accepted answers provided 
  • # Answers accepted 
  • # Badges: total badges owned 
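The reputation proxies listed above reduce to simple arithmetic over an author's history. The record type and its field names below are hypothetical, and the reading of the two "accepted" counters (answers by the author that were accepted versus answers the author accepted) is an assumption.

```python
from dataclasses import dataclass

@dataclass
class AuthorHistory:
    """Hypothetical record of an author's activity before the question was posted."""
    question_upvotes: int
    question_downvotes: int
    answer_upvotes: int
    answer_downvotes: int
    accepted_answers_provided: int  # assumed: author's answers accepted by others
    answers_accepted: int           # assumed: answers the author marked as accepted
    badges: int

def social_factors(history: AuthorHistory) -> dict:
    """Compute the social-factor predictors as defined on the slide."""
    return {
        "question_score": history.question_upvotes - history.question_downvotes,
        "answer_score": history.answer_upvotes - history.answer_downvotes,
        "accepted_answers_provided": history.accepted_answers_provided,
        "answers_accepted": history.answers_accepted,
        "badges": history.badges,
    }

novice = AuthorHistory(3, 1, 0, 0, 0, 1, 2)
print(social_factors(novice))
```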
Affective Factors 
• Sentiment Polarity 
• Questions/Answers 
• Polarity of Comments 
Sentiment Analysis 
• Goal: Subjective vs. Objective, Negative vs. Positive 
• Example: ‘I can't solve this problem, it’s very frustrating’: Subjective, Negative 
• Resources 
  - SentiStrength (Thelwall et al., 2012) 
  - SentiWordNet (Esuli and Sebastiani, 2006) 
  - MPQA Lexicon (Wilson et al., EMNLP’05) 
  - … 
Emotion Detection 
• Goal: Classification using Discrete Emotion Labels 
• Example: ‘I can't solve this problem, it’s very frustrating’: Sad, Frustrated 
• Resources 
  - LIWC (Tausczik and Pennebaker, 2010) 
  - WordNet Affect (Strapparava and Valitutti, 2004) 
  - Depeche Mood (Staiano and Guerini, ACL’14) 
  - … 
Affective Factors 
• Sentiment Polarity (SentiStrength: http://sentistrength.wlv.ac.uk/) 
  • Question 
  • Polarity of Comments 
• Lexical Cues of Affective States (future work) 
  • Positive emotions lexicon 
  • Negative emotions lexicon 
  • Gratitude 
  • Politeness 
  • Attitude of doubt 
  • … 
SentiStrength 
• Estimates the strength of both positive and negative sentiment in questions and comments 
• Also robust for informal language 
• Used in previous research 
  • Sentiment Analysis of commit comments in GitHub (Guzman et al., MSR’13) 
  • Sentiment Analysis on Yahoo! Answers (Kucuktunc et al., WSDM’12) 
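To illustrate what a dual positive/negative score looks like, here is a toy lexicon-based scorer. It is not SentiStrength and does not call it: the mini-lexicon and the function name are hypothetical, and only the output convention (positive strength from 1 to 5, negative strength from -1 to -5, each taken from the strongest matching word) mirrors how SentiStrength reports results.

```python
import re

# Hypothetical mini-lexicon. The real SentiStrength lexicon is far larger and
# also handles boosters, negation, emoticons and misspellings.
TOY_LEXICON = {"frustrating": -3, "problem": -2, "thanks": 2, "great": 3}

def dual_scores(text):
    """Return (positive, negative): positive in 1..5, negative in -5..-1."""
    words = re.findall(r"[a-z']+", text.lower())
    strengths = [TOY_LEXICON.get(word, 0) for word in words]
    positive = max([1] + [s for s in strengths if s > 0])   # 1 means no positive cue
    negative = min([-1] + [s for s in strengths if s < 0])  # -1 means no negative cue
    return positive, negative

print(dual_scores("I can't solve this problem, it's very frustrating"))
print(dual_scores("Thanks, great answer"))
```

With this toy lexicon the first sentence gets no positive cue and a negative strength driven by "frustrating", while the second gets only a positive strength.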
Preliminary results - Post Properties 

Predictor          Coeff      Odds Ratio 
Code Blocks         0.2549    1.29 
# of comments      -0.3659    0.69 
Day (Weekend)       0.0131    1.01 
Time 
  Afternoon         0.1418    1.15 
  Evening           0.2093    1.23 
  Night             0.1085    1.12 
Post Length 
  Body Length      -0.0004    0.99 
  Title Length     -0.0039    0.99 
All significant, with α = 0.05 

• Review questions are more concrete and get more answers (Treude et al., ICSE’11), while vague questions remain unanswered (Asaduzzaman et al., MSR’13) 
• SO off-peak hours (night): longer answer interval and fewer questions posted (Barua et al., MSR’13)
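In a logistic regression, the odds ratio for a predictor is the exponential of its coefficient, so the two columns of the table carry the same information. A short check on three of the reported coefficients (the dictionary name is illustrative):

```python
import math

# Coefficients copied from the post-properties results table.
coefficients = {
    "Code Blocks": 0.2549,
    "# of comments": -0.3659,
    "Evening": 0.2093,
}

# Odds ratio = exp(coefficient): >1 raises the odds of success, <1 lowers them.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Read this way, a question with a code block has about 29% higher odds of getting an accepted answer than one without, other predictors held fixed.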
Post properties: Topic 

Topic                        Coeff     Odds Ratio 
DATABASES/PERFORMANCE        0.4062    1.50 
WEB PROGRAMMING              0.2725    1.31 
GRAPHICS                     0.2415    1.27 
WEB PROGRAMMING/HTTP         0.1441    1.16 
JAVA                         0.0029    1.00 
OOP                          0.8599    2.36 
MOBILE DEVELOPMENT/iOS       0.2664    1.30 
SOURCE CODE MANAGEMENT       0.2805    1.32 
DATA STRUCTURE/ALGORITHMS    0.7340    2.08 
.NET FRAMEWORK/ASP           0.3442    1.41 
SCRIPTING                    0.3649    1.44 
DATABASES/SQL                0.4488    1.57 
WEB APP DEVELOPMENT          0.3330    1.40 
MOBILE DEV/ANDROID           0.1111    1.12 
All significant, with α = 0.05
Success rate per topic 

Topic                        Topic #   Success rate   Number of questions   Post rate 
OOP                          6         70.81%         630258                8.84% 
DATA STRUCTURE/ALGORITHMS    9         67.73%         798713                11.20% 
DATABASES/SQL                12        61.12%         582130                8.16% 
.NET FRAMEWORK/ASP           10        58.73%         518834                7.28% 
SCRIPTING                    11        58.54%         497763                6.98% 
WEB APP DEVELOPMENT          13        58.47%         492173                6.90% 
DATABASES/PERFORMANCE        0         57.72%         415825                5.83% 
WEB PROGRAMMING              1         56.59%         536255                7.52% 
SOURCE CODE MANAGEMENT       8         55.37%         373397                5.24% 
GRAPHICS                     2         54.37%         383376                5.38% 
MOBILE DEVELOPMENT/iOS       7         53.91%         376517                5.28% 
WEB PROGRAMMING/HTTP         3         52.22%         375510                5.27% 
MOBILE DEV/ANDROID           14        51.50%         432095                6.06% 
JAVA                         5         49.35%         235489                3.30% 
WEB AUTHENTICATION/API       4         49.00%         482992                6.77%
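The post rate column can be reproduced from the question counts. In this sketch the denominator is assumed to be the sum of the per-topic counts, since that is what matches the percentages shown; the variable names are illustrative.

```python
# Question counts per topic, copied from the table.
topic_counts = {
    "OOP": 630258,
    "DATA STRUCTURE/ALGORITHMS": 798713,
    "DATABASES/SQL": 582130,
    ".NET FRAMEWORK/ASP": 518834,
    "SCRIPTING": 497763,
    "WEB APP DEVELOPMENT": 492173,
    "DATABASES/PERFORMANCE": 415825,
    "WEB PROGRAMMING": 536255,
    "SOURCE CODE MANAGEMENT": 373397,
    "GRAPHICS": 383376,
    "MOBILE DEVELOPMENT/iOS": 376517,
    "WEB PROGRAMMING/HTTP": 375510,
    "MOBILE DEV/ANDROID": 432095,
    "JAVA": 235489,
    "WEB AUTHENTICATION/API": 482992,
}

total = sum(topic_counts.values())

# Post rate = share of all topic-labelled questions that fall in each topic.
post_rate = {topic: count / total for topic, count in topic_counts.items()}

for topic, rate in sorted(post_rate.items(), key=lambda item: -item[1])[:3]:
    print(f"{topic}: {rate:.2%}")
```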
Preliminary Results - Social Factors 

Predictor                    Coeff      Odds Ratio 
User Question Score*        -0.0017     0.99 
User Answer Score*          -0.0002     0.99 
User Answers Accepted*       0.0047     1.00 
User Questions Accepted*     0.0078     1.00 
Number Of Badges             0.0001     1.0001103 
*significant with α = 0.05
Preliminary Results - Affective Factors 

Predictor                    Coeff      Odds Ratio 
Sentiment of the question 
  Question Positive Score   -0.0248     0.98 
  Question Negative Score   -0.0083     0.99 
Sentiment of the author’s comments 
  Comment Positive Score    -0.1813     0.83 
  Comment Negative Score    -0.1080     0.90 
All significant, with α = 0.05 
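Because coefficients act multiplicatively on the odds, the effect of a predictor compounds over several points of its scale. The sketch below assumes the predictor is the raw SentiStrength positive score, which ranges from 1 to 5; the slides do not state how the score enters the model, so treat the second figure as illustrative.

```python
import math

# Coefficient for "Comment Positive Score" from the affective-factors table.
COMMENT_POSITIVE_COEF = -0.1813

def odds_multiplier(coefficient, delta):
    """Factor by which the odds of success change when a predictor moves by delta."""
    return math.exp(coefficient * delta)

one_step = odds_multiplier(COMMENT_POSITIVE_COEF, 1)    # one point on the scale
full_range = odds_multiplier(COMMENT_POSITIVE_COEF, 4)  # assumed range: score 1 to 5

print(f"one point: x{one_step:.2f}, full range: x{full_range:.2f}")
```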
Impact of Positive Sentiment on Success 
[Figure: two panels, positive polarity of QUESTION and positive polarity of COMMENTS] 
Impact of Negative Sentiment on Success 
[Figure: two panels, negative polarity of QUESTION and negative polarity of COMMENTS] 
Problems in detecting sentiment 
• The ‘problem’ lexicon is too peculiar to the domain to be considered a pure expression of negative emotions 
  • Actually describing emotions 
    • ‘I have very simple and stupid trouble […] I'm pretty confused, explain please, what is wrong?’ (neg=-2) 
    • ‘Sorry for troubling you guys’ (neg=-2) 
  • Simply describing a problem 
    • ‘What is the best way to kill a critical process?’ (neg=-2) 
    • ‘What is wrong?’ (neg=-2) 
  • Mixed 
    • ‘I’m missing a parenthesis. But where? :(’ (neg=-3) 
Preliminary qualitative analysis using LIWC 
- Thanks! 
- Positive score = 3 
Next steps 
• Separate positive emotions from gratitude expressions 
  • Qualitative analysis of the first 1000 questions with the highest positive sentiment score 
    • Gratitude and politeness are the most frequent cases 
    • ‘Cheers’, ‘Thanks (in advance)’, ‘Thank you’, … 
    • Gratitude is positively associated with the success of requests (Althoff et al., 2014) 
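One simple way to start separating gratitude from other positive sentiment is to flag the expressions named above with a pattern match. This is only a sketch: the pattern covers just the three expressions listed on the slide, and the function name is hypothetical.

```python
import re

# Pattern built from the expressions listed on the slide; a real study would
# likely need a broader, validated gratitude lexicon.
GRATITUDE = re.compile(
    r"\b(cheers|thanks(\s+in\s+advance)?|thank\s+you)\b",
    re.IGNORECASE,
)

def has_gratitude(text):
    """True if the text contains one of the listed gratitude expressions."""
    return GRATITUDE.search(text) is not None

for sample in ["Thanks in advance!", "Cheers, that worked", "What is wrong?"]:
    print(sample, "->", has_gratitude(sample))
```

Questions flagged this way could then be scored separately, so that a polite closing line is not counted as a positive emotion about the problem itself.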
Next steps 
• Further lexical analysis 
  • Assessing the suitability of state-of-the-art tools for sentiment analysis 
  • Modeling the ‘success lexicon’ 
• Classification study: is success predictable? 
  • Preliminary results: 0.67 accuracy 
• Investigate other research questions 
  • Emotions and perceived quality of answers 
  • Emotions and reputation 
  • Emotions and topics 
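For context on the 0.67 accuracy figure, it helps to compare it with a trivial classifier that always predicts the majority class. The sketch below assumes evaluation on the full, unbalanced dataset reported earlier in the deck, which the slides do not state; if a balanced sample was used, the baseline would be 0.5 instead.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Class sizes from the dataset distribution slide.
successful, unsuccessful = 4_196_125, 3_013_677

# Accuracy of always predicting the larger class (assumes unbalanced evaluation).
majority_baseline = max(successful, unsuccessful) / (successful + unsuccessful)

print(f"majority-class baseline: {majority_baseline:.3f}")
print(f"toy accuracy example: {accuracy([1, 0, 1, 1], [1, 1, 1, 0]):.2f}")
```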
Towards Discovering the Role of Emotions in Stack Overflow
Thank you 
N. Novielli, F. Calefato, F. Lanubile 
University of Bari, Italy 
{nicole.novielli, fabio.calefato, filippo.lanubile}@uniba.it

Editor's Notes

  • #20: How does this relate to previous research in this domain? How does this relate to reputation and expert distribution in the Stack Overflow community?