Data Augmentation for Improving Emotion Recognition in Software Engineering Communication
Mia Mohammad Imran, Yashasvi Jain, Preetha Chatterjee, Kostadin Damevski
ASE 2022 - Research Paper
Motivation
● Developers often show emotions (joy, anger, etc.) in their communications.
Appreciation 🙏
“@[USER] Thank you, Stephen. I hope in the future Angular will become even better and easier to understand. However, first of all, I am grateful to Angular for making me grow as a developer.”
Toxic 🤬
“Soooooooooooo you’re setting Angular on fire and saying bold shit in bold like the Angular team don’t care about you cause you found relative pathing has an issue is an odd area”
Motivation
● General-purpose emotion classification tools are not effective on Software
Engineering corpora.
● Researchers developed SE-specific tools to recognize emotions.
○ These tools do not perform very well [1]. On a StackOverflow dataset:
■ Joy: F1-score ranges from 0.37 to 0.47.
■ Fear: F1-score ranges from 0.22 to 0.40.
● Most likely problem: lack of large, high-quality datasets of software developers'
emotions in communication channels.
[1] Chen et al. “Emoji-powered sentiment and emotion detection from software developers' communication data.” TOSEM, 2021
Data Collection
● Selected 4 popular OSS repositories with over 50k GitHub stars.
● Total: 2000 comments (1000 positive and 1000 negative).
[1] Biswas et al., “Achieving reliable sentiment analysis in the software engineering domain using bert.” ICSME, 2020.
Emotion Categorization
● There are a number of models of emotions.
● Most popular in SE is Shaver’s emotion categorization.
○ 6 primary categories:
■ Anger 😡
■ Love ❤
■ Fear 😨
■ Joy 😊
■ Sadness 😥
■ Surprise 😲
○ 25 secondary categories and over 100 tertiary categories.
Shaver’s Categories Are Not a Perfect Match
● “I’m curious about this - can you give more context on what exactly goes
wrong? Perhaps if that causes bugs this should be prohibited instead?”
○ Expresses Curiosity 🤔
● “And, I am a little confused, if there is not any special folder, according to the
module resolution [URL] How could file find the correct modules? Did I miss
something?”
○ Expresses Confusion 😕
Shaver’s Categories Are Not a Perfect Match
● To mitigate the problem, we combine Shaver’s categories with GoEmotions (2020),
a recent text-based emotion dataset by Google that has 27 categories.
○ We provide a mapping from the GoEmotions categories to the primary emotions:
■ 👍 Approval to 😊 Joy
■ 👎 Disapproval to 😡 Anger
■ 🤔 Curiosity to 😲 Surprise
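The mapping above can be pictured as a simple lookup table. The following is a minimal Python sketch under stated assumptions: it contains only the three pairs listed on the slide (the full mapping is not reproduced here), and the names `GOEMOTIONS_TO_SHAVER` and `to_primary_emotion` are hypothetical, not taken from the authors' code.

```python
from typing import Optional

# Only the three pairs shown on the slide; the remaining GoEmotions
# categories are intentionally left unmapped in this sketch.
GOEMOTIONS_TO_SHAVER = {
    "approval": "joy",
    "disapproval": "anger",
    "curiosity": "surprise",
}

# Shaver's six primary emotion categories.
SHAVER_PRIMARY = {"anger", "love", "fear", "joy", "sadness", "surprise"}


def to_primary_emotion(label: str) -> Optional[str]:
    """Return the primary emotion for a label, or None if it is unmapped here."""
    key = label.strip().lower()
    if key in SHAVER_PRIMARY:
        return key  # already a primary emotion
    return GOEMOTIONS_TO_SHAVER.get(key)
```

For example, `to_primary_emotion("Curiosity")` yields `"surprise"`, while a category that this sketch does not cover yields `None`.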
Studied Tools for Emotion Classification in SE
Tool            Approach            Features
ESEM-E [1]      SVM                 Unigram, bigram
EMTk [2]        SVM                 Unigram, bigram, emotion lexicon, polarity, mood
SEntiMoji [3]   Transfer learning   DeepMoji representation model
[1] Murgia et al., “An exploratory qualitative and quantitative analysis of emotions in issue report comments of open source systems.”, ESEM, 2018
[2] Calefato et al., “Emtk-the emotion mining toolkit.” SEmotion, 2019
[3] Chen et al. “Emoji-powered sentiment and emotion detection from software developers' communication data.” TOSEM, 2021
How Do the Tools Perform
● F1-score similar across all three tools.
● Overall precision significantly higher than recall.
○ The tools predicted conservatively.
■ They more often predicted that an utterance lacks a given emotion.
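To make the link between "precision higher than recall" and conservative prediction concrete, here is a small Python sketch of the standard precision, recall, and F1 formulas. The counts are made-up illustrative numbers, not results from the paper, and the function name is hypothetical.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical conservative classifier: it rarely predicts the emotion,
# so it makes few false positives but misses many true instances.
p, r, f1 = precision_recall_f1(tp=40, fp=10, fn=60)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

With few false positives and many false negatives, precision stays high while recall drops, which is the pattern the slide describes.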
How Do the Tools Perform
● The false positive instances are broadly spread.
● A majority (58%) of the false negative instances are shared among the tools.
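One plausible way to quantify how many false negatives the tools share is a set intersection over the instances each tool missed. The slide does not state the exact definition behind the 58% figure, so the sketch below is an assumption: it counts instances missed by every tool, relative to instances missed by at least one tool. The instance IDs are toy values and `shared_fn_ratio` is a hypothetical name.

```python
from typing import Dict, Set


def shared_fn_ratio(fn_by_tool: Dict[str, Set[int]]) -> float:
    """Fraction of false negatives (missed by any tool) that every tool missed."""
    sets = list(fn_by_tool.values())
    if not sets:
        return 0.0
    union = set().union(*sets)
    if not union:
        return 0.0
    shared = set.intersection(*sets)
    return len(shared) / len(union)


# Toy false-negative instance IDs per tool (illustrative only).
toy_fns = {
    "ESEM-E": {1, 2, 3, 4},
    "EMTk": {2, 3, 4, 5},
    "SEntiMoji": {2, 3, 6},
}
ratio = shared_fn_ratio(toy_fns)
```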
Error Analysis of FNs
● Analyzed 176 FN instances using Novielli et al.’s categorization [1].
[1] Novielli et al., “A benchmark study on sentiment analysis for software engineering research.” MSR, 2018
Error Analysis of FNs
● General Error: the inability to recognize lexical cues that occur in the text.
○ “that’s awesome, I’ve been needing this for a while”
● Implicit Sentiment Polarity: humans use common knowledge to recognize
emotions that the tools miss.
○ “This was actually causing this test-case not to be executed!”
Data Augmentation
● Hypothesis: more training data should reduce errors in some categories.
● Data Augmentation: a technique for creating new training instances by
targeted modification.
● The new instance is:
○ different from the source instance.
○ label invariant.
“awesome! I'm glad you know about this trick.”
→ Data Augmentation →
“awesome! I'm happy you know about this trick.”
Data Augmentation: Unconstrained Strategy
● Four operators: insert, substitute, delete and shuffle.
● Used BART [1] generative model for insert and substitute operations.
[1] Lewis et al., “BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and
Comprehension.” ACL, 2020
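The four operators can be sketched at the token level as follows. This is a minimal illustration, not the authors' implementation: the slides use BART to generate words for insert and substitute, whereas here a toy stand-in function (`toy_propose_word`, a hypothetical name) always returns a fixed word so the example runs without a model.

```python
import random
from typing import Callable, List

ProposeFn = Callable[[List[str], int], str]


def toy_propose_word(tokens: List[str], position: int) -> str:
    """Stand-in for a generative model such as BART (assumption for illustration)."""
    return "really"


def insert(tokens: List[str], rng: random.Random, propose: ProposeFn = toy_propose_word) -> List[str]:
    pos = rng.randrange(len(tokens) + 1)  # any gap, including both ends
    return tokens[:pos] + [propose(tokens, pos)] + tokens[pos:]


def substitute(tokens: List[str], rng: random.Random, propose: ProposeFn = toy_propose_word) -> List[str]:
    pos = rng.randrange(len(tokens))
    return tokens[:pos] + [propose(tokens, pos)] + tokens[pos + 1:]


def delete(tokens: List[str], rng: random.Random) -> List[str]:
    if len(tokens) <= 1:
        return list(tokens)  # never produce an empty utterance
    pos = rng.randrange(len(tokens))
    return tokens[:pos] + tokens[pos + 1:]


def shuffle(tokens: List[str], rng: random.Random) -> List[str]:
    out = list(tokens)
    rng.shuffle(out)
    return out


OPERATORS = {"insert": insert, "substitute": substitute, "delete": delete, "shuffle": shuffle}


def augment(text: str, operator: str, seed: int = 0) -> str:
    """Apply one operator to a whitespace-tokenized utterance."""
    tokens = text.split()
    if not tokens:
        return text
    rng = random.Random(seed)
    return " ".join(OPERATORS[operator](tokens, rng))
```

Because nothing constrains which word is changed or what replaces it, this kind of augmentation can alter the emotion of the utterance.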
Data Augmentation: Unconstrained Strategy
● The Unconstrained Strategy sometimes introduces noise.
○ Source:
“This looks good, thanks for clarifying the docs.”
○ Augmented:
“This looks worse, thanks for clarifying the docs.”
Data Augmentation: Lexicon-based Strategy
● Insert or substitute a word using an SE-specific emotion lexicon.
○ The emotion of the word is the same as the annotation of the utterance.
● The SE-specific emotion lexicon comes from Mäntylä et al. [1].
● Example:
○ Source:
“This looks good, thanks for clarifying the docs.”
○ Augmented (word from the ‘Joy’ lexicon):
“This looks wonderful, thanks for clarifying the docs.”
[1] Mäntylä et al., “Bootstrapping a lexicon for emotional arousal in software engineering.” MSR, 2017
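A lexicon-constrained substitution could look like the sketch below. The word lists in `TOY_LEXICON` are invented for illustration and are not taken from the Mäntylä et al. lexicon; the function name and the choice to replace the first matching word are likewise assumptions.

```python
import random
import string
from typing import Dict, List

# Toy lexicon keyed by emotion (illustrative words only).
TOY_LEXICON: Dict[str, List[str]] = {
    "joy": ["good", "wonderful", "great"],
    "anger": ["awful", "terrible", "annoying"],
}


def lexicon_substitute(text: str, emotion: str,
                       lexicon: Dict[str, List[str]] = TOY_LEXICON,
                       seed: int = 0) -> str:
    """Replace the first word found in the emotion's lexicon with another word
    from the same emotion's lexicon, so the annotated emotion is preserved."""
    rng = random.Random(seed)
    words = lexicon.get(emotion, [])
    tokens = text.split()
    for i, token in enumerate(tokens):
        core = token.strip(string.punctuation)  # keep surrounding punctuation
        if core and core.lower() in words:
            alternatives = [w for w in words if w != core.lower()]
            if not alternatives:
                continue
            tokens[i] = token.replace(core, rng.choice(alternatives), 1)
            return " ".join(tokens)
    return text  # nothing to substitute for this emotion
```

Restricting replacements to words with the same emotion as the utterance's label is what keeps the new instance label invariant.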
Data Augmentation: Polarity-based Strategy
● Same four operators as Unconstrained Strategy.
○ Delete word only if it has neutral polarity.
Emotion group        Emotions               Polarity constraint
Positive Emotions    Love, Joy              Increase or preserve positive polarity
Negative Emotions    Anger, Fear, Sadness   Increase or preserve negative polarity
Ambiguous Emotions   Surprise               No changes in polarity
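The polarity constraint can be read as a filter over candidate augmentations. The sketch below assumes a toy word-score polarity function (`toy_polarity`, with invented scores) in place of whatever polarity tool the authors used; only the emotion grouping and the three constraints come from the slide.

```python
import string
from typing import Callable

POSITIVE = {"love", "joy"}
NEGATIVE = {"anger", "fear", "sadness"}
AMBIGUOUS = {"surprise"}

# Invented word scores, for illustration only.
TOY_SCORES = {"good": 1, "wonderful": 2, "thanks": 1, "worse": -1}


def toy_polarity(text: str) -> int:
    """Sum of per-word scores; unknown words count as neutral (0)."""
    return sum(TOY_SCORES.get(tok.strip(string.punctuation).lower(), 0)
               for tok in text.split())


def keeps_polarity(source: str, augmented: str, emotion: str,
                   polarity: Callable[[str], int] = toy_polarity) -> bool:
    """Accept an augmented utterance only if it satisfies the polarity constraint."""
    before, after = polarity(source), polarity(augmented)
    if emotion in POSITIVE:
        return after >= before  # increase or preserve positive polarity
    if emotion in NEGATIVE:
        return after <= before  # increase or preserve negative polarity
    if emotion in AMBIGUOUS:
        return after == before  # no changes in polarity
    raise ValueError(f"unknown emotion: {emotion}")
```

Under these toy scores, replacing "good" with "worse" in a Joy utterance is rejected, while replacing it with "wonderful" is accepted.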
Data Augmentation: Results
● Overall, the Polarity-based strategy performed best.
Data Augmentation: Takeaway
● Data augmentation helps the tools recognize lexical cues they previously missed.
○ “that’s awesome, I’ve been needing this for a while”
● Data augmentation does not seem to help in identifying implicit emotions.
○ “This was actually causing this test-case not to be executed!”
● Polarity strategy worked best, likely because it provided a balance between:
○ completely unconstrained augmentation and highly constrained
augmentation.
Summary of Contributions
● Manually annotated 2000 GitHub utterances.
● Extension of emotion taxonomy.
● Qualitative error analysis of three existing SE emotion classification tools.
● Demonstration and evaluation of three data augmentation approaches.
● Annotation instructions, annotated dataset, and source code for data
augmentation are publicly available.
Questions/Thoughts/Collaboration Ideas to:
Mia Mohammad Imran, imranm3@vcu.edu
