Automatically Identifying the Quality of
Developer Chats for Post Hoc Use
Preprint: https://preethac.github.io/files/TOSEM21.pdf
@PreethaChatterj
preethac@drexel.edu https://preethac.github.io
Transactions on Software Engineering and Methodology (TOSEM)
Journal-first presentation at ASE 2022
Preetha Chatterjee, Kostadin Damevski, Nicholas A. Kraft, Lori Pollock
Why Analyze Quality of Information?
Built-in mechanisms in Q&A forums
Accepted answers, vote counts, user reputation
Prior research [Sillito ’12, Duijn ’15, Yang ’16]
Conciseness of answers, code readability
❌ No quality assessment mechanism for chats
First step to building effective data-driven software tools
• API recommendation systems
• Virtual assistants for programming help
• Enhance quality of search engines
Consider this Conversation
Author Utterance
Alexia Hi, I have a file with following contents
1234 alphabet /vag/one/arun > 1454 bigdata /home/two/ogra > 5684 apple /vinay/three/dire,
but i want the output to be like
1234 alphabet one > 1454 bigdata two > 5684 apple three
Elaina sed -r 's|(.+)/[^/]+/([^/]+)/.+|\1\2|g'
Corina Even though I dont have anything to do with this question, could you explain the logic behind the
answer? The formatting sentence seem so random
Elaina `sed -r` is an extended mode, so + is enabled (matches one or more characters, unlike * that matches
zero or more); s///g or s|||g or any symbol instead of | is how a basic replacing expression is
constructed.
The first field is what to match, the second is what to replace it with.; (.+)/[^/]+/([^/]+)/.+ ; (.+)/
matches anything from the start until the first / and puts found characters in the first group (\1);
[^/]+/ matches anything that is not a slash, and then a slash (`vag/` or `home/`); ([^/]+)/ matches the
same thing, but puts the stuff found in-between slashes in the second group \2; and then .+ matches
whatever comes next to the end of line; and the second field tells sed to replace the line with \1\2,
so our saved groups side-by-side: the first group was everything before the first slash, and the
second group was the stuff between 2nd and 3rd slashes
Corina Ah ok, thanks a lot for the explanation!
• Concise
• Details of the problem and solution
• Indication of answer acceptance
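The substitution discussed in this conversation can be checked quickly in Python. This is a minimal sketch, assuming the replacement field is the two back-references `\1\2` (as Elaina's explanation implies) and that each record sits on its own line; the function name `keep_second_path_component` is hypothetical.

```python
import re

# Same pattern as the sed expression. For lines with exactly three slashes,
# group 1 is everything before the first slash and group 2 is the text
# between the second and third slashes.
PATH_PATTERN = re.compile(r"(.+)/[^/]+/([^/]+)/.+")


def keep_second_path_component(line: str) -> str:
    """Replace the matched text with group 1 followed by group 2."""
    return PATH_PATTERN.sub(r"\1\2", line)


if __name__ == "__main__":
    for record in (
        "1234 alphabet /vag/one/arun",
        "1454 bigdata /home/two/ogra",
        "5684 apple /vinay/three/dire",
    ):
        print(keep_second_path_component(record))
```

Because `(.+)` is greedy, a line with more than three slashes would keep more text in group 1, so the sketch only mirrors the behavior on input shaped like Alexia's example.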
Now Consider This Conversation
Author Utterance
Cody Hello guys I got a huge problem
Holli Cody: ask away
Cody We’ve been ask as assignment the implementation of Dijkstra’s and Bellman Ford’s algorithm for
calculating the shortest path in a given graph
Holli So what’s the issue?; run into a problem?
Cody I don’t really know how to start and that’s my problem
… ….
Darrin Cody: how much experience do you have writing code?; for example, there are quite a number of
existing examples of the algorithms you’re talking about
Cody basic i’m just starting
Darrin ok; can you describe the steps on how you execute the algorithm?; and do you understand why
those steps are necessary?; if so, then the next step you take is translating your written description
of the process into pseudocode; once you have a reasonable sequence of actions, you then
implement the pseudocode in your language of choice; frankly, the first two items are always the
most difficult; because it requires you to understand the problem domain; once you understand it,
making it work is usually much less effort
Rachel Cody: oof graph theory for a beginner. do you understand how those algorithms work, ?
Cody Yes I understand how those work
Darrin just having trouble translating described steps to code?
• Lengthy
• Lacks relevant details of the problem
• Too much noise
Post Hoc Quality Conversations
A conversation is considered post hoc quality based on
the availability and ease of identifying information
to gain useful software-related knowledge
Recruited human judges to analyze 400 conversations
Automatically Identify Post Hoc Quality Conversations
Pipeline: Developer Chats → Extraction of Features → Classification → Prediction of Quality
Extraction of Features
• Knowledge Seeking/Sharing
• Contextual
• Succinct
• Well Written
• Participant Experience
Classification
• Logistic Regression
• Stochastic Gradient Boosted Trees
• Random Forest
• Sequential Neural Network
Prediction of Quality (Binary Prediction)
• Post Hoc
• Non Post Hoc
Features to Identify Post Hoc Quality Conversation
Attributes of conversation
Knowledge Seeking / Sharing
1. Primary question?
2. Knowledge-seeking question?
3. Accepted answers?
4. #Authors
Contextual
1. #API Mentions
2. #URL
3. Code
4. Code Description
5. Size of code
6. Error Message
7. #Software Specific terms
Succinct
1. #Utterances
2. #Sentences
3. #Words
4. Time Span
5. #Text Speaks
6. #Questions
7. Unique Information
8. Avg Shortest Path
9. Avg Graph Degree
Well written
1. #Misspellings
2. #Incomplete Sentences
3. Readability Metrics
Participant Experience
1. Questioner Experience
2. Participants Experience
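Several of the count-based attributes listed above (#Utterances, #Sentences, #Words, #Questions, #Authors, Time Span) could be computed along the following lines. This is a rough sketch, not the authors' extraction code: the `Utterance` type, the function name, and the naive sentence and word splitting rules are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Utterance:
    """Hypothetical record for one chat message."""
    author: str
    text: str
    timestamp: datetime


def count_features(conversation: list[Utterance]) -> dict[str, float]:
    """Compute simple count features for one conversation."""
    texts = [u.text for u in conversation]
    # Assumed rule: a sentence ends at ., !, ? or ; followed by space or end of text.
    sentences = [
        part
        for text in texts
        for part in re.split(r"[.!?;]+(?:\s+|$)", text)
        if part.strip()
    ]
    # Assumed rule: words are whitespace-separated tokens.
    words = [word for text in texts for word in text.split()]
    times = [u.timestamp for u in conversation]
    span = (max(times) - min(times)).total_seconds() if times else 0.0
    return {
        "n_utterances": len(conversation),
        "n_sentences": len(sentences),
        "n_words": len(words),
        "n_questions": sum(text.count("?") for text in texts),
        "n_authors": len({u.author for u in conversation}),
        "time_span_seconds": span,
    }
```

Features such as misspellings, readability, or participant experience need external resources (dictionaries, user histories) and are left out of this sketch.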
Gold Set
Community (Slack Channels) #Conv
pythondev#help 400
clojurians#clojure 400
elmlang#beginners 400
elmlang#general 400
racket#general 400
Total 2k
Evaluation Methodology
# Post Hoc = 1310
# Non Post Hoc = 690
RQ1: How effective are machine learning-based techniques for
automatic identification of post hoc quality developer chats?
Evaluation Results
• Stochastic Gradient Boosted Trees (SGBT)
• Logistic Regression (LR)
• Random Forest (RF)
• Sequential Neural Network (SNN)
Baseline: heuristic that treats a conversation as software-related if it contains at least one code segment
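A code-presence heuristic of this kind can be sketched in a few lines. The slides do not say how code segments were detected, so the backtick-based rule below (Slack-style inline and block code) and the function name `has_code_segment` are assumptions for illustration only.

```python
import re

# Assumption: code is marked Slack-style, with single backticks for inline
# code or triple backticks around a block.
FENCE = "`" * 3
CODE_SEGMENT = re.compile(FENCE + r".+?" + FENCE + r"|`[^`\n]+`", re.DOTALL)


def has_code_segment(utterances: list[str]) -> bool:
    """Return True if any utterance contains at least one marked code segment."""
    return any(CODE_SEGMENT.search(text) is not None for text in utterances)
```

A heuristic like this ignores everything except the presence of code, so it cannot tell a well-explained answer from an unanswered question that happens to paste a snippet.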
Machine learning-based techniques outperform heuristic-based baseline
SNN provides best performance, with F1 and AUC = 0.86, MCC = 0.55
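To make the reported measures concrete, the sketch below computes F1 and MCC from confusion-matrix counts. As a purely hypothetical illustration (not a result from the paper), it can be applied to a classifier that labels all 2000 gold-set conversations as post hoc, taking post hoc as the positive class: 1310 true positives, 690 false positives, and no negatives predicted.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for the positive class: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient over all four confusion-matrix cells."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical "always post hoc" classifier on the 1310 / 690 gold set.
    print(round(f1_score(tp=1310, fp=690, fn=0), 3))
    print(mcc(tp=1310, fp=690, tn=0, fn=0))
```

For this degenerate classifier F1 is about 0.79 while MCC is 0, which illustrates why MCC is the more demanding measure on an unbalanced gold set. The paper may average F1 differently, so these numbers are not comparable to the reported 0.86.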
RQ2: Which features result in more effective automatic identification?
Top 8 Features Information Gain
#Utterances 0.204
#Sentences 0.182
Software-specific Terms 0.174
#Words 0.171
#Authors 0.158
Time Span 0.156
Participants’ Experience 0.151
Avg Graph Degree 0.146
Most informative feature groups: length, coherence, topic of discussion, participant knowledge
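Information gain measures how much knowing a feature reduces the entropy of the class label. The slides do not state how numeric features such as #Utterances were discretized, so the sketch below assumes a single threshold split; the function names and the toy data in the usage note are hypothetical.

```python
import math
from collections import Counter


def entropy(labels: list[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values: list[float], labels: list[str], threshold: float) -> float:
    """Entropy reduction from splitting a numeric feature at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    # Weighted entropy of the two partitions; empty partitions contribute nothing.
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (left, right) if part
    )
    return entropy(labels) - remainder
```

For example, with utterance counts `[3, 4, 14, 20]`, labels `["NPH", "NPH", "PH", "PH"]`, and a threshold of 5, the split separates the classes perfectly and the gain equals the full label entropy of 1 bit.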
RQ3: What types of conversations are difficult to automatically
detect as post hoc quality using our techniques?
FP: classifiers struggled to distinguish conversations based on the quality of answers
FN: very short conversations (3-4 utterances) that offer too little content for our features
Summary: Identifying Post Hoc Quality Conversations
Machine learning-based approach to automatically identify
post hoc quality developer conversations
• Best performance using Sequential Neural Network
with F-measure and AUC of 0.86, MCC of 0.55
• Most informative quality features:
Length, coherence, topic of discussion, and participant experience
Significance: Advances the field of information mining by using high-quality
information from developer chats
• Efficient information gathering towards building software maintenance tools
• Enrich existing knowledge-bases and community knowledge

Editor's Notes

  • #2: Thank you for providing me the opportunity to present my research. The topic of my talk is <>
  • #3: Understanding the quality of the information in the mining source is the essential first step in building effective data-driven software tools. Some developer communication platforms, such as Stack Overflow, which has been widely used as a mining resource, contain built-in mechanisms of quality assessment such as <>. Beyond the built-in mechanisms, researchers have proposed ways to assess the quality of information in Q&A forums <>. In chats, there is no formal mechanism or analysis for quality assessment. Quality feedback is signaled in the flow of the conversation, mostly using textual clues or emojis.
  • #4: Chat conversations vary significantly in quality. To understand our notion of quality in chats, let's consider this conversation. Alexia seeks help with automatically modifying the format of text in a file. Elaina suggests a regular expression as the answer and then explains why it works. The conversation also contains an indication of answer acceptance. This conversation is easy to read and understand, and the indicators of answer acceptance give readers a sense of verification and confidence in the correctness of the information. TS: Now let's look at another example.
  • #5: This is a conversation where Cody asks for suggestions on how to implement specific algorithms, but the question lacks relevant details about the problem. As a result, there are too many clarification questions and too much noise. This is a lengthy conversation, and the slide shows only part of it due to space constraints. Overall, the length and the presence of noise make it difficult for both software engineers and mining tools to extract specific information, so the conversation is not suitable for task-based tools such as question and answer (Q&A) extraction.
  • #6: To automatically assess quality, we adopted a data-driven approach. We conducted a human study in which we recruited human judges to analyze the quality of conversations. We found that <>. Based on the results of the study, we identified the following characteristics of good, or post hoc quality, conversations, i.e., conversations containing useful information for mining or reading after the conversation has ended: references to other resources … explanation of suggested code or proposed solution.
  • #7: We formulated the problem as a supervised binary classification task. First, we extract the values of five sets of features; then we train multiple classifiers: logistic regression, ensemble-based machine learning techniques (Random Forest and Stochastic Gradient Boosted Trees), and a simple Sequential Neural Network (SNN). These techniques have been shown by previous research to perform well on related software engineering tasks, such as detection of low-quality questions on Stack Overflow. Our SNN has three hidden layers; each layer consists of 64 neurons (twice the number of our features) and uses the ReLU activation function. It is trained with a logarithmic loss function and the Adam optimization algorithm for gradient descent.
  • #8: Here I describe in more detail the five sets of features used to identify post hoc (PH) quality conversations. Knowledge Seeking/Sharing (KS): we determine whether a conversation is of the knowledge seeking/sharing type by analyzing its form. Contextual (CX): we analyze its content. Succinct (SC): we analyze its structure. Well Written (WW): we analyze its structure. Participant Experience: more experienced people are likely to contribute better-quality information.
  • #9: In total, we evaluated our technique on 2k human-annotated developer conversations from five programming communities on Slack, comprising 38K utterances contributed by 1.4k users. The gold set consisted of 1310 PH and 690 non post hoc (NPH) conversations, which means our dataset was slightly unbalanced.
  • #10: This figure shows the evaluation measures F1, AUC, and MCC for each method. Since this is the first work to automatically assess quality in chats, we compare our ML-based techniques with a heuristic-based baseline. The baseline, Q&A:Code, considers a conversation software-related if it contains at least one code segment.
  • #11: Surprisingly, the heuristic-based classifier performs reasonably well. We observe that <>, particularly for MCC (Matthews Correlation Coefficient), which adjusts for class imbalance. Although all ML techniques perform reasonably well, SNN provides the best performance. Other measures are sensitive to class imbalance. MCC involves the values of all four quadrants of a confusion matrix and returns a balanced measure; the higher the correlation between true and predicted values, the better the prediction.
  • #12: For RQ2, we determined the information gain of each feature in our feature sets. The table shows the top eight features across all feature sets. Overall, we see that the features that result in more effective identification are related to <>. For example, the median number of utterances in a PH conversation is 14, while in an NPH conversation it is 4, which suggests that very short conversations presumably do not provide useful information. To calculate the coherence of a discussion, we created a graph for each conversation that consists of utterances as nodes and weighted undirected edges indicating the strength of the relationship between two utterances. A higher average graph degree indicates that each utterance is well connected to the other utterances in the conversation, and thus that the conversation is more coherent.
  • #13: To perform classification error analysis, we qualitatively analyzed the common False Positives (FP) and False Negatives (FN) across all techniques.
  • #14: For false positives, our classifiers struggled to distinguish conversations based on the quality of the answers provided. For example, some FP conversations were not completely answered, or the proposed solution did not seem to work, as indicated by follow-on discussion. For false negatives, we observed that most are very short, with an average of 3-4 utterances. These conversations are misclassified since they do not offer much content for our features. Additionally, we are not able to correctly classify conversations that do not start with a specific question, since most of our features are based on Q&A conversations.
  • #15: <Automatically identifying PH quality conversations>. This work takes the first step towards research on quality assessment in developer chat communities, which can contribute to efficient information gathering for building software maintenance tools and enrich existing knowledge-bases and community knowledge.
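The coherence graph described in the note on RQ2 (utterances as nodes, weighted undirected edges for the strength of the relationship between two utterances) can be sketched as follows. The notes do not say how edge weights were computed, so the Jaccard word overlap used here is an assumption, and both function names are hypothetical.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Assumed edge weight: word-set overlap between two utterances."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def average_weighted_degree(utterances: list[str]) -> float:
    """Mean over nodes of the summed weights of their incident edges."""
    degree = [0.0] * len(utterances)
    for i, j in combinations(range(len(utterances)), 2):
        weight = jaccard(utterances[i], utterances[j])
        # The graph is undirected, so the edge counts toward both endpoints.
        degree[i] += weight
        degree[j] += weight
    return sum(degree) / len(degree) if degree else 0.0
```

Under this assumed weighting, a conversation whose utterances share vocabulary scores higher than one where each message is about something different, which matches the intuition that a higher average degree signals a more coherent discussion.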