Mining from Open Answers in Questionnaire Data
KDD '01, San Francisco, CA, USA. Copyright ACM 2001.
Hang Li*, NEC Corporation, [email_address]
Kenji Yamanishi, NEC Corporation, [email_address]
Agenda: Analysis of open answers; Rule Analysis (Classification Rules, Association Rules, Algorithm); Correspondence Analysis; Mining Result.
Analysis of open answers: The goals are to automatically summarize open answers and to automatically mine useful information from them. Survey Analyzer (SA) is a system for analyzing open answers. It uses two statistical learning methods: rule learning (rule analysis) and correspondence analysis.
Rule Analysis – Classification Rules: Given a number of categories, each containing a number of texts, automatically acquire rules from the categorized texts, then classify new texts on the basis of the acquired rules. SA views each analysis target as a category and the open answers associated with that target as texts.
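The classification step described above can be illustrated with a minimal sketch. It assumes that rules take the form "if an answer contains these words, assign this category" and that they are tried in order. The rule contents, the category names, and the `classify` helper are hypothetical and do not come from the slides.

```python
from typing import Sequence, Tuple

# Hypothetical rules: (words that must all appear, category to assign).
# In SA these would be learned from categorized open answers; here they
# are written by hand purely for illustration.
Rule = Tuple[Tuple[str, ...], str]

RULES: Sequence[Rule] = (
    (("stylish",), "Car A"),
    (("roomy", "family"), "Car B"),
)
DEFAULT_CATEGORY = "other"


def classify(text: str, rules: Sequence[Rule] = RULES,
             default: str = DEFAULT_CATEGORY) -> str:
    """Return the category of the first rule whose words all occur in text."""
    words = set(text.lower().split())
    for required_words, category in rules:
        if all(word in words for word in required_words):
            return category
    return default
```

Ordering the rules and falling back to a default category is one simple design. The slides do not say how SA resolves conflicts between rules.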
Rule Analysis (Cont.)
Rule Analysis – Algorithm (SC): SA learns classification rules or association rules using Stochastic Complexity (SC), following the MDL (Minimum Description Length) principle. In the example, the rectangles represent 10 open answers, T is the analysis target, and W is a specific word that some of the answers contain. When ΔSC is positive, the rule using W is taken as the model most likely to have given rise to the data.
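As a rough illustration of the ΔSC test, the sketch below scores whether splitting the answers by a word W describes the target labels more compactly than leaving them unsplit. It assumes a standard approximation of stochastic complexity for a Bernoulli model, SC = m·H(m⁺/m) + ½·log₂(mπ/2), where m is the number of answers and m⁺ the number associated with the target. The slides do not give the formula, so the formula, the function names, and the counts in the usage note are all assumptions.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def stochastic_complexity(m: int, m_pos: int) -> float:
    """Approximate SC (in bits) of m binary labels, m_pos of them positive."""
    if m == 0:
        return 0.0
    return m * binary_entropy(m_pos / m) + 0.5 * math.log2(m * math.pi / 2)


def delta_sc(n_with: int, pos_with: int, n_without: int, pos_without: int) -> float:
    """SC of all answers minus the summed SC after splitting on word W.

    A positive value means the split by W gives a shorter description.
    """
    total = n_with + n_without
    total_pos = pos_with + pos_without
    unsplit = stochastic_complexity(total, total_pos)
    split = (stochastic_complexity(n_with, pos_with)
             + stochastic_complexity(n_without, pos_without))
    return unsplit - split
```

For example, with 10 answers of which 4 contain W and all 4 concern T, while only 1 of the other 6 concerns T, `delta_sc(4, 4, 6, 1)` is positive. If W is unrelated to T, as in `delta_sc(4, 2, 6, 3)`, the value is negative.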
Correspondence Analysis
Relationship between rule analysis and correspondence analysis (Y: analysis target, X: words). Rule analysis employs a conditional probability model, P(Y|X), and provides the facts in detail (Tables 2 and 3). Correspondence analysis employs a joint probability model, P(Y, X), and yields the entire structure (position map).
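The difference between the two models can be seen by estimating both from the same co-occurrence counts. This is a minimal sketch using plain relative frequencies. The (target, word) pairs and the function names are hypothetical, and SA's actual estimators are not described in the slides.

```python
from collections import Counter
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (analysis target Y, word X)

# Hypothetical observations: each pair records a word occurring in an
# open answer about a given target.
PAIRS: List[Pair] = [
    ("Car A", "stylish"), ("Car A", "stylish"), ("Car B", "stylish"),
    ("Car B", "roomy"), ("Car B", "roomy"),
]


def joint_probability(pairs: List[Pair]) -> Dict[Pair, float]:
    """Estimate P(Y, X): the share of all observations that are (y, x)."""
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: n / total for pair, n in counts.items()}


def conditional_probability(pairs: List[Pair]) -> Dict[Pair, float]:
    """Estimate P(Y | X): the share of observations of word x that are y."""
    counts = Counter(pairs)
    word_counts = Counter(word for _, word in pairs)
    return {(y, x): n / word_counts[x] for (y, x), n in counts.items()}
```

The joint table sums to 1 over all (target, word) pairs, which suits a global view such as a position map. The conditional table sums to 1 for each word separately, which suits rules of the form "if the answer contains X, the target is Y".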
Mining result: with car data, with eye-drop data, and with beverage data.
Advantages of the mining system: It is a much faster and less costly way to summarize or mine answers to open questions. SA is the first system that can perform rule analysis and correspondence analysis. It uses a new statistical learning methodology based on Stochastic Complexity. SA has been used successfully in mining various types of questionnaire data.
