XL-MINER: Data Exploration
Introduction to XLMiner™
Data Reduction and Exploration
XLMiner and Microsoft Office are registered trademarks of their respective owners.

Data Exploration and Reduction
Data exploration and reduction are used when the data set to be mined is very large and may contain many variables that are highly correlated with one another or unrelated to the outcome of interest. Using the tools in XLMiner, one can reduce the size of the data set, or explore it to formulate hypotheses worth testing. There are two techniques for this purpose:
Principal Component Analysis: PCA is a mathematical procedure that transforms a number of correlated variables into a smaller number of uncorrelated variables, called principal components. The result is a data set with fewer variables that still retains most of the variability of the original data: the first principal component accounts for the largest possible share of the variation, and each later component accounts for as much of the remaining variation as it can.
Cluster Analysis: Cluster analysis is also called data segmentation. Its primary objective is to assign objects to clusters so that objects within a cluster are markedly similar and objects in different clusters are markedly different.
http://dataminingtools.net

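To make the idea of principal components concrete, here is a minimal sketch in plain Python for the two-variable case. It is not XLMiner's implementation: the function name `pca_2d` and the sample data are hypothetical, and it relies on the closed-form eigenvalues of a 2x2 covariance matrix rather than a general eigen-solver.

```python
import math

def pca_2d(xs, ys):
    """Variances of the two principal components of (xs, ys), largest first,
    together with the share of total variance each one explains."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    variances = [half_trace + spread, half_trace - spread]
    total = sxx + syy
    return variances, [v / total for v in variances]

# Hypothetical data: ys is roughly 2 * xs, so the variables are highly correlated.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
variances, explained = pca_2d(xs, ys)
print(explained)
```

Because the two hypothetical variables move almost in lockstep, the first component should account for nearly all of the variance, which is the situation in which dropping the second component loses very little information.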
Data Exploration and Reduction – Principal Component Analysis

Data Exploration and Reduction
Fixed # components: You can specify a fixed number of components here.
Smallest # components explaining: This option lets you specify a percentage, and XLMiner will calculate the minimum number of principal components required to account for that percentage of the variance. Do not select this option in this example.

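The "smallest number of components explaining a percentage" rule can be sketched as a cumulative sum over the component variances. This is an illustration only, not XLMiner's code: the function name `smallest_components` and the variances below are hypothetical.

```python
def smallest_components(variances, percent):
    """Smallest number of leading components whose cumulative variance
    reaches `percent` percent of the total variance."""
    total = sum(variances)
    ordered = sorted(variances, reverse=True)  # largest component first
    cumulative = 0.0
    for count, variance in enumerate(ordered, start=1):
        cumulative += variance
        # Compare without dividing to avoid rounding surprises at the boundary.
        if cumulative * 100 >= percent * total:
            return count
    return len(ordered)

# Hypothetical component variances (they sum to 10).
component_variances = [4.0, 3.0, 2.0, 1.0]
needed_80 = smallest_components(component_variances, 80)  # 4 + 3 + 2 = 9 of 10
needed_95 = smallest_components(component_variances, 95)  # needs all four
print(needed_80, needed_95)
```

A fixed number of components corresponds to simply taking the first `k` entries of the sorted list instead of searching for a threshold.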
Data Exploration and Reduction – Output

Data Exploration and Reduction – Cluster Analysis
Cluster analysis can be done in two ways:
1. k-Means Clustering: The procedure starts from k initial cluster centres, assigns each record to its nearest centre, and then recomputes each centre as the mean of the records assigned to it. These two steps repeat until the assignments stop changing or the iteration limit is reached.
2. Hierarchical Cluster Analysis: Hierarchical clustering can itself be done in two ways, agglomerative and divisive. In agglomerative clustering, as the name suggests, individual objects are successively combined into larger groups of similar objects. In divisive clustering, all objects start in one group, which is successively split into finer groups.

Data Exploration and Reduction – K-Means Clustering
Select the variables to be used as input. Leave out any variable that only labels the records (here, the TYPE variable).

Data Exploration and Reduction – K-Means Clustering
Enter the number of clusters you want the data set to be divided into and the number of iterations to be performed while creating the clusters. You may also specify the number of starts and the seed.

Data Exploration and Reduction – K-Means Clustering (Output)
For each start, XLMiner calculates the sum of the squared distances and chooses the start with the smallest value as the best starting point.

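The roles of the number of clusters, iterations, starts and seed can be sketched as follows. This is a plain-Python illustration under stated assumptions, not XLMiner's algorithm: the function names, the way the seed is varied per start (`seed + start`), and the sample rows are all hypothetical.

```python
import random

def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest(row, centroids):
    """Index of the centroid closest to `row`."""
    return min(range(len(centroids)), key=lambda c: squared_distance(row, centroids[c]))

def k_means(rows, n_clusters, n_iterations, seed):
    """One k-means run from randomly chosen initial centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(rows, n_clusters)
    for _ in range(n_iterations):
        labels = [nearest(row, centroids) for row in rows]
        for c in range(n_clusters):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    labels = [nearest(row, centroids) for row in rows]
    total = sum(squared_distance(row, centroids[label]) for row, label in zip(rows, labels))
    return labels, centroids, total

def best_of_starts(rows, n_clusters, n_iterations, n_starts, seed):
    """Run k-means from several starts; keep the one with the smallest
    sum of squared distances."""
    runs = [k_means(rows, n_clusters, n_iterations, seed + start) for start in range(n_starts)]
    return min(runs, key=lambda run: run[2])

# Hypothetical rows forming two well-separated groups.
rows = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
labels, centroids, total = best_of_starts(rows, n_clusters=2, n_iterations=10, n_starts=5, seed=12345)
print(labels, total)
```

Fixing the seed makes the runs repeatable, while using several starts reduces the chance of keeping a poor solution that came from an unlucky choice of initial centroids.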
Data Exploration and Reduction – K-Means Clustering (Output)
This shows the distance of each row from each cluster. Note that each row is placed in the cluster from which it has the least distance.

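The distance table described above can be reproduced for a single row with a few lines of Python. This is a sketch assuming Euclidean distance to each cluster centre; the function names and centroid values are hypothetical, and cluster IDs are numbered from 1 purely for readability.

```python
import math

def distances_to_clusters(row, centroids):
    """Euclidean distance from one row to every cluster centroid."""
    return [math.dist(row, centroid) for centroid in centroids]

def nearest_cluster(row, centroids):
    """1-based ID of the cluster whose centroid is closest to `row`."""
    distances = distances_to_clusters(row, centroids)
    return distances.index(min(distances)) + 1

# Hypothetical centroids for two clusters.
centroids = [(0.0, 0.0), (3.0, 4.0)]
row_distances = distances_to_clusters((3.0, 0.0), centroids)
assigned = nearest_cluster((3.0, 0.0), centroids)
print(row_distances, assigned)
```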
Data Exploration and Reduction – Hierarchical Clustering
Hierarchical clustering builds clusters step by step rather than in a single partition. In a divisive scheme such as the one sketched on this slide, the mean of all the values is calculated and the set is split in two around it; the mean of each resulting subset is then calculated and that subset is split in two, and the process continues until the required number of clusters has been formed.
Hierarchical clustering can itself be done in two ways, agglomerative and divisive. In agglomerative clustering, as the name suggests, individual objects are successively combined into larger groups of similar objects. In divisive clustering, all objects start in one group, which is successively split into finer groups.

Data Exploration and Reduction – Hierarchical Clustering

Data Exploration and Reduction – Hierarchical Clustering
Select "Normalize Data" and then choose one of the five clustering procedures available.

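Normalizing matters because distance-based clustering is dominated by whichever variable has the largest scale. The sketch below assumes that normalizing means standardizing each column to zero mean and unit standard deviation; the slides do not state the exact formula XLMiner uses, and the function name and data are hypothetical.

```python
import statistics

def normalize_columns(rows):
    """Rescale every column to mean 0 and sample standard deviation 1.
    Assumes no column is constant (that would give a zero deviation)."""
    columns = list(zip(*rows))
    means = [statistics.mean(column) for column in columns]
    deviations = [statistics.stdev(column) for column in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, deviations))
        for row in rows
    ]

# Hypothetical rows: the second variable is on a scale 100 times larger.
raw_rows = [(1.0, 100.0), (2.0, 200.0), (3.0, 300.0)]
normalized = normalize_columns(raw_rows)
print(normalized)
```

After this step both columns contribute equally to any distance computed between rows.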
Data Exploration and Reduction – Hierarchical Clustering
This output details the history of the cluster formation. Initially, each individual case is considered its own cluster (with just itself as a member), so we start with # clusters = # cases (21 in this example). At stage 1, clusters (i.e. cases) 10 and 13 were found to be closer together than any other two clusters, so they are joined into a cluster called Cluster 10. We now have one cluster with two cases (cases 10 and 13) and 19 other clusters that still have just one case each. At stage 2, clusters 7 and 12 are found to be closer together than any other two clusters, so they are joined into Cluster 7. The cluster ID is thus the lowest case number among the cases belonging to that cluster. This process continues until there is just one cluster. At different stages of the clustering process there are different numbers of clusters. A graph called a dendrogram lets you visualize this:

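The merge history described above can be sketched in plain Python. This illustration assumes single linkage (the distance between two clusters is the distance between their closest members), which may not be the procedure chosen in the slides; the function name and the five cases below are hypothetical. As in the description, a merged cluster keeps the lowest case number as its ID.

```python
import math

def agglomerate(cases):
    """cases maps case number -> point. Returns the merge history as
    (stage, kept_cluster_id, absorbed_cluster_id, distance) tuples."""
    clusters = {case_id: [case_id] for case_id in cases}  # every case starts alone
    history = []
    stage = 0
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(cases[p], cases[q])
                        for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a].extend(clusters.pop(b))  # a < b, so the lower ID survives
        stage += 1
        history.append((stage, a, b, d))
    return history

# Hypothetical cases, keyed by case number.
cases = {1: (0.0, 0.0), 2: (0.0, 1.0), 3: (5.0, 0.0), 4: (5.0, 0.5), 5: (10.0, 10.0)}
history = agglomerate(cases)
for stage, kept, absorbed, distance in history:
    print(stage, kept, absorbed, round(distance, 3))
```

The distances recorded at each stage are the merge heights that a dendrogram plots on its vertical axis.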
Data Exploration and Reduction – Hierarchical Clustering

Data Exploration and Reduction – Hierarchical Clustering
This shows the assignment of cases to clusters (we selected 8 clusters).

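Assigning cases to a chosen number of clusters amounts to replaying the merge history and stopping once that many clusters remain. The sketch below is an illustration with a hypothetical function name. It uses only the two merges stated in the slides (10 with 13, then 7 with 12); the rest of the history is not given, so the example stops at 19 clusters rather than the 8 selected in the slides.

```python
def assign_clusters(case_ids, merges, n_clusters):
    """Replay (kept_id, absorbed_id) merges in order until only
    `n_clusters` clusters remain; return case number -> cluster ID."""
    cluster_of = {case_id: case_id for case_id in case_ids}
    remaining = len(cluster_of)
    for kept_id, absorbed_id in merges:
        if remaining <= n_clusters:
            break
        for case_id, cluster_id in cluster_of.items():
            if cluster_id == absorbed_id:
                cluster_of[case_id] = kept_id
        remaining -= 1
    return cluster_of

case_ids = range(1, 22)            # 21 cases, as in the example
merges = [(10, 13), (7, 12)]       # the first two stages of the history
assignment = assign_clusters(case_ids, merges, n_clusters=19)
print(assignment[13], assignment[12], len(set(assignment.values())))
```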
Thank you
For more, visit: http://dataminingtools.net

Visit more self-help tutorials
Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free and self-guiding, and does not involve any additional support.
Visit us at www.dataminingtools.net
