Clustering and Analysis in Data Mining

What is Clustering?
Clustering is the process of grouping a set of physical or abstract objects into classes of similar objects.

Why Clustering?
Typical requirements of clustering in data mining:
  • Scalability
  • Ability to deal with different types of attributes
  • Discovery of clusters with arbitrary shape
  • Minimal requirements for domain knowledge to determine input parameters
  • Ability to deal with noisy data
  • Incremental clustering and insensitivity to the order of input records
  • High dimensionality
  • Constraint-based clustering
  • Interpretability and usability

Data types in Cluster Analysis
  • Data matrix (or object-by-variable structure)
  • Interval-scaled variables
  • Binary variables
  • Categorical variables
  • Discrete ordinal variables
  • Ratio-scaled variables

Methods used in clustering:
  • Partitioning methods
  • Hierarchical methods
  • Density-based methods
  • Grid-based methods
  • Model-based methods

Hierarchical methods in clustering
There are two types of hierarchical clustering methods:
  • Agglomerative hierarchical clustering
  • Divisive hierarchical clustering

Agglomerative hierarchical clustering
This bottom-up strategy starts by placing each object in its own cluster and then merges these atomic clusters into larger and larger clusters, until all of the objects are in a single cluster or until certain termination conditions are satisfied.
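
The bottom-up merging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `agglomerative`, the use of single linkage (distance between the closest members) as the merge criterion, and a target number of clusters as the termination condition are all choices made here, not something the slides specify.

```python
from math import dist


def agglomerative(points, num_clusters):
    """Merge the two closest clusters until num_clusters remain."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    # Start with every object in its own (atomic) cluster.
    clusters = [[p] for p in points]
    while len(clusters) > num_clusters:
        best = None  # (distance, i, j) for the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage (an assumption): closest pair of members.
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i and drop the now-empty slot.
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters = agglomerative(points, 3)
```

With the sample points, the two tight pairs are merged first and the isolated point stays on its own. The pairwise search is quadratic per merge, which is fine for a demonstration but not for large data sets.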

Divisive hierarchical clustering
This top-down strategy does the reverse of agglomerative hierarchical clustering: it starts with all objects in one cluster and subdivides that cluster into smaller and smaller pieces, until each object forms a cluster on its own or until certain termination conditions are satisfied, such as reaching a desired number of clusters or bringing the diameter of each cluster within a certain threshold.
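
A rough Python sketch of the top-down strategy follows. The slides do not say how a cluster should be split, so the splitting rule used here (seed the two halves with the two mutually farthest points and assign every other point to the nearer seed) is an assumption, as are the names `diameter`, `split` and `divisive`. The two termination conditions mirror the ones named in the text: a desired number of clusters and a diameter threshold, which is assumed to be non-negative.

```python
from itertools import combinations
from math import dist


def diameter(cluster):
    """Largest pairwise distance within a cluster (0.0 for a singleton)."""
    return max((dist(a, b) for a, b in combinations(cluster, 2)), default=0.0)


def split(cluster):
    """Split a cluster around its two farthest points (assumed heuristic)."""
    a, b = max(combinations(cluster, 2), key=lambda pair: dist(*pair))
    left = [p for p in cluster if dist(p, a) <= dist(p, b)]
    right = [p for p in cluster if dist(p, a) > dist(p, b)]
    return left, right


def divisive(points, max_clusters, max_diameter):
    """Split the widest cluster until either termination condition holds."""
    # Start with all objects in one cluster.
    clusters = [list(points)]
    while len(clusters) < max_clusters:
        widest = max(clusters, key=diameter)
        # Stop once every cluster is within the diameter threshold.
        if diameter(widest) <= max_diameter:
            break
        clusters.remove(widest)
        clusters.extend(split(widest))
    return clusters


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters = divisive(points, max_clusters=3, max_diameter=1.5)
```

On the sample points the first split separates the isolated point from the rest, and the second split separates the two tight pairs.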

Density-Based methods in clustering
  • DBSCAN: A Density-Based Clustering Method Based on Connected Regions with Sufficiently High Density
  • OPTICS: Ordering Points to Identify the Clustering Structure
  • DENCLUE: Clustering Based on Density Distribution Functions
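
To make the idea of "connected regions with sufficiently high density" concrete, here is a compact, unoptimized sketch of DBSCAN in plain Python. The parameter names `eps` (neighborhood radius) and `min_pts` (minimum number of neighbors, counting the point itself) follow common usage and are not taken from the slides; the brute-force neighbor search is a simplification chosen for brevity.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""

    def neighbors(i):
        # Brute-force search; includes the point itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        # Point i is a core point: start a new cluster and grow it.
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point too
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
# labels == [0, 0, 0, 1, 1, 1, -1]
```

The two dense groups become clusters 0 and 1, and the isolated point is labeled as noise, which illustrates how a density-based method copes with outliers.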

Grid-Based methods in clustering
  • STING: Statistical Information Grid
    STING is a grid-based multiresolution clustering technique in which the spatial area is divided into rectangular cells.
  • WaveCluster: Clustering Using Wavelet Transformation
    WaveCluster is a multiresolution clustering algorithm that first summarizes the data by imposing a multidimensional grid structure onto the data space. It then uses a wavelet transformation to transform the original feature space, finding dense regions in the transformed space.
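
Both methods begin by summarizing the data on a grid. The sketch below shows only that shared first step at a single resolution: points are assigned to rectangular cells, and cells holding enough points are reported as dense. It is not an implementation of STING or WaveCluster; the names `grid_counts` and `dense_cells` and the parameters `cell_size` and `min_count` are hypothetical.

```python
from collections import Counter
from math import floor


def grid_counts(points, cell_size):
    """Count the points falling into each cell, keyed by integer cell indices."""
    return Counter(tuple(floor(x / cell_size) for x in p) for p in points)


def dense_cells(points, cell_size, min_count):
    """Return the cells that contain at least min_count points."""
    counts = grid_counts(points, cell_size)
    return {cell for cell, n in counts.items() if n >= min_count}


points = [(0.2, 0.3), (0.8, 0.1), (0.5, 0.9), (3.1, 3.2), (7.5, 7.5), (7.9, 7.1)]
dense = dense_cells(points, cell_size=1.0, min_count=2)
# dense == {(0, 0), (7, 7)}
```

Because later steps operate on cell summaries rather than on individual objects, the cost of a grid-based method depends mainly on the number of cells, not on the number of points.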

Model-Based Clustering Methods
  • Expectation-Maximization
  • Conceptual Clustering
  • Neural Network Approach
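
Of the model-based approaches listed, Expectation-Maximization is the easiest to show in a few lines. The sketch below fits a mixture of two one-dimensional Gaussians; the restriction to two components, the initial guess (smallest and largest value as starting means) and the function names are assumptions made for illustration, and there are no numerical safeguards beyond a small variance floor.

```python
from math import exp, pi, sqrt


def gaussian(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return exp(-((x - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)


def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    # Crude initial guess (an assumption of this sketch).
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * gaussian(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate parameters from the weighted points.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2
                      for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid a zero variance
            weights[k] = nk / len(data)
    return means, variances, weights


data = [1.0, 1.2, 0.8, 1.1, 0.9, 9.0, 9.3, 8.7, 9.1, 8.9]
means, variances, weights = em_two_gaussians(data)
```

For this well-separated sample the estimated means settle close to 1 and 9, with each component taking about half of the weight. Unlike a hard partition, the responsibilities computed in the E-step give every object a degree of membership in each cluster.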

Methods of Clustering High-Dimensional Data
  • CLIQUE: A Dimension-Growth Subspace Clustering Method
    CLIQUE (CLustering In QUEst) was the first algorithm proposed for dimension-growth subspace clustering in high-dimensional space.
  • PROCLUS: A Dimension-Reduction Subspace Clustering Method
    PROCLUS (PROjected CLUStering) is a typical dimension-reduction subspace clustering method. That is, instead of starting from single-dimensional spaces, it starts by finding an initial approximation of the clusters in the high-dimensional attribute space. Each dimension is then assigned a weight for each cluster, and the updated weights are used in the next iteration to regenerate the clusters.
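
The dimension-growth idea behind CLIQUE can be illustrated with a small sketch: find dense intervals in each single dimension first, then count two-dimensional cells only where both one-dimensional projections are already dense. This covers just the first growth step and is not the full algorithm; the function names, the fixed `cell_size` and the absolute `min_count` threshold are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from math import floor


def dense_units_1d(points, cell_size, min_count):
    """Dense (dimension, interval index) units in single-dimensional spaces."""
    counts = Counter()
    for p in points:
        for dim, x in enumerate(p):
            counts[(dim, floor(x / cell_size))] += 1
    return {unit for unit, n in counts.items() if n >= min_count}


def dense_units_2d(points, cell_size, min_count):
    """Dense 2-D units, built only from pairs of dense 1-D units."""
    dense1 = dense_units_1d(points, cell_size, min_count)
    counts = Counter()
    for p in points:
        units = [(dim, floor(x / cell_size)) for dim, x in enumerate(p)]
        for u, v in combinations(units, 2):
            # A 2-D cell can only be dense if both projections are dense.
            if u in dense1 and v in dense1:
                counts[(u, v)] += 1
    return {pair for pair, n in counts.items() if n >= min_count}


# Three points agree in dimensions 0 and 1 but are spread out in dimension 2.
points = [(0.1, 0.2, 5.0), (0.3, 0.4, 9.0), (0.5, 0.6, 1.0), (7.0, 3.0, 2.5)]
dense2 = dense_units_2d(points, cell_size=1.0, min_count=3)
# dense2 == {((0, 0), (1, 0))}
```

The result names a dense cell in the subspace spanned by dimensions 0 and 1, even though no dense region exists once dimension 2 is included, which is the kind of structure subspace clustering is meant to find.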

Constraint-Based Cluster Analysis
Constraint-based clustering finds clusters that satisfy user-specified preferences or constraints. A few categories of constraints are:
  • Constraints on individual objects
  • Constraints on the selection of clustering parameters
  • Constraints on distance or similarity functions
  • User-specified constraints on the properties of individual clusters
  • Semi-supervised clustering based on “partial” supervision

Visit more self-help tutorials
Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free, self-guiding and will not involve any additional support.
Visit us at www.dataminingtools.net
