Types of clustering:
Clustering methods can be grouped into categories according to several criteria:
• 1. Hard clustering: A given data point in n-dimensional space belongs to exactly one cluster. This is also known as exclusive
clustering. The K-Means algorithm is an example of hard clustering.
• 2. Soft clustering: A given data point can belong to more than one cluster. This is also known as overlapping
clustering. The Fuzzy K-Means algorithm is a good example of soft clustering.
• 3. Hierarchical clustering: A hierarchy of clusters is built using either the top-down (divisive) or the bottom-up
(agglomerative) approach.
• 4. Flat clustering: A simple technique in which no hierarchy is present.
• 5. Model-based clustering: The data is modeled using a standard statistical model that can accommodate different
distributions. The idea is to find the model that best fits the data.
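The bottom-up (agglomerative) approach from item 3 can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names, the choice of single linkage, and the target of three clusters are assumptions made for the example. The points are the seven pairs used in the Fuzzy K-Means example of this deck.

```python
import math

# The seven points from the Fuzzy K-Means example in this deck.
POINTS = [(22, 80), (25, 75), (28, 85), (55, 150), (50, 145), (53, 153), (38, 115)]

def single_linkage(a, b):
    """Cluster distance = distance between the two closest members (an assumed linkage)."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerative(points, num_clusters):
    """Bottom-up clustering: start with singletons, repeatedly merge the closest pair."""
    clusters = [[p] for p in points]
    while len(clusters) > num_clusters:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters

hierarchy_clusters = agglomerative(POINTS, 3)
```

Stopping at three clusters groups the three small pairs together, the three large pairs together, and leaves (38, 115) on its own. Letting the loop run down to one cluster would produce the full hierarchy; a flat method such as K-Means produces only a single level.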
Different clustering algorithms
• Fuzzy K-Means:
• The K-Means algorithm performs hard clustering: each data point belongs to exactly one cluster. However,
there are situations where one point belongs to more than one cluster. For example, a news article may belong to
both the Technology and Current Affairs categories. In that case, we need a soft clustering mechanism.
• The Fuzzy K-Means algorithm implements soft clustering. It generates overlapping clusters. Each point has a probability of
belonging to each cluster, based on its distance from each centroid.
• In this example, we apply the Fuzzy K-Means algorithm to the dataset
(22, 80), (25, 75), (28, 85), (55, 150), (50, 145), (53, 153), (38, 115).
The outcome of the example is given in the following figure. Note that the newly added data point (someone
of medium weight and height) belongs to cluster 3 with probability 0.52 and to cluster 1 with probability 0.47,
whereas the other data points (people who are either large or small) belong to one particular cluster with a
probability of nearly 0.9.
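The membership computation described above can be sketched as follows. This is a minimal sketch of the standard fuzzy c-means update, under stated assumptions: the slide does not give the number of clusters, the fuzziness parameter, or the initial centroids, so the values used here (two clusters, m = 2, the first and fourth points as starting centroids) are illustrative choices, and the exact figures from the slide are not expected to be reproduced.

```python
import math

POINTS = [(22, 80), (25, 75), (28, 85), (55, 150), (50, 145), (53, 153), (38, 115)]

def memberships_for(points, centroids, m):
    """Membership of each point in each cluster, from inverse squared distances."""
    exponent = 1.0 / (m - 1.0)
    rows = []
    for p in points:
        # Small constant avoids division by zero when a point sits on a centroid.
        inv = [(1.0 / (math.dist(p, c) ** 2 + 1e-12)) ** exponent for c in centroids]
        total = sum(inv)
        rows.append([v / total for v in inv])  # each row sums to 1
    return rows

def fuzzy_kmeans(points, centroids, m=2.0, iterations=50):
    dims = len(points[0])
    for _ in range(iterations):
        u = memberships_for(points, centroids, m)
        new_centroids = []
        for j in range(len(centroids)):
            weights = [row[j] ** m for row in u]  # every point contributes to every centroid
            total = sum(weights)
            new_centroids.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dims)
            ))
        centroids = new_centroids
    return centroids, memberships_for(points, centroids, m)

# Assumed settings: 2 clusters, initial centroids taken from the data.
centroids, memberships = fuzzy_kmeans(POINTS, [POINTS[0], POINTS[3]])
```

With these settings the last point, (38, 115), ends up with memberships close to one half in each cluster, while the other six points belong almost entirely to one cluster. That is the same qualitative behavior the slide describes.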
Streaming K-Means
• If the volume of data is too large to fit in the available main memory, the K-Means algorithm is not suitable, because its
batch processing mechanism iterates over all the data points. The K-Means algorithm is also sensitive to noise and outliers
in the data.
• The Streaming K-Means algorithm addresses these problems by operating in two steps, as follows:
• The streaming step
• The ball K-Means step
• In the streaming step, the idea is to read data points sequentially, storing very few data points in memory.
• After the first step, a more representative set of weighted data points is available for further processing.
• The final K clusters are produced in the ball K-Means step. During this second step, potential outliers are eliminated.
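The two steps can be sketched as below. This is a simplified, deterministic illustration and not the algorithm as implemented in any particular library: the fixed distance `threshold`, the `trim` factor, the choice of the heaviest sketch points as starting centers, and all function names are assumptions made for the example.

```python
import math

def streaming_step(stream, threshold):
    """One pass over the data; keeps only a small sketch of [centroid, weight] pairs."""
    sketch = []
    for p in stream:
        nearest = min(sketch, key=lambda cw: math.dist(p, cw[0])) if sketch else None
        if nearest is not None and math.dist(p, nearest[0]) <= threshold:
            c, w = nearest
            # Fold the point into the nearest centroid as a running weighted mean.
            nearest[0] = tuple((ci * w + pi) / (w + 1) for ci, pi in zip(c, p))
            nearest[1] = w + 1
        else:
            sketch.append([p, 1])  # far from everything seen so far: new centroid
    return sketch

def ball_kmeans(sketch, k, trim=0.75, iterations=10):
    """Weighted K-Means on the sketch; each center uses only points inside its ball."""
    centers = [c for c, _ in sorted(sketch, key=lambda cw: -cw[1])[:k]]
    for _ in range(iterations):
        new_centers = []
        for j, center in enumerate(centers):
            # Assumed ball radius: a fraction of the distance to the nearest other center.
            radius = trim * min(math.dist(center, o) for i, o in enumerate(centers) if i != j)
            members = [
                (p, w) for p, w in sketch
                if min(range(k), key=lambda i: math.dist(p, centers[i])) == j
                and math.dist(p, center) <= radius
            ]
            total = sum(w for _, w in members)
            if total == 0:
                new_centers.append(center)
            else:
                new_centers.append(tuple(
                    sum(p[d] * w for p, w in members) / total for d in range(len(center))
                ))
        centers = new_centers
    return centers

# Two dense groups plus one outlier at (5, 30), read in interleaved order.
stream = [(0, 0), (10, 10), (1, 0), (11, 10), (5, 30), (0, 1), (10, 11), (1, 1), (11, 11)]
sketch = streaming_step(stream, threshold=3.0)
final_centers = ball_kmeans(sketch, k=2)
```

In this example the nine input points are reduced to three weighted sketch points. The outlier survives the streaming step with weight 1, but it falls outside the ball of its nearest center in the second step and so does not pull that center away from its group.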
Spectral clustering
• The spectral clustering algorithm is helpful for difficult, nonconvex clustering problems. It clusters points using
the eigenvectors of matrices derived from the data (for example, the Laplacian of a similarity graph).
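A small sketch of the idea, on two concentric rings (a nonconvex shape that centroid-based methods split incorrectly). The assumptions are loud here: the Gaussian similarity with `sigma=0.5`, the unnormalized graph Laplacian, the split into exactly two clusters by the sign of one eigenvector, and the use of plain power iteration to stay within the standard library are all choices made for this example. In practice a linear algebra library would compute the eigenvectors.

```python
import math
import random

def spectral_bipartition(points, sigma=0.5, iterations=3000, seed=0):
    """Split points in two using the second eigenvector of the graph Laplacian L = D - W."""
    n = len(points)
    # Similarity matrix W: Gaussian of the pairwise distance, zero on the diagonal.
    w = [[0.0 if i == j else
          math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in w]
    shift = 2 * max(degree)  # upper bound on the largest eigenvalue of L
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the constant eigenvector (eigenvalue 0)
        # Multiply by (shift*I - L): power iteration then converges to the
        # eigenvector of L with the smallest remaining eigenvalue.
        v = [(shift - degree[i]) * v[i] + sum(wij * vj for wij, vj in zip(w[i], v))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return [1 if x > 0 else 0 for x in v]

# 12 points on a ring of radius 1, then 24 points on a ring of radius 3.
inner = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12)) for k in range(12)]
outer = [(3 * math.cos(2 * math.pi * k / 24), 3 * math.sin(2 * math.pi * k / 24)) for k in range(24)]
ring_labels = spectral_bipartition(inner + outer)
```

Points on the same ring are strongly connected in the similarity graph and the two rings are only weakly connected to each other, so the eigenvector is nearly constant on each ring with opposite signs. Which ring receives label 0 and which receives label 1 depends on the random start.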
Dirichlet clustering
• The K-Means and Fuzzy K-Means algorithms model clusters as spheres (the n-dimensional analogue of circles). K-Means assumes a
common fixed variance. Further, K-Means does not model the distribution of the data points.
• K-Means and Fuzzy K-Means work effectively only when the data is normally distributed. If the data
distribution is different, for example, an asymmetrical normal distribution (one with different standard deviations), the K-Means
algorithm will not give good results.
• Dirichlet clustering can be applied to model different data distributions (including data points that are not normally distributed)
effectively. Dirichlet clustering fits a model to a dataset and tunes the model's parameters until the model correctly
fits the data. This approach is also suitable for addressing the hierarchical clustering problem.

Types of clustering and different types of clustering algorithms
