An Introduction
What’s in it for you?
What is Clustering?
What is Hierarchical Clustering?
How does Hierarchical Clustering work?
Distance Measure
What is Agglomerative Clustering?
What is Divisive Clustering?
What is Clustering?
"I have 20 places to cover in 4 days! How will I manage to cover them all?"
You can make use of clustering by grouping the places into four clusters. Each of these clusters will contain places that are close to one another. Then each day you can visit one group and cover all the places in that group.
"Great!"
Clustering groups the places that are the least distance apart.
Clustering is the method of dividing objects into clusters such that the objects in a cluster are similar to one another and dissimilar to the objects belonging to other clusters.
Clustering methods fall into two groups:
• Hierarchical Clustering: Agglomerative, Divisive
• Partitional Clustering: K-means, Fuzzy C-Means
Applications of Clustering
• Customer Segmentation
• Insurance
• City Planning
Hierarchical Clustering
What is Hierarchical Clustering?
Let’s consider that we have a set of cars and we have to group similar ones together.
Hierarchical clustering creates a tree-like structure and groups similar objects together. The grouping continues until we reach the last cluster.
Hierarchical clustering separates data into different groups based on some measure of similarity.
Types of Hierarchical Clustering
• Agglomerative: known as the bottom-up approach
• Divisive: known as the top-down approach
How does Hierarchical Clustering work?
[Figure: scatter plot of six points, P1 to P6 (axis label: Y-Values), with the dendrogram built beside it; the steps shown are: measure the distance, grouping, termination]
• Let’s consider that we have a few points on a plane
• Each data point starts as a cluster of its own
• We find the least distance between two data points/clusters
• The two nearest clusters/data points are merged together
• This is represented in a tree-like structure called a dendrogram
• We terminate when we are left with only one cluster
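The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides give no coordinates for P1 to P6, so the values below are hypothetical, and "least distance" between clusters is taken to mean the distance between their two closest points (single linkage), which the slides do not specify.

```python
from itertools import combinations
from math import dist

# Hypothetical coordinates: the slides do not give values for P1..P6.
points = {
    "P1": (1.0, 1.0), "P2": (1.5, 1.2),
    "P3": (4.0, 4.0), "P4": (4.2, 4.9),
    "P5": (7.0, 1.0), "P6": (7.3, 1.4),
}

def cluster_distance(a, b):
    """Least distance between any point of cluster a and any point of cluster b."""
    return min(dist(points[p], points[q]) for p in a for q in b)

def build_dendrogram(names):
    clusters = [(n,) for n in names]   # each data point is a cluster of its own
    merges = []                        # (cluster_a, cluster_b, distance), in merge order
    while len(clusters) > 1:           # terminate when only one cluster is left
        # measure the distances and pick the two nearest clusters
        a, b = min(combinations(clusters, 2), key=lambda ab: cluster_distance(*ab))
        merges.append((a, b, cluster_distance(a, b)))
        # grouping: replace the two clusters with their union
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

merges = build_dendrogram(list(points))
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d:.2f}")
```

The list of merges, read from first to last, is the dendrogram from the leaves up: each entry is one join, and its distance is the height at which the join would be drawn.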
Hierarchical clustering is an algorithm that builds a hierarchy of clusters.
[Figure: scatter plot of the points with the finished dendrogram, leaves ordered P5, P6, P2, P1, P3, P4]
How do we measure the distance between the data points?
Distance Measure
The distance measure determines the similarity between two elements, and it influences the shape of the clusters.
• Euclidean distance measure
• Squared Euclidean distance measure
• Manhattan distance measure
• Cosine distance measure
Euclidean Distance Measure
• The Euclidean distance is the "ordinary" straight-line distance
• It is the distance between two points p and q in Euclidean space
$$d = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$$
Squared Euclidean Distance Measure
The squared Euclidean distance metric uses the same equation as the Euclidean distance metric, but does not take the square root.
$$d = \sum_{i=1}^{n} (q_i - p_i)^2$$
Manhattan Distance Measure
The Manhattan distance is the simple sum of the horizontal and vertical components, that is, the distance between two points measured along axes at right angles.
For two points p = (p_x, p_y) and q = (q_x, q_y):
$$d = |q_x - p_x| + |q_y - p_y|$$
In n dimensions this is $d = \sum_{i=1}^{n} |q_i - p_i|$.
Cosine Distance Measure
The cosine measure is based on the angle θ between the two vectors p and q:
$$\cos\theta = \frac{\sum_{i=0}^{n-1} q_i \, p_i}{\sqrt{\sum_{i=0}^{n-1} q_i^2} \times \sqrt{\sum_{i=0}^{n-1} p_i^2}}$$
This value is the cosine similarity; the cosine distance is commonly taken as d = 1 − cos θ.
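The four distance measures can be written as short Python functions. This is a minimal sketch using only the standard library; the function names are illustrative, and taking cosine distance as 1 minus the cosine similarity is a common convention assumed here rather than something the slides state.

```python
from math import sqrt

def euclidean(p, q):
    """Straight-line distance between p and q."""
    return sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

def squared_euclidean(p, q):
    """Same sum as the Euclidean distance, without the square root."""
    return sum((qi - pi) ** 2 for pi, qi in zip(p, q))

def manhattan(p, q):
    """Sum of the absolute differences along each axis."""
    return sum(abs(qi - pi) for pi, qi in zip(p, q))

def cosine_similarity(p, q):
    """Cosine of the angle between the vectors p and q (both must be non-zero)."""
    dot = sum(pi * qi for pi, qi in zip(p, q))
    return dot / (sqrt(sum(pi ** 2 for pi in p)) * sqrt(sum(qi ** 2 for qi in q)))

def cosine_distance(p, q):
    """Assumed convention: 1 - cosine similarity."""
    return 1.0 - cosine_similarity(p, q)

p, q = (1.0, 2.0), (4.0, 6.0)
for measure in (euclidean, squared_euclidean, manhattan, cosine_distance):
    print(measure.__name__, measure(p, q))
```

The squared Euclidean distance ranks pairs in the same order as the Euclidean distance, so it finds the same nearest pair while skipping the square root. The cosine measure ignores vector length and compares direction only.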
Agglomerative Clustering
What is Agglomerative Clustering?
Agglomerative clustering begins with each element as a separate cluster and merges them into larger clusters.
There are three key questions that need to be answered:
• How do we represent a cluster of more than one point?
• How do we determine the nearness of clusters?
• When do we stop combining clusters?
Let’s assume that we have 6 points in a Euclidean space: (1,2), (2,1), (0,0), (4,1), (5,3), (5,0).
How do we represent a cluster of more than one point? We make use of centroids; the centroid of a cluster is the average of its points.
As the nearest clusters are merged, the centroids become:
• (1,2) and (2,1) merge, with centroid (1.5,1.5)
• (4,1) and (5,0) merge, with centroid (4.5,0.5)
• (0,0) joins (1,2) and (2,1), with centroid (1,1)
• (5,3) joins (4,1) and (5,0), with centroid (4.7,1.3)
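The centroid values in this example can be reproduced with a short Python sketch. It assumes that the nearness of two clusters is the Euclidean distance between their centroids, which is consistent with the centroids shown but is not stated explicitly on the slides; the function names are illustrative.

```python
from itertools import combinations
from math import dist

def centroid(points):
    """Average of the points in a cluster."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def agglomerate(points, k=1):
    """Merge the two clusters with the nearest centroids until k clusters remain."""
    clusters = [[p] for p in points]   # every point starts as its own cluster
    history = []                       # centroid of each newly formed cluster
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(centroid(clusters[ij[0]]), centroid(clusters[ij[1]])),
        )
        merged = clusters[i] + clusters[j]
        history.append(centroid(merged))
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return clusters, history

points = [(1, 2), (2, 1), (0, 0), (4, 1), (5, 3), (5, 0)]
clusters, history = agglomerate(points)
for c in history:
    print(tuple(round(v, 1) for v in c))
```

Passing `k=2` instead stops one merge earlier, leaving the left-hand and right-hand groups of three points as separate clusters.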
When do we stop combining clusters?
There are many approaches to it.
Approach 1: Pick a number of clusters (k) upfront
We decide the number of clusters required at the beginning, and we terminate when we reach that value (k).
Possible challenges
• This only makes sense when we know about the data
Approach 2: Stop when the next merge would create a cluster with low “cohesion”
We keep clustering until the next merge of clusters would create a bad cluster, that is, one with low cohesion.
But how is cohesion defined?
Approach 3.1: Diameter of a cluster
• Diameter is the maximum distance between any pair of points in the cluster
• We terminate when the diameter of a new cluster exceeds the threshold
Approach 3.2: Radius of a cluster
• Radius is the maximum distance of a point from the centroid
• We terminate when the radius of a new cluster exceeds the threshold
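The two cohesion measures, and the stopping test built on them, might look like this in Python. This is a sketch only: the threshold value and the helper name `merge_allowed` are hypothetical, chosen for illustration.

```python
from itertools import combinations
from math import dist

def centroid(points):
    """Average of the points in a cluster."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def diameter(points):
    """Maximum distance between any pair of points in the cluster."""
    return max((dist(p, q) for p, q in combinations(points, 2)), default=0.0)

def radius(points):
    """Maximum distance of a point from the cluster centroid."""
    c = centroid(points)
    return max(dist(p, c) for p in points)

def merge_allowed(a, b, threshold, cohesion=diameter):
    """Hypothetical helper: allow a merge only if the merged cluster stays cohesive."""
    return cohesion(a + b) <= threshold

left = [(1, 2), (2, 1), (0, 0)]
right = [(4, 1), (5, 0), (5, 3)]
print(diameter(left), radius(left))
print(merge_allowed(left, right, threshold=3.0))   # threshold chosen arbitrarily
```

An agglomerative loop would call `merge_allowed` on the nearest pair of clusters before each merge and stop as soon as it returns `False`.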
Divisive Clustering
What is Divisive Clustering?
The divisive clustering approach begins with the whole set and proceeds to divide it into smaller clusters.
Step 1
• Start with a single cluster composed of all the data points
Step 2
• Split it into different clusters
• There are two ways to do this:
1. Monothetic divisive methods (split on one variable at a time)
2. Polythetic divisive methods (split using all variables together)
• Obtain all possible splits into two clusters. For the set A,B,C,D,E,F these include:
A,B | C,D,E,F
A,D,F | B,C,E
A,B,C | D,E,F
• For each split, compute the cluster sum of squares
• We select the split with the largest sum of squares
• Let’s assume that the sum of squares is largest for the 3rd split
• We divide the set into two clusters, then keep dividing each cluster in the same way:
A,B,C,D,E,F
A,B,C | D,E,F
A | B,C | D | E,F
A | B | C | D | E | F
• We terminate when every data point is its own cluster
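One divisive step can be sketched in Python as follows. The coordinates for A to F are hypothetical, since the slides give none, and "sum of squares" is assumed here to mean the between-cluster sum of squares, so that the largest value marks the split that separates the two groups best. The slides do not say which sum of squares is meant.

```python
from itertools import combinations

# Hypothetical coordinates for the six items; the slides do not provide any.
data = {"A": (0, 0), "B": (1, 0), "C": (0, 1),
        "D": (8, 8), "E": (9, 8), "F": (8, 9)}

def centroid(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def all_splits(labels):
    """Every way to divide the labels into two non-empty clusters."""
    first, rest = labels[0], labels[1:]
    splits = []
    for r in range(len(rest)):                 # r = how many other labels join `first`
        for group in combinations(rest, r):
            left = (first,) + group
            right = tuple(l for l in rest if l not in group)
            splits.append((left, right))
    return splits

def between_ss(left, right):
    """Between-cluster sum of squares of a split (assumed meaning of 'sum of squares')."""
    overall = centroid([data[l] for l in left + right])
    total = 0.0
    for group in (left, right):
        pts = [data[l] for l in group]
        total += len(pts) * sq_dist(centroid(pts), overall)
    return total

splits = all_splits(tuple(data))
best = max(splits, key=lambda s: between_ss(*s))
print(len(splits), "possible splits; best:", best)
```

Enumerating every split grows exponentially with the number of points (2^(n-1) − 1 splits for n points), so this exhaustive form is only practical for very small sets. Repeating the step on each resulting cluster until every point stands alone gives the full divisive hierarchy.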
Demo: Hierarchical Clustering
Problem Statement
• To group petroleum companies based on their sales
Steps
• Import the dataset
• Create a scatter plot
• Normalize the data
• Calculate the Euclidean distance
• Create a dendrogram
• Cluster into groups
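The normalization and distance steps of the demo can be illustrated with the Python standard library. The demo's dataset is not reproduced in these slides, so the company names and figures below are made-up placeholders, and standardizing each column to mean 0 and standard deviation 1 is one common reading of "normalize", assumed here.

```python
from math import dist
from statistics import mean, stdev

# Made-up placeholder figures, NOT the dataset used in the demo.
sales = {
    "Company A": (120.0, 30.0),
    "Company B": (130.0, 35.0),
    "Company C": (400.0, 90.0),
    "Company D": (420.0, 85.0),
}

def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds)) for row in rows]

def distance_matrix(rows):
    """Pairwise Euclidean distances between all rows."""
    return [[dist(a, b) for b in rows] for a in rows]

names = list(sales)
normalized = standardize(list(sales.values()))
matrix = distance_matrix(normalized)
for name, row in zip(names, matrix):
    print(name, [round(d, 2) for d in row])
```

Normalizing first keeps a column with large values from dominating the distances. The resulting matrix is the input an agglomerative procedure would use to build the dendrogram and cut it into groups.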

Hierarchical Clustering | Hierarchical Clustering in R |Hierarchical Clustering Example |Simplilearn

  • 2. What’s in it for you? What is Clustering? What is Hierarchical Clustering? How Hierarchical Clustering works? Distance Measure What is Agglomerative Clustering? What is Divisive Clustering?
  • 4. What is Clustering? I have 20 places to cover in 4 days!
  • 5. What is Clustering? How will I manage to cover all?
  • 6. What is Clustering? You can make use of clustering by grouping the data into four clusters
  • 7. What is Clustering? Each of these clusters will have places which are close by
  • 8. What is Clustering? Then each day you can visit one group and cover all places in the group
  • 10-13. What is Clustering? It will group the places with the least distance between them. Clustering is the method of dividing objects into clusters such that the objects within a cluster are similar to one another and dissimilar to the objects belonging to other clusters. Types of clustering: Hierarchical Clustering (Agglomerative, Divisive) and Partitional Clustering (K-means, Fuzzy C-Means)
  • 14-16. What is Clustering? Applications of Clustering: Customer Segmentation, Insurance, City Planning
  • 18. What is Hierarchical Clustering? Let’s consider that we have a set of cars and we have to group similar ones together
  • 19. What is Hierarchical Clustering? Hierarchical clustering creates a tree-like structure and groups similar objects together
  • 20. What is Hierarchical Clustering? The grouping continues until we reach the last cluster
  • 21. What is Hierarchical Clustering? Hierarchical clustering separates data into different groups based on some measure of similarity
  • 22-23. Types of Hierarchical Clustering: Agglomerative, known as the bottom-up approach, and Divisive, known as the top-down approach
  • 25. What is Hierarchical Clustering? [Figure: scatter plot of points P1 to P6; stage labels: Measure the distance, Grouping, Termination, Convergence] • Let’s consider that we have a few points on a plane
  • 26-27. What is Hierarchical Clustering? • Each data point is a cluster of its own • We try to find the least distance between two data points/clusters
  • 28. What is Hierarchical Clustering? [Figure: the scatter plot before and after a merge] • The two nearest clusters/data points are merged together
  • 29-32. What is Hierarchical Clustering? • This is represented in a tree-like structure called a dendrogram [Figure: dendrogram built up over these slides, adding leaves P1 and P2, then P3 and P4, then P5 and P6]
  • 33. What is Hierarchical Clustering? • We terminate when we are left with only one cluster [Figure: final dendrogram covering P1 to P6]
  • 34. What is Hierarchical Clustering? An algorithm that builds a hierarchy of clusters [Figure: scatter plot of P1 to P6 with its dendrogram] How do we measure the distance between the data points?
  • 36-40. Distance Measure: The distance measure determines the similarity between two elements and influences the shape of the clusters. Measures covered: Euclidean distance, Squared Euclidean distance, Manhattan distance, Cosine distance
  • 41. Euclidean Distance Measure • The Euclidean distance is the "ordinary" straight-line distance • It is the distance between two points p and q in Euclidean space: $d = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$
  • 42. Squared Euclidean Distance Measure The squared Euclidean distance metric uses the same equation as the Euclidean distance metric, but does not take the square root: $d = \sum_{i=1}^{n} (q_i - p_i)^2$
  • 43. Manhattan Distance Measure The Manhattan distance is the simple sum of the horizontal and vertical components, that is, the distance between two points measured along axes at right angles. For points $p = (p_x, p_y)$ and $q = (q_x, q_y)$: $d = |q_x - p_x| + |q_y - p_y|$; in n dimensions, $d = \sum_{i=1}^{n} |q_i - p_i|$
  • 44. Cosine Distance Measure The cosine similarity measures the angle $\theta$ between the two vectors p and q: $\cos\theta = \frac{\sum_{i=1}^{n} p_i q_i}{\sqrt{\sum_{i=1}^{n} p_i^2}\,\sqrt{\sum_{i=1}^{n} q_i^2}}$; the corresponding cosine distance is $d = 1 - \cos\theta$
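The four distance measures from slides 41-44 can be sketched in a few lines of plain Python. This is an illustrative sketch, not code from the deck: the function names and the sample points are assumptions, and cosine distance is taken as one minus the cosine similarity.

```python
from math import sqrt

def squared_euclidean(p, q):
    """Sum of squared coordinate differences (no square root)."""
    return sum((qi - pi) ** 2 for pi, qi in zip(p, q))

def euclidean(p, q):
    """Straight-line distance between p and q."""
    return sqrt(squared_euclidean(p, q))

def manhattan(p, q):
    """Sum of absolute coordinate differences along each axis."""
    return sum(abs(qi - pi) for pi, qi in zip(p, q))

def cosine_distance(p, q):
    """1 - cosine similarity; undefined if either vector has zero length."""
    dot = sum(pi * qi for pi, qi in zip(p, q))
    norm_p = sqrt(sum(pi * pi for pi in p))
    norm_q = sqrt(sum(qi * qi for qi in q))
    return 1.0 - dot / (norm_p * norm_q)

p, q = (1, 2), (4, 6)  # hypothetical points, chosen for easy arithmetic
print(euclidean(p, q))          # 5.0
print(squared_euclidean(p, q))  # 25
print(manhattan(p, q))          # 7
print(cosine_distance(p, q))    # small value: the vectors point in similar directions
```

Squared Euclidean distance preserves the ordering of Euclidean distances, so it is a cheaper choice when only the nearest pair matters.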
  • 46. What is Agglomerative Clustering? Agglomerative clustering begins with each element as a separate cluster and merges them into larger clusters
  • 47-49. What is Agglomerative Clustering? There are three key questions that need to be answered: How do we represent a cluster of more than one point? How do we determine the nearness of clusters? When do we stop combining clusters?
  • 50. What is Agglomerative Clustering? Let’s assume that we have 6 points in a Euclidean space: (1,2), (2,1), (0,0), (4,1), (5,3), (5,0)
  • 51. What is Agglomerative Clustering? How do we represent a cluster of more than one point?
  • 52. What is Agglomerative Clustering? We make use of centroids; a centroid is the average of its cluster’s points
  • 53-59. What is Agglomerative Clustering? [Figure: the 6 points with cluster centroids added as merges proceed: (1.5,1.5), (4.5,0.5), (1,1), (4.7,1.3)]
  • 60. What is Agglomerative Clustering? When do we stop combining clusters?
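The centroid-based merging on slides 50-59 can be reproduced with a short sketch. It assumes that cluster nearness is the Euclidean distance between centroids and that merging stops at a chosen number of clusters k; the function names are illustrative and not taken from the deck.

```python
from math import dist

def centroid(points):
    """Average of the points, coordinate by coordinate."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def agglomerate(points, k):
    """Merge the two clusters with the nearest centroids until k clusters remain."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    merge_centroids = []
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:  # ties keep the first pair found
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
        merge_centroids.append(centroid(merged))
    return clusters, merge_centroids

points = [(1, 2), (2, 1), (0, 0), (4, 1), (5, 3), (5, 0)]  # the six points from slide 50
clusters, merge_centroids = agglomerate(points, k=2)
print(clusters)
print([tuple(round(v, 1) for v in c) for c in merge_centroids])
```

With k=2 the rounded merge centroids come out as (1.5, 1.5), (4.5, 0.5), (1.0, 1.0) and (4.7, 1.3), matching the values shown on the slides.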
  • 61. What is Agglomerative Clustering? There are many approaches to deciding when to stop
  • 62-63. What is Agglomerative Clustering? Approach 1: Pick a number of clusters (k) upfront. We decide the number of clusters required at the beginning and terminate when we reach that value (k). Possible challenge: this only makes sense when we know about the data
  • 64-66. What is Agglomerative Clustering? Approach 2: Stop when the next merge would create a cluster with low “cohesion”. We keep clustering until the next merge of clusters would create a bad (low-cohesion) cluster. But how is cohesion defined?
  • 67-68. What is Agglomerative Clustering? Approach 2.1: Diameter of a cluster • Diameter is the maximum distance between any pair of points in the cluster • We terminate when the diameter of a new cluster exceeds the threshold
  • 69-71. What is Agglomerative Clustering? Approach 2.2: Radius of a cluster • Radius is the maximum distance of a point from the centroid • We terminate when the radius of a new cluster exceeds the threshold
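The two cohesion measures above, diameter and radius, can be written down directly from their definitions. In this sketch the helper `merge_allowed` and the threshold values are hypothetical; the deck does not say how the threshold is chosen.

```python
from itertools import combinations
from math import dist

def centroid(points):
    """Average of the points, coordinate by coordinate."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def diameter(points):
    """Maximum distance between any pair of points in the cluster."""
    return max((dist(a, b) for a, b in combinations(points, 2)), default=0.0)

def radius(points):
    """Maximum distance of any point from the cluster centroid."""
    c = centroid(points)
    return max(dist(c, p) for p in points)

def merge_allowed(a, b, threshold, cohesion=diameter):
    """Hypothetical stopping rule: allow a merge only if the merged cluster stays cohesive."""
    return cohesion(a + b) <= threshold

cluster_a = [(1, 2), (2, 1)]
cluster_b = [(0, 0)]
print(diameter(cluster_a + cluster_b))
print(radius(cluster_a + cluster_b))
print(merge_allowed(cluster_a, cluster_b, threshold=2.5))
```

Passing `cohesion=radius` switches the stopping rule from the diameter criterion to the radius criterion without changing the merge loop.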
  • 73. What is Divisive Clustering? The divisive clustering approach begins with the whole set and proceeds to divide it into smaller clusters
  • 74-75. What is Divisive Clustering? Step 1 • Start with a single cluster composed of all the data points. Step 2 • Split it into different clusters
  • 76-77. What is Divisive Clustering? • The split can be done in two ways: 1. Monothetic divisive methods 2. Polythetic divisive methods. What is a monothetic divisive method?
  • 78-81. What is Divisive Clustering? • Obtain all possible splits into two clusters. [Figure: A,B,C,D,E,F split as (A,B | C,D,E,F), (A,D,F | B,C,E) and (A,B,C | D,E,F)]
  • 82-83. What is Divisive Clustering? • For each split, compute the cluster sum of squares • We select the split with the largest sum of squares
  • 84. What is Divisive Clustering? • Let’s assume that the sum of squared distances is largest for the 3rd split
  • 85-89. What is Divisive Clustering? • We divide it into two clusters [Figure: tree in which A,B,C,D,E,F splits into A,B,C and D,E,F; A,B,C splits into A and B,C; D,E,F splits into D and E,F; B,C splits into B and C]
  • 90. What is Divisive Clustering? • We terminate when every data point is its own cluster [Figure: full tree ending in the single points A, B, C, D, E, F]
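One divisive step (enumerate every two-way split, score each, keep the best) can be sketched as follows. The deck does not state which sum of squares it maximises. This sketch assumes the between-cluster sum of squares, which is equivalent to minimising the within-cluster sum of squares because the two always add up to the same total. The coordinates assigned to A to F are hypothetical (borrowed from the six points on slide 50).

```python
from itertools import combinations

def sum_of_squares(points):
    """Sum of squared distances from each point to the cluster centroid."""
    n = len(points)
    center = [sum(c) / n for c in zip(*points)]
    return sum((x - m) ** 2 for p in points for x, m in zip(p, center))

def all_two_way_splits(points):
    """Yield every split into two non-empty clusters, each mirror pair once."""
    others = range(1, len(points))
    for size in range(len(points) - 1):  # how many other points join points[0]
        for chosen in combinations(others, size):
            left = [points[0]] + [points[i] for i in chosen]
            right = [points[i] for i in others if i not in chosen]
            yield left, right

def best_split(points):
    """Split with the smallest within-cluster (largest between-cluster) sum of squares."""
    return min(all_two_way_splits(points),
               key=lambda s: sum_of_squares(s[0]) + sum_of_squares(s[1]))

# Hypothetical coordinates for A..F, for illustration only.
labels = {(0, 0): "A", (1, 2): "B", (2, 1): "C", (4, 1): "D", (5, 3): "E", (5, 0): "F"}
left, right = best_split(list(labels))
print([labels[p] for p in left], [labels[p] for p in right])
```

The number of candidate splits is 2^(n-1) - 1 (31 for six points), so exhaustive enumeration is only practical for very small data sets.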
  • 92-99. Demo: Hierarchical Clustering. Problem Statement • To group petroleum companies based on their sales. Steps • Import the dataset • Create a scatter plot • Normalize the data • Calculate Euclidean distance • Create a dendrogram • Cluster into groups
  • 100. Demo: Hierarchical Clustering. Problem Statement • To group petroleum companies based on their sales. Steps • Output
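The normalisation and distance steps of the demo can be sketched in Python as below. This is not the deck's R code and not its dataset: the company names and sales figures are made up for illustration, and z-score scaling with the sample standard deviation is an assumed choice of normalisation.

```python
from math import dist
from statistics import mean, stdev

# Hypothetical sales figures (made up for illustration; not the demo's dataset).
sales = {
    "Company A": (120.0, 30.0),
    "Company B": (125.0, 32.0),
    "Company C": (300.0, 80.0),
    "Company D": (310.0, 85.0),
}

def normalize(rows):
    """Z-score each column: subtract its mean, divide by its sample standard deviation."""
    stats = [(mean(col), stdev(col)) for col in zip(*rows)]
    return [tuple((x - m) / s for x, (m, s) in zip(row, stats)) for row in rows]

def distance_matrix(rows):
    """Pairwise Euclidean distances between all rows."""
    return [[dist(a, b) for b in rows] for a in rows]

names = list(sales)
scaled = normalize(list(sales.values()))
matrix = distance_matrix(scaled)
for name, row in zip(names, matrix):
    print(name, [round(d, 2) for d in row])
```

Normalising first keeps a column with large raw values from dominating the Euclidean distances; the resulting matrix is the input to the dendrogram and grouping steps.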

Editor's Notes