Clustering
Clustering is the task of dividing a population or set of data points into groups such that points in the same group are more similar to one another than to points in other groups. In simple words, the aim is to find groups with similar traits and assign them to clusters.
Overview
Let’s understand this with an example. Suppose you are the head of a rental store and wish to understand the preferences of your customers in order to scale up your business. Is it possible for you to look at the details of each customer and devise a unique business strategy for each one of them? Definitely not. What you can do is cluster all of your customers into, say, 10 groups based on their purchasing habits and use a separate strategy for the customers in each of these 10 groups. This is what we call clustering.
Types of Clustering
Hard Clustering: In hard clustering, each data point either belongs to a cluster completely or does not belong to it at all. In the example above, each customer is put into exactly one of the 10 groups.
Soft Clustering: In soft clustering, instead of putting each data point into a single cluster, each data point is assigned a probability or likelihood of belonging to each cluster. In the example above, each customer is assigned a probability of being in each of the 10 clusters of the rental store.
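To make the hard/soft distinction concrete, here is a minimal Python sketch. It assumes the cluster centers are already known, and it turns distances into membership probabilities by normalizing exp(-squared distance). That weighting rule, the function names, and the sample data are illustrative assumptions, not something the slides prescribe.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def hard_assign(point, centers):
    """Hard clustering: return the index of the single nearest center."""
    distances = [squared_distance(point, c) for c in centers]
    return distances.index(min(distances))

def soft_assign(point, centers):
    """Soft clustering: return one membership probability per center.

    Assumption for illustration: weights are exp(-squared distance),
    normalized so that the probabilities sum to 1.
    """
    distances = [squared_distance(point, c) for c in centers]
    smallest = min(distances)
    # Subtracting the smallest distance keeps the exponentials well scaled
    # without changing the normalized result.
    weights = [math.exp(-(d - smallest)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data: two cluster centers and one customer.
centers = [(0.0, 0.0), (4.0, 0.0)]
customer = (1.0, 0.0)

hard_label = hard_assign(customer, centers)   # a single cluster index
soft_probs = soft_assign(customer, centers)   # one probability per cluster
```

The hard assignment returns one label, while the soft assignment returns a full list of probabilities, so a customer halfway between two centers would receive equal membership in both.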
Types of clustering algorithms
Since the task of clustering is subjective, there are many ways to achieve this goal. Every methodology follows a different set of rules for defining the ‘similarity’ among data points. Common families of models are:
Connectivity models
Centroid models
Distribution models
Density models
K-means clustering
K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms. ... In other words, the K-means algorithm identifies k centroids and then allocates every data point to the nearest centroid, while keeping the distances between the points and their cluster’s centroid as small as possible.
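The procedure can be sketched in a few lines of plain Python. This is a minimal illustration of the standard assign-then-update loop (often called Lloyd's algorithm), assuming squared Euclidean distance and initial centroids drawn at random from the data. The function names, the `seed` parameter, and the sample points are hypothetical choices made for the example.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def k_means(points, k, max_iterations=100, seed=0):
    """Return (centroids, labels) after alternating assignment and update steps."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return centroids, labels

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
```

Which group ends up with label 0 and which with label 1 depends on the random start, so only the grouping itself is meaningful, not the label values.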
Hierarchical clustering
Hierarchical clustering, also known as hierarchical cluster analysis, is an algorithm that groups similar objects into clusters. The endpoint is a set of clusters in which each cluster is distinct from every other cluster and the objects within each cluster are broadly similar to each other.
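One common way to build such clusters is bottom-up (agglomerative): start with every object in its own cluster and repeatedly merge the two closest clusters. The sketch below assumes single linkage (cluster distance is the distance between the closest pair of points) and squared Euclidean distance. Both choices, the function names, and the data are assumptions made for illustration. It recomputes all pairwise distances on every merge, so it is only suitable for a handful of points.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def single_linkage(cluster_a, cluster_b):
    """Distance between two clusters: that of their closest pair of points."""
    return min(squared_distance(p, q) for p in cluster_a for q in cluster_b)

def agglomerative(points, num_clusters):
    """Merge the two closest clusters until num_clusters remain."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    while len(clusters) > num_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i; j > i, so index i stays valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
two_clusters = agglomerative(points, num_clusters=2)
```

Stopping the merge loop at a different `num_clusters` corresponds to cutting the merge tree (the dendrogram) at a different height. Because no random choice is involved, repeated runs on the same data give the same clusters.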
Difference between K Means and Hierarchical clustering
Hierarchical clustering can’t handle big data well, but K Means clustering can. This is because the time complexity of K Means is linear in the number of data points, i.e. O(n) for a fixed number of clusters and iterations, while that of hierarchical clustering is at least quadratic, i.e. O(n^2).
In K Means clustering, since we start with a random choice of clusters, the results produced by running the algorithm multiple times might differ, whereas the results of hierarchical clustering are reproducible.
K Means is found to work well when the shape of the clusters is hyperspherical (like a circle in 2D or a sphere in 3D).
K Means clustering requires prior knowledge of K, i.e. the number of clusters you want to divide your data into. In hierarchical clustering, by contrast, you can stop at whatever number of clusters you find appropriate by interpreting the dendrogram.
Applications of Clustering
Clustering has a large number of applications spread across various domains. Some of the most popular applications of clustering are:
Recommendation engines
Market segmentation
Social network analysis
Search result grouping
Medical imaging
Image segmentation
Anomaly detection
Stay tuned: topics for the next post
Classification and regression trees (CART)
Neural Networks
