An Introduction to Torch: A Machine Learning Library in C++. Analysis and Implementation of the CLARANS K-medoid Clustering Algorithm in the Java Programming Language. Application of K-medoid to Image Processing. By Adeyemi Fowe. CPSC 7375 (Machine Learning), Spring 2008. Instructor: Dr. Mariofanna (Fani) Milanova, Computer Science Department, University of Arkansas at Little Rock. Final Project Presentation.
Torch: www.torch.ch. Usage: powerful and fast; steep learning curve; made for Linux environments; C++ OOP structure but C-style code; open source, distributed as plain-text .cc files. Features: gradient machines, support vector machines, ensemble models, K-nearest neighbors, distributions and classifiers, speech recognition tools.
Torch Structure; Multiple Inheritance
Torch3Vision: built on Torch; solid image processing; more user friendly; more sample code. Supports: pgm, ppm, gif, tif, jpeg. Camera control, e.g. Sony pan/tilt/zoom.
Application; Face Detection
Sample Use;  A Demo on Linux Console?
Clustering (Unsupervised Learning)
Clustering (Unsupervised Learning). Different types of clustering: Partitioning algorithms: K-means, K-medoid. Hierarchical clustering: a tree of clusters rather than disjoint clusters. Density-based clustering: clusters based on regions of concentration. Statistical clustering: statistical techniques such as probability and hypothesis testing.
K-Means & K-medoid
K-Means & K-medoid. K-means clustering uses the exact center of a cluster (the mean, or center of gravity), while K-medoid uses the most centrally located object in a cluster (the medoid). K-medoid is less sensitive to outliers than K-means. In both, the value of K (the number of clusters) has to be determined a priori.
K-medoid Algorithms. PAM (Partitioning Around Medoids) was developed by Kaufman and Rousseeuw (1990). CLARA (Clustering LARge Applications) was designed by Kaufman and Rousseeuw to handle large data sets. CLARANS (Clustering Large Applications based on RANdomized Search): Raymond T. Ng and Jiawei Han (2002).
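The difference between a mean and a medoid is easiest to see on a tiny data set with one outlier. The following is a minimal sketch, not the code from this project; the class name `MeanVsMedoid`, the one-dimensional data, and the use of absolute difference as the distance are all assumptions made for illustration.

```java
final class MeanVsMedoid {
    /** Arithmetic mean: the K-means style centre; it need not be a data point. */
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Medoid: the data point with the smallest total distance to all points. */
    static double medoid(double[] xs) {
        double best = xs[0];
        double bestCost = Double.POSITIVE_INFINITY;
        for (double candidate : xs) {
            double cost = 0.0;
            for (double x : xs) {
                cost += Math.abs(candidate - x);
            }
            if (cost < bestCost) {
                bestCost = cost;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4, 100}; // 100 is an outlier
        System.out.println("mean   = " + mean(values));   // prints mean   = 22.0
        System.out.println("medoid = " + medoid(values)); // prints medoid = 3.0
    }
}
```

The outlier drags the mean to 22, far from every actual value, while the medoid stays at 3, which is one of the data points.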
 
CLARANS Minimum Cost Search. The diagram illustrates the CLARANS algorithm, which performs a randomized search for the minimum cost over the entire data set by swapping one medoid at a time.
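The randomized search can be sketched as follows. This is an illustrative outline in the spirit of Ng and Han's description, not the implementation presented in this project: the class name `ClaransSketch`, the Euclidean distance, and the parameter names `numLocal` (number of restarts) and `maxNeighbor` (failed swaps tolerated before a restart is declared a local minimum) are assumptions.

```java
final class ClaransSketch {
    /** Euclidean distance between two points of equal dimension. */
    static double dist(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Cost of a medoid set: each point pays its distance to the nearest medoid. */
    static double cost(double[][] pts, int[] medoids) {
        double total = 0.0;
        for (double[] p : pts) {
            double nearest = Double.POSITIVE_INFINITY;
            for (int m : medoids) {
                nearest = Math.min(nearest, dist(p, pts[m]));
            }
            total += nearest;
        }
        return total;
    }

    /** True if v occurs among the first len entries of xs. */
    static boolean contains(int[] xs, int len, int v) {
        for (int i = 0; i < len; i++) {
            if (xs[i] == v) return true;
        }
        return false;
    }

    /** Returns medoid indices; requires 0 < k < pts.length and numLocal >= 1. */
    static int[] cluster(double[][] pts, int k, int numLocal, int maxNeighbor,
                         java.util.Random rng) {
        int[] best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int restart = 0; restart < numLocal; restart++) {
            // Start from k distinct, randomly chosen medoids.
            int[] current = new int[k];
            for (int filled = 0; filled < k; ) {
                int c = rng.nextInt(pts.length);
                if (!contains(current, filled, c)) current[filled++] = c;
            }
            double currentCost = cost(pts, current);
            int failures = 0;
            while (failures < maxNeighbor) {
                // A neighbour differs from the current set in exactly one medoid.
                int replacement;
                do {
                    replacement = rng.nextInt(pts.length);
                } while (contains(current, k, replacement));
                int[] neighbor = current.clone();
                neighbor[rng.nextInt(k)] = replacement;
                double neighborCost = cost(pts, neighbor);
                if (neighborCost < currentCost) {
                    current = neighbor;        // move and reset the failure counter
                    currentCost = neighborCost;
                    failures = 0;
                } else {
                    failures++;
                }
            }
            if (currentCost < bestCost) {      // keep the best local minimum found
                bestCost = currentCost;
                best = current;
            }
        }
        return best;
    }
}
```

A call such as `ClaransSketch.cluster(points, 3, 5, 50, new java.util.Random(1))` would return the indices of three medoids; the cost is recomputed from scratch for every candidate swap, which keeps the sketch short but is slower than an incremental update.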
Java Implementation of CLARANS K-medoid Algorithm
To form a cluster (image classification), a medoid has to navigate within this 3-D space to find the closest set of pixels. This makes K-medoid take the pixel gray values into consideration while clustering.
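One way to picture the 3-D space is to turn every pixel into a point (row, column, gray value). The sketch below assumes that reading of the slide; the class name `PixelPoints` and the in-memory `int[][]` image are hypothetical and do not reflect how Torch3Vision delivers the gray values.

```java
final class PixelPoints {
    /** Flattens a gray image into one {row, column, gray} point per pixel. */
    static double[][] toPoints(int[][] gray) {
        int rows = gray.length;
        int cols = gray[0].length;
        double[][] points = new double[rows * cols][];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Row-major order: pixel (r, c) lands at index r * cols + c.
                points[r * cols + c] = new double[] {r, c, gray[r][c]};
            }
        }
        return points;
    }
}
```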
Sample Image
Extracted Gray-Values using TorchVision
3D Plot of Pixel Gray-Values
Gray Image Pixel Map
Spectral and Spatial Pattern Recognition. Spectral pattern recognition refers to the set of spectral radiance measurements obtained in the various wavelength bands for each pixel. Spatial pattern recognition involves the categorization of image pixels on the basis of their spatial relationship with the pixels surrounding them. The aim of this experiment is to delineate the behavior of the K-medoid clustering algorithm while varying these two criteria. We want to show that changing the weight w trades off the spectral and spatial patterns of an image.
Spatial and Spectral Differences. The cost of assigning node i to representative pixel j combines a spatial difference and a spectral difference through a weight w. The weight w serves as a measure of our preference for spatial or spectral pattern recognition. It is a weight metric for the preference structure in MCDA. When w=0: spatial pattern only. When w=1: spectral pattern only. When 0<w<1: both spatial and spectral patterns are considered; a typical MADA.
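The exact cost formula is not preserved in this transcript. As a sketch only, a convex combination reproduces the stated behaviour at w=0 and w=1; the form (1 - w) * spatial + w * spectral, the Euclidean pixel distance, the absolute gray-level difference, and the class name `WeightedPixelCost` are all assumptions rather than the author's definition.

```java
final class WeightedPixelCost {
    /** Spatial difference: Euclidean distance between pixel positions {row, col, gray}. */
    static double spatial(int[] a, int[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Spectral difference: absolute difference of the gray values. */
    static double spectral(int[] a, int[] b) {
        return Math.abs(a[2] - b[2]);
    }

    /** Assumed blend: w = 0 gives spatial only, w = 1 gives spectral only. */
    static double cost(int[] a, int[] b, double w) {
        if (w < 0.0 || w > 1.0) {
            throw new IllegalArgumentException("w must lie in [0, 1]");
        }
        return (1.0 - w) * spatial(a, b) + w * spectral(a, b);
    }

    public static void main(String[] args) {
        int[] i = {0, 0, 10}; // row, column, gray value
        int[] j = {3, 4, 50};
        System.out.println(cost(i, j, 0.0)); // prints 5.0  (spatial only)
        System.out.println(cost(i, j, 1.0)); // prints 40.0 (spectral only)
        System.out.println(cost(i, j, 0.5)); // prints 22.5 (equal weight)
    }
}
```

Because positions and gray values live on different scales, a real experiment would probably need to normalize the two terms before blending them.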
CLARANS Clusters; K=3
Results for Spatial & Spectral
 
Pixel Distance Functions. Reference: Wikipedia.
Chebyshev Distance; Chess Board Distance. http://en.wikipedia.org/wiki/Chebyshev_distance
The Lp Space
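The common pixel distance functions are all members of the Lp (Minkowski) family: p=1 gives the Manhattan distance, p=2 the Euclidean distance, and the limit as p grows gives the Chebyshev (chess board) distance. A minimal sketch follows; the class name `LpDistance` and the convention of passing `Double.POSITIVE_INFINITY` for the Chebyshev case are choices made for illustration.

```java
final class LpDistance {
    /** Chebyshev distance: the largest coordinate-wise difference. */
    static double chebyshev(double[] a, double[] b) {
        double max = 0.0;
        for (int i = 0; i < a.length; i++) {
            max = Math.max(max, Math.abs(a[i] - b[i]));
        }
        return max;
    }

    /** Lp distance for p >= 1; p = infinity selects the Chebyshev distance. */
    static double minkowski(double[] a, double[] b, double p) {
        if (p < 1.0) {
            throw new IllegalArgumentException("p must be >= 1");
        }
        if (Double.isInfinite(p)) {
            return chebyshev(a, b);
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.pow(Math.abs(a[i] - b[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }
}
```

For the points (0, 0) and (3, 4), the Manhattan distance is 7, the Euclidean distance is 5, and the Chebyshev distance is 4, so the same pair of pixels can look nearer or farther depending on the chosen p.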
 
Lp Space and Decision Making
This clearly displays a Manhattan cluster for w=0 (spatial properties only). The decision maker needs to consider how the edges of the clusters should be formed. This decision would most likely be informed by the type of information to be extracted.
Conclusion. We implemented the more efficient CLARANS algorithm for K-medoid using the Java programming language. We used our code to explore the differences between distance functions, the choice of which could be left to the user. We showed that the choice of function should depend on the expected edge orientation of the clusters.
Thank You. Questions?
References
[1] Chan, Y. (2001). Location Theory and Decision Analysis, ITP/South-Western.
[2] Chan, Y. Location, transport and land-use: Modeling spatial-temporal information. Heidelberg, Germany: Springer-Verlag.
[3] Craig M. Wittenbrink, Glen Langdon, Jr., Gabriel Fernandez (1999), Feature Extraction of Clouds from GOES Satellite Data for Integrated Model Measurement Visualization, working paper.
[4] Raymond T. Ng, Jiawei Han, Efficient and Effective Clustering Methods for Spatial Data Mining, Proceedings of the 20th VLDB Conference, Santiago, Chile, 1994.
[5] Osmar R. Zaiane, Andrew Foss, Chi-Hoon Lee, and Weinan Wang, On Data Clustering Analysis: Scalability, Constraints and Validation, working paper.
[6] Gerald J. Dittberner (2001), NOAA’s GOES Satellite System – Status and Plans.
[7] Weather satellites teacher’s guide, published by Environment Canada, ISBN Cat. No. En56-172/2001E-IN 0-662-31474-3.
[8] ArcView user’s manual.
[9] Websites: http://goes2.gsfc.nasa.gov http://www.osd.noaa.gov/sats/goes.htm http://rsd.gsfc.nasa.gov/goes/ http://gtielectronics.com
[10] Images: http://images.ibsys.com/sh/images/weather/auto/2xat_ir_anim.gif http://ali.apple.com/space/space_images/9908212300Bret.jpg http://www.esri-ireland.ie/graphics/products/Image/ArcGIS_diag.jpg http://www.noaanews.noaa.gov/stories2006/images/goes-over-earth2.jpg http://www.slipperybrick.com/wp-content/uploads/2007/08/escape-key.jpg
[11] Torch3vision, Sebastien Marcel and Yann Rodriguez, http://torch3vision.idiap.ch/
[12] R. Collobert, S. Bengio, and J. Mariéthoz. Torch: a modular machine learning software library. Technical Report IDIAP-RR 02-46, IDIAP, 2002.
[13] L. Kaufman and P.J. Rousseeuw, Finding Groups in Data: an Introduction to Cluster Analysis. John Wiley & Sons, 1990.


Machine Learning Project
