NAME : SHUBHAM SHIRKE
TOPIC : INTRODUCTION TO MACHINE LEARNING
EMAIL ID : SHUBHAMSHIRKE31@GMAIL.COM
AGENDA
• Introduction
• Basics
• Classification
• Clustering
• Regression
• Use-Cases
ABOUT
• A subfield of Artificial Intelligence (AI)
• The name comes from the idea that it deals with the
“construction and study of systems that can learn from data”
• Its methods can be seen as building blocks for making computers behave more
intelligently
• It is a general concept rather than a single algorithm: there are various techniques, each with various
implementations.
TERMINOLOGY
• Features
– The distinct traits that can be used to describe each item in a
quantitative manner.
• Samples
– A sample is an item to process (e.g. classify). It can be a document, a picture, a
sound, a video, a row in a database or CSV file, or anything else you can describe with a
fixed set of quantitative traits.
• Feature vector
– An n-dimensional vector of numerical features that represents some object.
• Feature extraction
– Preparation of the feature vector
– Transforms data in a high-dimensional space into a space of fewer dimensions.
• Training/Evaluation set
– A set of data used to discover potentially predictive relationships.
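The terms above can be illustrated with a minimal sketch that turns text samples into feature vectors by counting words. The vocabulary, the sample sentences, and the function name `extract_features` are assumptions made up for this illustration; they do not come from the slides.

```python
from collections import Counter

def extract_features(text, vocabulary):
    """Map one text sample to a fixed-length vector of word counts."""
    counts = Counter(text.lower().split())
    # One numerical feature per vocabulary word, always in the same order.
    return [counts[word] for word in vocabulary]

# Hypothetical vocabulary: each word is one feature (one dimension).
vocabulary = ["ball", "goal", "election", "vote"]

# Hypothetical samples: each document is one item to process.
samples = ["Goal goal what a ball", "Vote in the election"]

vectors = [extract_features(sample, vocabulary) for sample in samples]
print(vectors)
```

Each sample, whatever its length, ends up as a 4-dimensional vector, which is the kind of input most learning algorithms expect.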
APPLE
(LEARNING AND TRAINING)
• Example 1 – Features : red; Type : fruit; Shape : etc.
• Example 2 – Features : sky blue; Type : logo; Shape : etc.
• Example 3 – Features : green; Type : fruit; Shape : etc.
WORKFLOW
SOFTWARE USED FOR MACHINE LEARNING
• TensorFlow
• Shogun
• Apache Mahout
• Apache Spark MLlib
• Oryx 2
CATEGORIES
• Supervised Learning
• Unsupervised Learning
• Semi-Supervised Learning
• Reinforcement Learning
SUPERVISED LEARNING
• The correct classes of the training data are known
UNSUPERVISED LEARNING
• The correct classes of the training data are not known
SEMI-SUPERVISED LEARNING
• A mix of supervised and unsupervised learning: the correct classes are known for only part of the training data
REINFORCEMENT LEARNING
• It allows the machine or software agent to learn its behavior based on feedback
from the environment
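A deliberately simplified sketch of learning from feedback: an agent repeatedly picks one of several actions, observes a reward from the environment, and updates its estimate of each action's value. The fixed rewards, the round-robin exploration schedule, and the name `run_agent` are assumptions chosen so the example is deterministic; real reinforcement learning problems usually involve states and random rewards as well.

```python
def run_agent(rewards, steps=100, explore_every=5):
    """Learn which action pays best using only reward feedback."""
    estimates = [0.0] * len(rewards)  # current value estimate per action
    counts = [0] * len(rewards)       # how often each action was tried
    for t in range(steps):
        if t % explore_every == 0:
            # Explore: cycle through the actions in turn.
            action = (t // explore_every) % len(rewards)
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(rewards)), key=lambda a: estimates[a])
        reward = rewards[action]      # feedback from the environment
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Hypothetical environment: action 1 always pays the most.
estimates, counts = run_agent([0.2, 0.8, 0.5])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, counts)
```

The agent is never told which action is correct; it settles on the best one because that action keeps producing the highest reward.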
MACHINE LEARNING TECHNIQUES
• Classification: predict a class from observations
• Clustering: group observations into “meaningful” groups
• Regression (prediction): predict a value from observations
CLASSIFICATION
• Classify a document into a predefined category.
• Documents can be text or images.
• A popular method is the Naive Bayes classifier.
• Steps:
– Step 1 : Train the program (build a model) using a training set in which each document is labeled with a
category, e.g. sports, cricket, news.
– For each word, the classifier computes the probability that it
makes a document belong to each of the considered categories.
– Step 2 : Test the model against a test data set.
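The two steps above can be sketched as a small multinomial Naive Bayes classifier written from scratch. The training sentences are invented for illustration, and add-one (Laplace) smoothing is an assumption of this sketch rather than something the slides specify.

```python
import math
from collections import Counter, defaultdict

def train(documents):
    """Step 1: build a model from (text, category) pairs."""
    doc_counts = Counter()              # documents per category
    word_counts = defaultdict(Counter)  # word frequencies per category
    vocabulary = set()
    for text, category in documents:
        doc_counts[category] += 1
        for word in text.lower().split():
            word_counts[category][word] += 1
            vocabulary.add(word)
    return doc_counts, word_counts, vocabulary

def classify(model, text):
    """Return the category with the highest log-probability score."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for category in doc_counts:
        # Prior: how common the category is in the training set.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[category][word]
            # Add-one smoothing avoids log(0) for unseen word/category pairs.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        scores[category] = score
    return max(scores, key=scores.get)

# Hypothetical labeled training set.
training_set = [
    ("the team won the match", "sports"),
    ("the batsman scored a century", "cricket"),
    ("the minister announced a new policy", "news"),
    ("the striker scored a late goal", "sports"),
]
model = train(training_set)

# Step 2: test the model on a document it has not seen.
prediction = classify(model, "a late goal won the match")
print(prediction)
```

Working with log-probabilities instead of multiplying raw probabilities keeps the scores from underflowing on longer documents.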
CLUSTERING
• Clustering is the task of grouping a set of objects in such a way that objects in the same group
(called a cluster) are more similar to each other than to objects in other groups
• The groups are not predefined
• E.g. these keywords
– “man’s shoe”
– “women’s shoe”
– “women’s t-shirt”
– “man’s t-shirt”
can be clustered into 2 categories: “shoe” and “t-shirt”, or
“man” and “women”
• Popular methods are K-means clustering and hierarchical clustering
K-MEANS CLUSTERING
• Partitions n observations into k clusters in which each observation belongs to the cluster with the
nearest mean, which serves as a prototype of the cluster.
• http://pypr.sourceforge.net/kmeans.html
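A minimal from-scratch sketch of the idea, alternating between assigning each observation to its nearest mean and recomputing the means. The 2-D points, the hand-picked starting centroids, and the fixed iteration count are assumptions for illustration; practical implementations usually choose starting centroids randomly and stop when assignments no longer change.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

def kmeans(points, centroids, iterations=10):
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical data: two visibly separate groups, k = 2.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```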
HIERARCHICAL CLUSTERING
• A method of cluster analysis that seeks to build a hierarchy of clusters.
• There are two strategies:
– Agglomerative:
• This is a "bottom up" approach: each observation starts in its own cluster, and pairs of clusters are
merged as one moves up the hierarchy.
• Time complexity is O(n^3) in the general case
– Divisive:
• This is a "top down" approach: all observations start in one cluster, and splits are performed recursively
as one moves down the hierarchy.
• Time complexity is O(2^n) with exhaustive search
• http://en.wikipedia.org/wiki/Hierarchical_clustering
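The agglomerative ("bottom up") strategy can be sketched naively as follows. Single linkage (distance between the two closest members), 1-D integer data, and stopping at a target number of clusters are all assumptions of this sketch; the slides do not name a linkage criterion.

```python
def agglomerative(points, target_clusters):
    """Merge the two closest clusters until target_clusters remain."""
    clusters = [[p] for p in points]  # every observation starts alone
    while len(clusters) > target_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return clusters

# Hypothetical 1-D observations.
result = agglomerative([1, 2, 10, 11, 30], target_clusters=3)
print(result)
```

Because every merge rescans all pairs of clusters, this naive version does cubic work in the number of observations.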
REGRESSION
• Regression measures the relation between the mean value of one variable (e.g. output) and
the corresponding values of other variables (e.g. time and cost)
• Regression analysis is a statistical process for estimating the relationships among
variables.
• In machine learning, regression means predicting an output value using a model fitted to training data.
• A popular one is logistic regression (binary regression), which models the probability of a two-valued outcome
• http://en.wikipedia.org/wiki/Logistic_regression
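As a rough sketch, logistic regression with a single input can be fitted by gradient descent on the log-loss. The data (hours studied versus pass/fail), the learning rate, and the number of epochs are assumptions picked for illustration, not values from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit weight w and bias b by batch gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / len(xs)
        b -= learning_rate * grad_b / len(xs)
    return w, b

def predict(x, w, b):
    """Estimated probability that the outcome for input x is 1."""
    return sigmoid(w * x + b)

# Hypothetical training data: hours studied and whether the exam was passed.
hours = [1, 2, 3, 4, 5, 6]
passed = [0, 0, 0, 1, 1, 1]

w, b = fit(hours, passed)
print(predict(1, w, b), predict(6, w, b))
```

The model outputs a probability between 0 and 1; thresholding it at 0.5 turns the prediction into a yes/no decision.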
USE-CASES (CONTD.)
• Text Summarization : Google News
• Rating a Review/Comment : Yelp
• Fraud Detection : Credit card providers
• Decision Making : e.g. Bank/Insurance sector
• Sentiment Analysis
• Speech Understanding : iPhone with Siri
• Face Detection : Facebook’s photo tagging