Unit 2
Classification
Classification
• Introduction
• Statistical Based Algorithm
• Distance Based Algorithm
• Tree Based Algorithm
• Rule Based Algorithm
• Neural Network Based Algorithm
• Combining Technique
Introduction
• Classification involves mapping of input data to appropriate
classes.
• Def: Given a database D = {t1, t2, ..., tn} of tuples (items,
records) and a set of classes C = {C1, ..., Cm}, the
classification problem is to define a mapping f: D → C where
each ti is assigned to one class. A class, Cj, contains precisely
those tuples mapped to it; that is, Cj = {ti | f(ti) = Cj, 1 ≤ i ≤ n, and
ti ∈ D}.
• The problem is implemented in two phases:
1. Create a specific model by evaluating the training data.
2. Apply the model to classify tuples from the target database.
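The definition and the two phases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not prescribe a particular model, so a nearest-centroid rule is swapped in here, and the names `build_model`, `classify`, the height values, and the class labels are all hypothetical.

```python
from collections import defaultdict

def build_model(training):
    """Phase 1: derive a model from labeled training data.

    Assumption: the model is one centroid (mean value) per class.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, label in training:
        sums[label] += value
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}

def classify(model, value):
    """Phase 2: the mapping f, assigning a tuple to the class with the nearest centroid."""
    return min(model, key=lambda label: abs(model[label] - value))

# Hypothetical training data: (height in metres, class label)
training = [(1.5, "short"), (1.6, "short"), (1.9, "tall"), (2.0, "tall")]
model = build_model(training)

# Target database D; every tuple ends up in exactly one class Cj
D = [1.55, 1.95, 1.7]
classes = defaultdict(list)
for t in D:
    classes[classify(model, t)].append(t)

print(dict(classes))
```

Because `classify` returns exactly one label per tuple, the resulting classes are disjoint and together cover D, which matches the definition of Cj above.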
Introduction
• Issues in Classification:
1. Missing Data
2. Measuring Performance
Missing Data
There are several approaches to handling missing data:
• Ignore the missing data.
• Assume a value for the missing data.
• Assume a special value for the missing data.
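As a rough sketch of the three approaches, the following Python functions each handle a missing field in a different way. The records, field names, and the `"UNKNOWN"` marker are hypothetical, and two choices are assumptions rather than anything the slides specify: "ignore" is read as dropping the affected tuple, and the "assumed value" is taken to be the mean of the known values.

```python
# Hypothetical records; None marks a missing value
records = [
    {"age": 25, "income": 40000},
    {"age": None, "income": 52000},
    {"age": 41, "income": 61000},
]

def ignore_missing(rows, field):
    """Approach 1: drop tuples whose value for `field` is missing."""
    return [r for r in rows if r[field] is not None]

def assume_value(rows, field):
    """Approach 2: fill the gap with an assumed value (here, the mean of known values)."""
    known = [r[field] for r in rows if r[field] is not None]
    mean = sum(known) / len(known)
    return [{**r, field: mean if r[field] is None else r[field]} for r in rows]

def assume_special(rows, field, marker="UNKNOWN"):
    """Approach 3: replace the gap with a special marker treated as its own value."""
    return [{**r, field: marker if r[field] is None else r[field]} for r in rows]

print(ignore_missing(records, "age"))
print(assume_value(records, "age"))
print(assume_special(records, "age"))
```

Each function returns new rows and leaves the original records untouched, so the three approaches can be compared on the same data.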
Measuring Performance and Accuracy
• Classification accuracy is usually calculated by determining the
percentage of tuples placed in the correct class.
• For a specific class, a database tuple may or may not be
assigned to that class, while its actual membership may or may
not be in that class. This gives us four quadrants:
• True positive (TP): 𝑡𝑖 predicted to be in 𝐶𝑗 and is actually in it.
• False positive (FP): 𝑡𝑖 predicted to be in 𝐶𝑗 but is not actually in
it.
• True negative (TN): 𝑡𝑖 not predicted to be in 𝐶𝑗 and is not
actually in it.
• False negative (FN): 𝑡𝑖 not predicted to be in 𝐶𝑗 but is actually in
it.
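The four quadrants and the accuracy percentage can be counted directly from paired actual and predicted labels. The sketch below is a minimal illustration; the function names `quadrant_counts` and `accuracy` and the sample labels are hypothetical and not part of the slides.

```python
def quadrant_counts(actual, predicted, cls):
    """Count TP, FP, TN, FN for one class `cls` over paired labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == cls and a == cls:
            tp += 1   # predicted in the class and actually in it
        elif p == cls:
            fp += 1   # predicted in the class but not actually in it
        elif a == cls:
            fn += 1   # not predicted in the class but actually in it
        else:
            tn += 1   # not predicted in the class and not actually in it
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}

def accuracy(actual, predicted):
    """Percentage of tuples placed in the correct class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * correct / len(actual)

# Hypothetical labels for five tuples
actual    = ["tall", "short", "tall", "short", "medium"]
predicted = ["tall", "tall", "short", "short", "medium"]

print(quadrant_counts(actual, predicted, "tall"))
print(accuracy(actual, predicted))
```

Note that the quadrants are defined per class, so the counts change with the choice of `cls`, while accuracy is a single figure over all classes.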

