 Data-Applied.com: Decision
Introduction
Decision trees let you construct decision models. They can be used for forecasting, classification, or decision-making. At each branch the data is split based on a particular field of the data. Decision trees are constructed using divide-and-conquer techniques.
Divide-and-Conquer: Constructing Decision Trees
Steps to construct a decision tree recursively:
1. Select an attribute to place at the root node and make one branch for each of its possible values.
2. Repeat the process recursively at each branch, using only those instances that actually reach the branch.
3. If at any time all instances at a node have the same classification, stop developing that part of the tree.
Problem: how do we decide which attribute to split on?
Divide-and-Conquer: Constructing Decision Trees
Steps to find the attribute to split on:
1. Consider every available attribute as a candidate and branch it according to its different possible values.
2. For each candidate attribute, calculate the Information of the resulting branches and then the Information Gain for that attribute.
3. Select for division the attribute that gives the maximum Information Gain.
Repeat this until each branch terminates at a node whose Information = 0.
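To make the recursion concrete, here is a minimal Python sketch of this divide-and-conquer procedure. It is only an illustration: the dataset format (a list of dicts), the function names (build_tree, entropy) and the majority-class fallback are our own assumptions, not Data Applied's implementation.

    from collections import Counter
    from math import log2

    def entropy(labels):
        # Information, in bits, of a list of class labels
        total = len(labels)
        return -sum(c / total * log2(c / total) for c in Counter(labels).values())

    def build_tree(rows, target, attributes):
        # rows: list of dicts, target: name of the class field, attributes: candidate fields
        labels = [row[target] for row in rows]
        if len(set(labels)) == 1 or not attributes:
            # All instances share one class (information = 0), or nothing is left to split on
            return Counter(labels).most_common(1)[0][0]

        def info_after(attr):
            # Weighted information of the branches created by splitting on attr;
            # the smallest value corresponds to the maximum information gain
            subsets = [[r for r in rows if r[attr] == v] for v in {r[attr] for r in rows}]
            return sum(len(s) / len(rows) * entropy([r[target] for r in s]) for s in subsets)

        best = min(attributes, key=info_after)
        remaining = [a for a in attributes if a != best]
        return (best, {v: build_tree([r for r in rows if r[best] == v], target, remaining)
                       for v in {r[best] for r in rows}})

Calling build_tree on the 14 weather instances used in the following slides, with the four attributes listed there (outlook, temperature, humidity, windy), would reproduce the splits described below.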
Divide-and-Conquer: Constructing Decision Trees
Calculation of Information and Gain:
For probabilities (P1, P2, P3, ..., Pn) such that P1 + P2 + P3 + ... + Pn = 1:
Information(P1, P2, ..., Pn) = -P1 log P1 - P2 log P2 - P3 log P3 - ... - Pn log Pn (logs are to base 2, so the result is in bits)
Gain = Information before division - Information after division
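As a small illustration (our own helper, not part of Data Applied), the Information formula can be written directly in Python using base-2 logarithms:

    from math import log2

    def information(probs):
        # Information (entropy) in bits for probabilities P1..Pn that sum to 1
        return -sum(p * log2(p) for p in probs if p > 0)   # treat 0 * log(0) as 0

Gain is then simply the information before the division minus the weighted average information of the branches after it; the worked numbers on the later slides can be reproduced with this helper.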
Divide-and-Conquer: Constructing Decision Trees
Example:
Here we consider each attribute individually. Each attribute is divided into branches according to its different possible values. Below each branch, the number of instances of each class is marked.
Divide-and-Conquer: Constructing Decision Trees
Calculations:
Using the formula for Information, initially we have:
Number of instances with class = Yes: 9
Number of instances with class = No: 5
So P1 = 9/14 and P2 = 5/14
Info([9/14, 5/14]) = -9/14 log(9/14) - 5/14 log(5/14) = 0.940 bits
Now, as an example, let's consider the Outlook attribute; we observe the following:
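Using the information() helper sketched above, this initial value can be checked numerically:

    print(f"{information([9/14, 5/14]):.3f} bits")   # 0.940 bits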
Divide-and-Conquer: Constructing Decision Trees
Example contd.:
Gain from using Outlook for division = info([9,5]) - info([2,3], [4,0], [3,2]) = 0.940 - 0.693 = 0.247 bits
Gain(outlook) = 0.247 bits
Gain(temperature) = 0.029 bits
Gain(humidity) = 0.152 bits
Gain(windy) = 0.048 bits
Since Outlook gives the maximum gain, we use it for the division.
We then repeat the steps for Outlook = Sunny and Outlook = Rainy, and stop for Overcast since its Information = 0.
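The same helper reproduces the Outlook figures. The branch class counts [2,3], [4,0] and [3,2] come from the slide above, and the weights 5/14, 4/14 and 5/14 are the fractions of instances reaching each branch:

    before = information([9/14, 5/14])                    # 0.940 bits
    after = ((5/14) * information([2/5, 3/5])
             + (4/14) * information([4/4, 0/4])
             + (5/14) * information([3/5, 2/5]))          # 0.693 bits
    print(f"gain(outlook) = {before - after:.3f} bits")   # 0.247 bits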
Divide-and-Conquer: Constructing Decision Trees
Highly branching attributes: the problem
If we follow the previously described method, it will always favor the attribute with the largest number of branches. In extreme cases it will favor an attribute that has a different value for each instance, such as an identification code.
Divide-and-Conquer: Constructing Decision Trees
Highly branching attributes: the problem
The Information left after splitting on such an attribute is 0:
info([0,1]) + info([0,1]) + info([0,1]) + ... + info([0,1]) = 0
It will therefore have the maximum gain and will be chosen for branching. But such an attribute is neither good for predicting the class of an unknown instance nor does it tell us anything about the structure of the division. So we use the gain ratio to compensate for this.
Divide-and-Conquer: Constructing Decision Trees
Highly branching attributes: Gain ratio
Gain ratio = gain / split info
To calculate the split info, for each attribute value we simply count the number of instances it covers, irrespective of their class. For an identification code with 14 different values we have:
split info = info([1,1,1,...,1]) = 14 x (-1/14 x log(1/14)) = 3.807
For Outlook the split info is:
info([5,4,5]) = -5/14 x log(5/14) - 4/14 x log(4/14) - 5/14 x log(5/14) = 1.577
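Again using the information() helper from above, the split info and gain ratio figures can be verified; the probabilities here ignore the class and reflect only how many of the 14 instances reach each branch:

    split_id      = information([1/14] * 14)            # 3.807 bits for the identification code
    split_outlook = information([5/14, 4/14, 5/14])     # 1.577 bits for Outlook
    print(f"gain ratio(outlook) = {0.247 / split_outlook:.3f}")   # roughly 0.157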
Decision using Data Applied’s web interface
Step 1: Selection of data
Step 2: Selecting Decision
Step 3: Result
Visit more self-help tutorials
Pick a tutorial of your choice and browse through it at your own pace.

The tutorials section is free, self-guiding and will not involve any additional support.
Visit us at www.dataminingtools.net