Decision Tree
Dr. Marwa M. Emam
Faculty of Computers and Information
Minia University
Agenda
 Introduction to Decision Trees
 Basic Structure of Decision Trees
 Decision Tree Components
 How Decision Trees Work
 Splitting Criteria
 Advantages of Decision Trees
 Challenges and Limitations
Decision Tree
 A decision tree is a popular machine learning algorithm used for both
classification and regression tasks.
 Decision trees are trained with labeled data, where the labels that we want
to predict can be classes (for classification) or values (for regression).
 It models decisions based on a series of questions or conditions and their
possible outcomes.
 The structure of a decision tree resembles an inverted tree, where each
internal node represents a decision or test, each branch represents an
outcome of that decision, and each leaf node represents the final decision or
classification.
Decision Tree …
 A decision tree is a rooted tree where:
 Each internal node corresponds to an attribute (feature).
 Each leaf corresponds to a classification outcome.
 Each edge denotes an attribute-value assignment.
 Root: the most dominant (informative) attribute.
Decision Tree …
 Classification trees:
 Tree models where the target variable can take a discrete set of values are
called classification trees. In these tree structures, leaves represent class
labels and branches represent conjunctions of features that lead to those
class labels.
 Regression trees:
 Decision trees where the target variable can take continuous values (real
numbers) like the price of a house, or a patient’s length of stay in a
hospital, are called regression trees.
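As a quick illustration of the two flavors (see the sketch below), scikit-learn exposes both tree types through the same fit/predict interface. This is an assumption for illustration; the slides do not name a library.

```python
# Minimal sketch, assuming scikit-learn is installed.
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification tree: discrete target (class 0 or 1).
X_cls = [[0, 0], [1, 0], [0, 1], [1, 1]]
y_cls = [0, 0, 1, 1]                      # label equals the second feature
clf = DecisionTreeClassifier(criterion="entropy")  # entropy-based splits
clf.fit(X_cls, y_cls)
print(clf.predict([[0, 1]]))              # -> [1]

# Regression tree: continuous target (e.g., a house price).
X_reg = [[1.0], [2.0], [3.0], [4.0]]
y_reg = [100.0, 150.0, 200.0, 250.0]
reg = DecisionTreeRegressor().fit(X_reg, y_reg)
print(reg.predict([[2.5]]))               # the mean of a leaf's training targets
```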
Classification Tree:
 Nodes in the classification tree are identified by the feature
names of the given data.
 Branches in the tree are identified by the values of features.
 The leaf nodes are identified by the class labels.
Decision Tree Structure
 Decision tree: A machine learning model based on yes-or-no questions and
represented by a binary tree. The tree has a root node, decision nodes, leaf nodes,
and branches.
 Root node: The topmost node of the tree. It contains the first yes-or-no question. For
convenience, we refer to it as the root.
 Decision node: Each yes-or-no question in our model is represented by a decision
node, with two branches emanating from it (one for the “yes” answer and one for
the “no” answer).
 Leaf node: A node that has no branches emanating from it. These represent the
decisions we make after traversing the tree. For convenience, we refer to them as
leaves.
 Branch: The two edges emanating from each decision node, corresponding to the
“yes” and “no” answers to the question in the node.
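That structure maps directly onto a small data type; a minimal sketch (the class and field names are illustrative, not from the slides):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """One node of a binary decision tree."""
    question: Optional[str] = None  # decision nodes hold a yes-or-no question
    yes: Optional["Node"] = None    # branch followed on a "yes" answer
    no: Optional["Node"] = None     # branch followed on a "no" answer
    label: Optional[str] = None     # leaf nodes hold the final decision

# Root asks the first question; its branches end in leaves.
root = Node(question="Is it raining?",
            yes=Node(label="stay in"),
            no=Node(label="go out"))
```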
Problem
How to build the tree? Choosing the root node
 In decision tree algorithms, entropy and information gain are concepts used to
determine the best feature to split the data at each internal node.
 Entropy is a measure of disorder or impurity in a set of data. In the context of
decision trees, entropy is used to quantify the homogeneity (or heterogeneity) of
a group of samples with respect to their class labels.
 $E(S) = -P_1 \log_2(P_1) - P_2 \log_2(P_2)$
 where $P_1$ and $P_2$ are the proportions of samples belonging to the two classes.
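For instance, a hypothetical set of 14 samples with 9 in one class and 5 in the other has

$E(S) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.940$

A pure set ($P_1 = 1$) has entropy 0, while an evenly split set ($P_1 = P_2 = 0.5$) has the maximum entropy of 1.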
 Information Gain is a metric used to determine the effectiveness of
a feature in reducing entropy. The goal is to select the feature that
results in the highest information gain when splitting the data.
 $G(S, A) = E(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} E(S_v)$
 where $S$ is the original dataset, $A$ is the feature being considered for
splitting, $S_v$ is the subset of $S$ for which feature $A$ has value $v$, and
$E$ is the entropy.
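Both formulas are short enough to sketch directly in Python (the function and variable names below are illustrative, not from the slides):

```python
import math
from collections import Counter

def entropy(labels):
    """E(S): entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """G(S, A): entropy reduction from splitting on `feature` (a column index)."""
    n = len(labels)
    # Partition the labels by the value the feature takes in each row.
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    weighted = sum((len(sub) / n) * entropy(sub) for sub in subsets.values())
    return entropy(labels) - weighted
```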
Example:
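As an illustration, consider a tiny hypothetical weather dataset (invented here; the slides' own example may differ) and the root the gain criterion selects:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# Hypothetical data: each row is (outlook, windy); the label is "play?".
rows = [("sunny", "yes"), ("sunny", "no"), ("rain", "yes"),
        ("rain", "no"), ("overcast", "no"), ("overcast", "yes")]
labels = ["no", "no", "no", "yes", "yes", "yes"]

for i, name in enumerate(["outlook", "windy"]):
    # Weighted entropy of the subsets produced by splitting on this feature.
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[i], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    print(name, "gain =", round(entropy(labels) - remainder, 3))
```

Here outlook wins (gain ≈ 0.667 vs. ≈ 0.082 for windy), so it would become the root node.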
Algorithm ID3
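ID3 grows the tree greedily: stop when a node is pure or no features remain; otherwise split on the feature with the highest information gain and recurse on each subset. A compact sketch under those assumptions (categorical features only, no pruning; the names are illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    n = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets.values())

def id3(rows, labels, features):
    """Return a nested dict: {feature_index: {value: subtree_or_class_label}}."""
    if len(set(labels)) == 1:          # pure node -> leaf with that class
        return labels[0]
    if not features:                   # no attributes left -> majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    tree = {best: {}}
    rest = [f for f in features if f != best]
    for value in {row[best] for row in rows}:
        keep = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        tree[best][value] = id3([r for r, _ in keep], [l for _, l in keep], rest)
    return tree
```

Called on the toy dataset above as id3(rows, labels, [0, 1]), it returns a tree rooted at outlook, with the rain subtree split further on windy.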
Task
Thanks