4. Motivation
• The main motivations of this project are:
1. According to the WHO, by 2030 at least 23.6 million people will be suffering from
heart disease.
2. To reduce this number, we want to help doctors identify patients with heart disease.
3. To identify the symptoms and problems faced by patients with heart disease.
4. To develop an efficient model for predicting heart disease using machine learning
algorithms.
6. Dataset
• The dataset consists of data from 304 people.
• It has 14 different attributes.
• The attributes include age, sex, cholesterol, ECG, etc.
7. K-Nearest Neighbour
• K-Nearest Neighbour (K-NN) is one of the simplest machine learning
algorithms based on the supervised learning technique.
• The K-NN algorithm assumes that a new case is similar to the
available cases and puts the new case into the category whose
existing cases it most resembles.
• The K-NN algorithm stores all the available data and classifies a
new data point based on similarity. This means that when new
data appears, it can easily be classified into a well-suited
category using the K-NN algorithm.
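The idea of "store everything, then classify by similarity" can be sketched in a few lines of Python. This is only an illustrative sketch: the function name knn_predict, the choice of Euclidean distance, k=3, and the tiny (age, cholesterol) data set are all assumptions made up for the example and are not taken from the project's dataset.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every stored training point.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most common label among those neighbours is the prediction.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up toy data: (age, cholesterol) -> 1 = heart disease, 0 = no heart disease.
train_X = [(35, 180), (40, 190), (42, 200), (60, 280), (65, 300), (58, 290)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (62, 285)))  # query close to the label-1 group
print(knn_predict(train_X, train_y, (38, 185)))  # query close to the label-0 group
```

On real data the attributes usually have very different scales, so they would normally be standardised before distances are computed.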
9. Random Forest
• As the name suggests, "Random Forest is a classifier that
contains a number of decision trees on various subsets of
the given dataset and takes the average to improve the
predictive accuracy of that dataset."
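A rough sketch of "many trees on various subsets, then combine" is given below. It is a simplified stand-in, not a full Random Forest: each "tree" is a one-split decision stump trained on a bootstrap sample and a single randomly chosen attribute, and the combination step is a majority vote (the usual form of "averaging" for classification). The names fit_stump, fit_forest and predict_forest, and the toy data, are hypothetical.

```python
import random
from collections import Counter

def fit_stump(X, y, feature):
    """Fit a one-split tree (stump) on a single feature."""
    values = sorted({row[feature] for row in X})
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    majority = Counter(y).most_common(1)[0][0]
    # Fallback stump: always predict the majority class of the sample.
    best = (sum(label != majority for label in y), float("inf"), majority, majority)
    for threshold in candidates:
        for left, right in ((0, 1), (1, 0)):
            errors = sum(
                (left if row[feature] <= threshold else right) != label
                for row, label in zip(X, y)
            )
            if errors < best[0]:
                best = (errors, threshold, left, right)
    return (feature,) + best[1:]

def fit_forest(X, y, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own random subset of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw rows with replacement.
        idx = [rng.randrange(len(X)) for _ in X]
        sample_X = [X[i] for i in idx]
        sample_y = [y[i] for i in idx]
        # Each stump looks at one randomly chosen attribute.
        feature = rng.randrange(len(X[0]))
        forest.append(fit_stump(sample_X, sample_y, feature))
    return forest

def predict_forest(forest, row):
    """Combine the individual stump predictions by majority vote."""
    votes = Counter(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]

# Made-up toy data: (age, cholesterol) -> 1 = heart disease, 0 = no heart disease.
X = [(35, 180), (40, 190), (42, 200), (60, 280), (65, 300), (58, 290)]
y = [0, 0, 0, 1, 1, 1]

forest = fit_forest(X, y)
print(predict_forest(forest, (62, 285)))
print(predict_forest(forest, (38, 185)))
```

A single stump trained on an unlucky sample can be wrong, but the vote across many stumps is much more stable, which is the effect the definition above describes.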
10. Decision Tree
• Non-linear classifier
• Easy to use
• Easy to interpret
• Susceptible to overfitting, but this can be mitigated (for example by pruning or limiting the tree depth).
11. Anatomy of a decision tree
[Figure: decision tree for the weather example]
Outlook
    sunny -> Humidity
        high -> No
        normal -> Yes
    overcast -> Yes
    rain -> Windy
        true -> No
        false -> Yes
• Each node is a test on one attribute.
• Each branch is a possible attribute value of the node.
• Leaves are the decisions.
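The anatomy described on this slide (nodes test one attribute, branches are attribute values, leaves are decisions) maps directly onto a nested data structure. The sketch below encodes the Outlook / Humidity / Windy tree by hand; the assignment of Yes/No to individual leaves follows the standard textbook version of this weather example and is an assumption, as are the names tree and classify.

```python
# Internal node: (attribute name, {attribute value: subtree}).
# Leaf: a plain string holding the decision.
tree = ("Outlook", {
    "sunny": ("Humidity", {"high": "No", "normal": "Yes"}),
    "overcast": "Yes",
    "rain": ("Windy", {"true": "No", "false": "Yes"}),
})

def classify(node, example):
    """Walk from the root to a leaf, testing one attribute per node."""
    while not isinstance(node, str):
        attribute, branches = node
        # Follow the branch that matches the example's value for this attribute.
        node = branches[example[attribute]]
    return node

day = {"Outlook": "sunny", "Humidity": "normal", "Windy": "false"}
print(classify(tree, day))
```

A learned tree for the heart disease data would have the same shape, with attributes such as age or cholesterol at the nodes and "disease" / "no disease" at the leaves.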
12. Conclusion
• Using the above methods, we can classify whether or not a
person has heart disease.