1. FEATURE-BASED HEART DISEASE PREDICTION
USING RFCNN ADVANCED MACHINE LEARNING ALGORITHMS
STUDENT DETAILS:
QAMAR BEGUM
ROLL NO. 161022742009
M.E., COMPUTER SCIENCE
PROJECT GUIDE:
Dr. MOHAMMED SANAULLAH QASEEM
PROFESSOR & HOD
CSE DEPARTMENT
3. TABLE OF CONTENTS
• Abstract
• Existing system
• Disadvantages
• Proposed System
• Libraries
• Test Accuracy
• Results
• Application
• Conclusion
• Future Scope
• Literature survey
• System requirement
• Dendrogram
• Use case Diagram
• System Implementation
4. ABSTRACT
• Heart disease is one of the leading causes of death worldwide.
• Predictive healthcare solutions frequently use machine learning (ML) techniques, but the accuracy of most conventional ML models is below 80%.
• In the proposed RFCNN approach, Random Forest serves as a feature selector, identifying the most important features in the dataset and eliminating irrelevant or noisy ones.
• The selected features are then passed to a CNN, which recognizes patterns, correlations, and interactions among them.
5. EXISTING SYSTEM
• Lack of feature selection and poor data representation limit the effectiveness of existing models.
• Most models are sensitive to noise and prone to overfitting.
• Accuracy is often below 80%, which is insufficient for critical medical predictions.
• Little work has been done on identifying the most important features in the dataset.
6. DISADVANTAGES
• Inadequate feature selection results in poor data representation.
• Limited ability to capture non-linear patterns in data.
• High sensitivity to noise and outliers in the dataset.
• Overfitting to training data, leading to poor generalization.
7. PROPOSED SYSTEM
• Applies advanced techniques to address the limitations of existing methods.
• Combines predictions from multiple models to improve robustness.
• The algorithm is designed for high accuracy and scalability.
• Ensures diversity in predictions by splitting the data more randomly.
• This ensemble approach leverages the strengths of each algorithm to deliver higher accuracy, precision, and recall for heart disease prediction.
8. RF ALGORITHMS
[Figure: Random Forest. Training data instances are drawn into random subsets; each decision tree is trained on one subset; the trees' predictions (for example Class A, Class A, Class B) are combined by bagging (majority) to give the prediction output.]
9. STEPS
Step 1: Select K random data points from the training set.
Step 2: Build the decision trees associated with the selected data points (subsets).
Step 3: Choose the number N of decision trees that you want to build.
Step 4: Repeat Steps 1 and 2.
Step 5: For new data points, find the predictions of each decision tree, and assign the new data points to the category that wins the majority vote.
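The steps above can be sketched in a few lines. This is a minimal illustration, not the presenters' implementation: it assumes scikit-learn's `DecisionTreeClassifier` as the base tree, draws the K points with replacement, and uses synthetic data in place of the heart disease dataset. The helper names `fit_forest` and `predict_forest` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def fit_forest(X, y, n_trees=25, k=None, seed=0):
    """Steps 1-4: train n_trees decision trees, each on K randomly drawn points."""
    rng = np.random.default_rng(seed)
    k = len(X) if k is None else k
    trees = []
    for _ in range(n_trees):                       # Steps 3-4: repeat for N trees
        idx = rng.integers(0, len(X), size=k)      # Step 1: K random points (with replacement)
        tree = DecisionTreeClassifier(
            max_features="sqrt", random_state=int(rng.integers(1 << 31))
        )
        trees.append(tree.fit(X[idx], y[idx]))     # Step 2: build a tree on the subset
    return trees

def predict_forest(trees, X):
    """Step 5: every tree votes and the majority class wins."""
    votes = np.stack([t.predict(X) for t in trees]).astype(int)  # (n_trees, n_samples)
    return np.array([np.bincount(col).argmax() for col in votes.T])

# Synthetic stand-in data (assumption): 13 features, binary target.
X, y = make_classification(n_samples=300, n_features=13, n_informative=6, random_state=0)
trees = fit_forest(X[:240], y[:240])
pred = predict_forest(trees, X[240:])
print("held-out accuracy:", (pred == y[240:]).mean())
```

In practice `sklearn.ensemble.RandomForestClassifier` performs the same sampling and voting internally; the explicit loop is shown only to mirror the numbered steps.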
14. LITERATURE SURVEY
Paper title and year: Deep Learning Neural Networks for Predictive Healthcare (2020)
Algorithms: Deep Learning Neural Networks (DLNN)
Advantages:
• High accuracy due to automated feature extraction.
Limitations:
• Computationally expensive.
• Requires extensive labeled data for training.

Paper title and year: Genetic Algorithm and TRFNN for Optimized Healthcare Predictions (2021)
Algorithms: Genetic Algorithms (GA) with Transfer Function Neural Networks (TRFNN)
Advantages:
• Improved feature selection enhances model accuracy.
• Adaptive learning process.
Limitations:
• Computational overhead due to GA.
• Risk of local optima in feature selection.
15. LITERATURE SURVEY
Paper title and year: Hybrid SCNN-SVM for Image-Based Disease Detection (2022)
Algorithms: Sparse Convolutional Neural Network (SCNN) with Support Vector Machines (SVM)
Advantages:
• Reduces computational cost with sparse representations.
Limitations:
• Limited scalability for very large datasets.
• Complexity in model integration.

Paper title and year: PFFBPNN for Predictive Analytics in Healthcare (2020)
Algorithms: Partial Feedback Feedforward Backpropagation Neural Networks (PFFBPNN)
Advantages:
• Reduces training time compared to standard backpropagation.
Limitations:
• Sensitive to hyperparameter tuning.
• Requires large memory resources for computation.
16. LITERATURE SURVEY
Paper title and year: IoT-Enabled Random Forest Model for Remote Health Monitoring (2021)
Algorithms: IoT and Random Forest for real-time monitoring
Advantages:
• Handles high-dimensional data effectively.
Limitations:
• Limited by IoT device bandwidth.
• Dependent on stable internet connectivity.
17. SOFTWARE REQUIREMENTS
• OS: Windows
• Python: Python 3.x and above
• Setuptools and pip to be installed for 3.6.x and above
HARDWARE REQUIREMENTS
• RAM: 1 GB and higher
• Processor: Intel i3 and above
• Hard disk: 5 GB minimum is required
18. DENDROGRAM
[Figure: system flow. Heart disease clinical data → data processing → feature selection → feature normalization → classification → heart disease absent / heart disease present]
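The flow in this diagram maps naturally onto a chained pipeline. The sketch below is one possible reading, assuming scikit-learn: median imputation stands in for "data processing", `SelectFromModel` over a Random Forest for feature selection, `StandardScaler` for normalization, and a logistic regression as a lightweight placeholder for the final classifier (the deck uses a CNN there). The data is synthetic, and all of these choices are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the clinical data (assumption): 13 features, binary target.
X, y = make_classification(n_samples=300, n_features=13, n_informative=6, random_state=0)

pipeline = Pipeline([
    ("process", SimpleImputer(strategy="median")),            # data processing
    ("select", SelectFromModel(                               # feature selection
        RandomForestClassifier(n_estimators=100, random_state=0),
        threshold="median",                                    # keep features at or above median importance
    )),
    ("normalize", StandardScaler()),                          # feature normalization
    ("classify", LogisticRegression(max_iter=1000)),          # placeholder classifier
])
pipeline.fit(X, y)

labels = pipeline.predict(X[:5])   # 0 = heart disease absent, 1 = present
n_kept = int(pipeline.named_steps["select"].get_support().sum())
print(n_kept, "features kept; predictions:", labels)
```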
20. HYBRID APPROACH
Random Forest serves as a feature selector:
• Identifies the most important features.
• Eliminates irrelevant or noisy ones.
CNN is used as the classification engine:
• Recognizes patterns.
• Recognizes correlations.
• Recognizes interactions among the selected features.
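To make the CNN's role concrete, the following sketch shows the forward pass of a very small 1D convolutional classifier over one patient's selected feature vector, written in plain NumPy. The deck does not specify the architecture, so the layer layout (one convolution, ReLU, global max pooling, one dense unit), the sizes, and the random untrained weights are all assumptions; a real model would be trained with a deep learning framework.

```python
import numpy as np

def conv1d(x, kernels, bias):
    """Valid 1D convolution (cross-correlation) of a feature vector with each kernel."""
    width = kernels.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(x, width)  # (n_positions, width)
    return windows @ kernels.T + bias                             # (n_positions, n_kernels)

def cnn_forward(x, params):
    """Conv -> ReLU -> global max pool -> dense -> sigmoid; returns P(disease present)."""
    h = np.maximum(conv1d(x, params["kernels"], params["conv_bias"]), 0.0)
    pooled = h.max(axis=0)                                        # one value per kernel
    logit = pooled @ params["dense_w"] + params["dense_b"]
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(0)
n_selected, n_kernels, width = 8, 4, 3       # hypothetical sizes
params = {
    "kernels": rng.normal(size=(n_kernels, width)),   # untrained, random weights
    "conv_bias": np.zeros(n_kernels),
    "dense_w": rng.normal(size=n_kernels),
    "dense_b": 0.0,
}
x = rng.normal(size=n_selected)              # one patient's selected, normalized features
p = cnn_forward(x, params)
print("P(heart disease present) =", float(p))
```

Each kernel looks at a sliding window of neighbouring features, which is how the network can pick up local interactions among the features that Random Forest kept.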
22. FEATURE REDUCTION
• Random Forest calculates feature importance scores and selects the top-N features.
• This reduces dimensionality and improves Convolutional Neural Network (CNN) efficiency by focusing only on meaningful data.
23. COMBINATION OF STRENGTHS
• Random Forest is robust to overfitting due to its
ensemble structure, making it effective for feature
selection.
• The hybrid model can manage complex relationships
in the data since CNN is excellent at learning complex,
hierarchical patterns.
24. FEATURE SELECTION
Input: The complete dataset with all features and the target variable.
Step 1: Train the Random Forest model.
Step 2: Compute feature importance.
Step 3: Rank features by importance.
Step 4: Select the top features.
Step 5: Output the feature subset.
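The five steps can be expressed directly with scikit-learn's impurity-based `feature_importances_`. This is a sketch under assumptions: the function name `select_top_features`, the value `top_n=8`, and the synthetic dataset are illustrative and do not come from the slides.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def select_top_features(X, y, top_n, seed=0):
    """Return the top_n most important columns of X, their indices, and all scores."""
    forest = RandomForestClassifier(n_estimators=200, random_state=seed).fit(X, y)  # Step 1
    importances = forest.feature_importances_          # Step 2: one score per feature
    ranking = np.argsort(importances)[::-1]            # Step 3: most important first
    top_idx = np.sort(ranking[:top_n])                 # Step 4: keep top-N, in column order
    return X[:, top_idx], top_idx, importances         # Step 5: the feature subset

# Synthetic stand-in for the full dataset (assumption): 13 features, binary target.
X, y = make_classification(n_samples=300, n_features=13, n_informative=6, random_state=0)
X_top, top_idx, importances = select_top_features(X, y, top_n=8)
print("selected columns:", top_idx)
print("reduced shape:", X_top.shape)
```

The reduced matrix `X_top` is what would be normalized and handed to the classifier in the hybrid model.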
29. CONCLUSION
• The proposed model predicts heart disease successfully, achieving superior performance compared to other machine learning models.
• An 80/20 split of the dataset gives the highest accuracy, 83%, underscoring the model's effectiveness in heart disease prediction.
• RFCNN can handle complex medical datasets and provide reliable predictions.
• The Flask-based implementation ensures a user-friendly interface for medical professionals and patients, with seamless integration into healthcare systems.
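For readers who want to reproduce this kind of evaluation, the sketch below shows an 80/20 stratified split followed by accuracy, precision, and recall, assuming scikit-learn. It uses synthetic data and a plain Random Forest as a stand-in model, so its numbers are not comparable to the 83% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 500 samples, 13 features, binary target.
X, y = make_classification(n_samples=500, n_features=13, n_informative=6, random_state=0)

# 80% of the rows for training, 20% held out for testing; stratify keeps class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
y_pred = model.predict(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```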
30. FUTURE SCOPE
• Enhance the model's performance by incorporating additional features like genetic data or lifestyle factors.
• Expand the system to predict other cardiovascular conditions for broader applicability.
• Explore the integration of blockchain for secure and decentralized patient data management.
• Collaborate with healthcare providers to validate the system in real-world clinical environments.