International Journal of Computer Science, Engineering and Applications (IJCSEA) Vol. 7, No. 5, October 2017
DOI: 10.5121/ijcsea.2017.7501
APPLYING MACHINE LEARNING TECHNIQUES TO FIND
IMPORTANT ATTRIBUTES FOR HEART FAILURE
SEVERITY ASSESSMENT
Puram Surya Prudvi (1) and Ershad Sharifahmadian (2)
(1) Department of Computer Engineering, Masters in Computer Engineering, University of
Houston Clear Lake, Houston, Texas, USA
(2) Department of Computer Engineering, Visiting Asst Professor, University of Houston
Clear Lake, Houston, Texas, USA
ABSTRACT
The diagnosis of heart disease depends mostly on a combination of clinical and pathological data, and it
directly affects the quality of medical care provided to the patient. In this paper, three machine learning
(ML) techniques, Classification and Regression Tree (CART), Neural Networks (NN), and Support Vector
Machine (SVM), are used to find the best attributes for estimating the severity of heart failure. The data
is collected from three different sources, and each input attribute used for assessing the severity of heart
failure is analyzed individually after applying the machine learning techniques. Finally, the most
important supportive attributes are presented, with which medical staff can identify heart failure
severity faster and more accurately. By screening these important attributes, clinicians can make
better decisions about the right treatment procedures or preventive actions that reduce the risk of heart attacks.
KEYWORDS
Congestive Heart Failure (CHF), Decision support system, Heart failure severity, Machine learning, Risk
classification
1. INTRODUCTION
Heart failure (HF) occurs when the heart does not supply enough blood to meet the general needs
of the body. Breathlessness, insufficient sleep, excessive tiredness, and swelling of the legs are
some of the symptoms of heart failure. Heart failure is not the same as a heart attack, which is
caused by damage to the heart muscle. Some of the common causes of heart failure are heart attack,
high blood pressure, excess alcohol consumption, drug consumption, cigarette smoking, and atrial
fibrillation. High output of blood can also cause heart failure. When the amount of blood pumped by
the heart is greater than the typical amount and the heart is not able to keep up, high-output heart
failure occurs, which can be termed congestive heart failure (CHF). A person affected by CHF
usually has substantial symptoms such as shortness of breath and chest pain.
Heart failure management aims at prolonging life, reducing symptoms, and increasing activity.
Ongoing management is very important for a person affected by HF. Heart failure severity
assessment and type prediction are important when the patient's condition is analyzed [1]. Thus,
for severity assessment, several symptoms should be observed in patients. The main goal of this
study is to find the important attributes by which the severity of heart failure is better
identified. This goal is pursued by classifying the data using machine learning techniques. Data
with different attributes are given to the ML techniques in different experiments. Then, based on
the results of classification (i.e.
the output of the ML techniques), important attributes are selected. Severity assessment is
especially helpful, for instance, in remote healthcare monitoring. In this paper, severity
assessment is performed using three machine learning techniques: CART (classification and
regression tree), SVM (support vector machine), and NN (neural networks). The performance of the
ML techniques is then evaluated for each attribute (i.e. symptom) separately. An attribute that
helps to assess HF severity more accurately is considered one of the important supportive
attributes.
2. RELATED WORK
Previous works have proposed several approaches for assessing the severity of heart failure and
other diseases, as well as various machine learning approaches for classification.
Gabriele Guidi, Maria Chiara Pettenati, Paolo Melillo et al. [1] proposed a clinical decision
support system for assessing severity. In this system, a management interface is built for heart
failure type prediction and severity assessment, and machine learning techniques are used to
implement the smart functions. P. Melillo, N. De Luca, M. Bracale, and L. Pecchia [5] proposed a
classification-tree-based risk assessment for separating higher-risk patients from lower-risk
patients using long-term heart rate variability measures. T. John Peter and K. Somasundaram [9]
used Naïve Bayes, K-Nearest Neighbour, and Decision Tree classifiers for prediction of risk in
heart failure patients. Kavetha B.V. and Venu Gopala Krishnan J. [10] used the CVPartition method
to decide, detect, and classify malignant and benign findings in mammograms. Amiya Halder and
Oyendrila Dobe [17] described fuzzy feature selection with a support vector machine for detecting
tumors in brain MRI.
The present work applies three machine learning approaches to assess the severity of heart
failure, and the important attributes for assessing severity are identified through several
analyses of the data.
3. METHOD
To analyze the importance of supportive attributes in estimating the severity of heart failure, the
outcome of the ML techniques is considered. The ML techniques classify patients by the severity
level of heart failure as i) mild, ii) moderate, and iii) severe. The general block diagram of HF
severity assessment is given in Figure 1.
Classification and Regression Trees (CART) is a classification method that uses historical data to
construct so-called decision trees. The decision trees are then used to classify new data. To use
CART, the number of classes must be known a priori [2], [3].
The three machine learning techniques and their role in classification are as follows:
a. Decision trees classify instances by sorting them down the tree from the root to some
leaf node, which provides the classification of the instance. Each node in the tree
specifies a test of some attribute of the instance, and each branch descending from that
node corresponds to one of the possible values of that attribute. An instance is
classified by starting at the root node of the tree, testing the attribute specified by
this node, and then moving down the tree branch corresponding to the value of the
attribute. This process is then repeated for the subtree rooted at the new node [4].
b. A support vector machine (SVM) builds, from a given set of training data, a model that
assigns new examples to a category; SVMs are used for classification or regression
analysis [5]. The SVM model represents the training examples as points in space, mapped
so that the examples of two distinct categories are separated by a clear gap.
c. Neural network learning is a computational approach based on artificial neural
networks, large collections of neural units that loosely model the way the brain solves
problems with large clusters of biological neurons [4]. A feed-forward neural network
is implemented, with the number of hidden neurons varied from 2 to 8. The best
configuration for the severity assessment is 8 hidden neurons.
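The traversal described in item (a) can be sketched in a few lines of Python. This is only an illustrative sketch: the `Node` class, the `classify` function, and the toy tree below are hypothetical names and structures introduced here, and the tree is hand-written for demonstration, not learned from any of the databases and not a clinical rule.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """An internal node tests one attribute; a leaf node carries a class label."""
    attribute: Optional[str] = None                      # attribute tested here
    branches: Dict[object, "Node"] = field(default_factory=dict)  # value -> subtree
    label: Optional[str] = None                          # set only on leaf nodes


def classify(node: Node, instance: dict) -> str:
    """Sort an instance down the tree from the root to a leaf."""
    while node.label is None:
        value = instance[node.attribute]   # test the attribute named by this node
        node = node.branches[value]        # follow the branch for that value
    return node.label


# Hypothetical toy tree for illustration only (not derived from the paper's data).
toy_tree = Node(
    attribute="nyha_class",
    branches={
        1: Node(label="mild"),
        2: Node(label="mild"),
        3: Node(
            attribute="atrial_fibrillation",
            branches={False: Node(label="moderate"), True: Node(label="severe")},
        ),
        4: Node(label="severe"),
    },
)

patient = {"nyha_class": 3, "atrial_fibrillation": True}
print(classify(toy_tree, patient))  # severe
```

CART additionally learns which attribute to test at each node from the training data; the sketch covers only how a finished tree assigns a class to a new instance.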
4. RESULTS
In this paper, we used heart disease data from three sources. Database I is an anonymized
database of HF patients with varying degrees of severity, all treated by the Cardiology
Department at the St. Maria Nuova Hospital, Florence, Italy, in the period 2001–2008 [1]. This
database consists of 136 records from 90 patients, including baseline and follow-up data (when
available). At the time of data collection, the specialist physician provided the HF severity
assessment in the desired three levels, i) mild, ii) moderate, and iii) severe, which was
stored in the database. The 12 variables (i.e. attributes) in this database that are used as
input for the machine learning techniques are the following:
1) Anamnestic data: age, gender, and New York Heart Association (NYHA) class.
2) Instrumental data: weight, systolic blood pressure, diastolic blood pressure, EF (Ejection
Fraction), BNP (Brain natriuretic peptide), heart rate, ECG parameters (atrial fibrillation
true/false, left bundle branch block true/false, and ventricular tachycardia true/false).
Database II is the heart disease dataset from the UCI machine learning repository [6], which was
collected at the Cleveland Clinic Foundation.
It contains 303 instances, of which 164 belong to healthy cases and 139 to heart disease cases.
14 clinical features are recorded for each instance. Table I shows the 14 clinical features and
their descriptions.
Table I- Clinical features and their description
Database III is anonymized data collected from 246 patients with 14 attributes, including age,
sex, dyspnea, smoking, dust, respiratory frequency, inhale and exhale time, ECG ST
segmentation, heart rate, peripheral capillary oxygen saturation (SpO2), and systolic and
diastolic blood pressure. The data was collected at Siddhartha Government Medical Hospital,
Vijayawada, India. It provides an overview of the prevalence of Chronic Obstructive Pulmonary
Disease (COPD) in Vijayawada, India, together with insights into the mortality, morbidity, and
etiological determination of COPD, with emphasis on understanding the multidimensional nature of
the problem.
All the data is normalized by eliminating redundant data (for example, the same data stored in
more than one table) and ensuring that data dependencies make sense.
Using the machine learning techniques, we evaluate the contribution of each attribute to
assessing the severity of HF. First, input data is taken from the three sources and the
attributes common to the three datasets are identified. Each remaining attribute is then given
to the ML techniques together with the common attributes, and the performance in assessing HF
severity is observed.
The three machine learning techniques are applied as follows.
We extract the common attributes from the three datasets: age, sex, heart rate, and blood
pressure. Then, we examine the performance of each method in assessing the severity of heart
failure based on each supportive attribute. First, the classification and regression tree (CART)
is examined with the common attributes only. Then each supportive attribute from each
dataset is added to the common attributes and the results are studied separately. Next, the
support vector machine (SVM) and the NN are examined in the same way. The accuracy, precision,
and sensitivity are calculated for each method individually. All tests are run in MATLAB
R2016b.
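The evaluation procedure above (find the common attributes, then add one supportive attribute at a time and score the result) can be outlined as follows. This is a sketch in Python rather than the MATLAB used for the experiments; the function names, the column names, and the `evaluate` callback are assumptions made for illustration. The attribute sets are abbreviated, and the scores in the usage example are placeholders, not results from this study.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple


def common_attributes(datasets: Dict[str, Set[str]]) -> Set[str]:
    """Return the attributes that appear in every dataset."""
    return set.intersection(*datasets.values())


def rank_supportive_attributes(
    common: Sequence[str],
    supportive: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> List[Tuple[str, float, float]]:
    """Score 'common + one supportive attribute' for each attribute, best first.

    `evaluate` is a caller-supplied function (hypothetical here) that trains a
    classifier on the given columns and returns its accuracy.
    """
    baseline = evaluate(list(common))          # common attributes only
    rows = []
    for attr in supportive:
        accuracy = evaluate(list(common) + [attr])
        rows.append((attr, accuracy, accuracy - baseline))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Abbreviated, hypothetical column names for the three databases.
datasets = {
    "database_I": {"age", "sex", "heart_rate", "blood_pressure", "bnp", "weight"},
    "database_II": {"age", "sex", "heart_rate", "blood_pressure", "cholesterol"},
    "database_III": {"age", "sex", "heart_rate", "blood_pressure", "spo2"},
}
common = sorted(common_attributes(datasets))
print(common)  # ['age', 'blood_pressure', 'heart_rate', 'sex']

# Placeholder scores standing in for a real train-and-test run.
placeholder = {"bnp": 0.8, "weight": 0.6}
ranked = rank_supportive_attributes(
    common, ["weight", "bnp"], lambda cols: placeholder.get(cols[-1], 0.5)
)
print([row[0] for row in ranked])  # ['bnp', 'weight']
```

In an actual run, `evaluate` would wrap CART, SVM, or NN training on the selected columns, so the same loop can be reused for all three techniques.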
The accuracy of the three ML techniques on the three datasets is shown in Figures 2, 3, and 4.
For database I, the common attributes are considered first and their role in assessing the
severity of heart failure is evaluated. Then, supportive attributes such as BNP, ECG parameters,
NYHA class, EF, and weight are used together with the common attributes, and the accuracy of
each ML technique is calculated. Among all the supportive attributes from database I, BNP, NYHA
class, and ECG parameters provide higher accuracy than the other input attributes, as shown in
Figure 2.
Fig 2: Graph showing the accuracy of the machine learning techniques for the database I
Fig 3: Graph showing the accuracy of the machine learning techniques for the database II.
Fig 4: Graph showing the accuracy of the machine learning techniques for the database III.
Next, the common attributes from database II are considered, followed by supportive attributes
such as cholesterol, chest pain, ECG parameters, exercise-induced angina, fasting blood sugar,
ST segment, and thalassemia. Among all the supportive attributes, cholesterol, chest pain, ECG
parameters, and ST segment provide higher accuracy, as shown in Figure 3. Last, the ML
techniques are applied to database III, collected from Siddhartha Government Medical Hospital,
Vijayawada, India. The supportive attributes are identified and the role of each attribute in
assessing the severity of heart failure is observed. Of all the supportive attributes,
respiratory frequency (RF), SpO2, ECG parameters, and dyspnea provide higher accuracy, as shown
in Figure 4.
5. DISCUSSION
The important supportive attributes for heart failure severity assessment are selected based not
only on accuracy but also on precision and sensitivity. Precision is observed for each method
and evaluated for each level of severity (i.e. mild, moderate, and severe). The precision for
each level is tested with different supportive attributes added to the common attributes.
Figure 5 shows the variation in precision for each class (i.e. mild, moderate, and severe) when
CART is applied with different supportive attributes.
Fig 5: Graph showing precision values of all classes with respect to the CART technique using the different
supportive attributes along with the common attributes.
Figure 6 shows the sensitivity for each class when CART is applied with different supportive
attributes. In medical diagnosis, test sensitivity is the ability of a test to correctly
identify those with the disease. In other words, sensitivity is the extent to which true
positives are not overlooked. Looking at the accuracy, precision, and sensitivity of the
different supportive attributes under CART, we observed that BNP, chest pain, smoking, ECG
parameters, and dyspnea gave the better results. Therefore, these can be selected as the
important attributes for severity assessment of heart failure when CART is used.
Fig 6: Graph showing sensitivity values of all classes with respect to the CART technique using the
different supportive attributes along with the common attributes.
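For reference, the three measures can be computed from true and predicted severity labels as in the following Python sketch. The one-vs-rest treatment of each class (a class is "positive", the other two are "negative") is an assumption made here, since the exact formulas used in the experiments are not stated; the function names and the example labels are hypothetical and do not come from the databases.

```python
from typing import Dict, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of instances whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> Dict[str, Dict[str, float]]:
    """Precision and sensitivity (recall) per class, computed one-vs-rest."""
    result = {}
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)   # true positives
        fp = sum(t != c and p == c for t, p in pairs)   # false positives
        fn = sum(t == c and p != c for t, p in pairs)   # false negatives
        result[c] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        }
    return result


# Made-up labels for illustration only.
classes = ["mild", "moderate", "severe"]
y_true = ["mild", "mild", "moderate", "moderate", "severe", "severe"]
y_pred = ["mild", "moderate", "moderate", "moderate", "severe", "mild"]

metrics = per_class_metrics(y_true, y_pred, classes)
print(metrics["mild"])      # {'precision': 0.5, 'sensitivity': 0.5}
print(metrics["moderate"]["sensitivity"])  # 1.0
```

In this toy example one severe case is predicted as mild, which lowers the sensitivity of the severe class to 0.5 while its precision stays at 1.0; this is the kind of overlooked true positive that sensitivity is meant to expose.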
SVM and NN are used in the same way, and the supportive attributes are evaluated together with
the common attributes. Figures 7 and 8 show the precision and sensitivity for each level of
severity (i.e. class) when the SVM is applied. With the SVM, we find that BNP, smoking, ECG
parameters, cholesterol, chest pain, and dyspnea are the important supportive attributes.
Fig 7: Graph showing precision values of all classes with respect to the SVM technique using the different
supportive attributes along with the common attributes.
Fig 8: Graph showing sensitivity values of all classes with respect to the SVM technique using the different
supportive attributes along with the common attributes.
Figures 9 and 10 show the precision and sensitivity for each class when the NN is applied. With
the NN, we conclude that ECG parameters, cholesterol, chest pain, smoking, and dyspnea are the
important supportive attributes, as they performed well compared to the other attributes. CART
produced satisfactory results in severity assessment when compared with other studies that
assess HF severity, such as [1], which classifies HF patients into three groups: mild, moderate,
and severe. As shown in the graphs, the accuracy, precision, and sensitivity are calculated for
each supportive attribute individually.
Fig 9: Graph showing precision values of all classes with respect to the NN technique using the different
supportive attributes along with the common attributes.
Based on the results, CART performs best, with an accuracy of 84.4%. Different supportive
attributes are considered for the different machine learning techniques. The SVM has an average
accuracy of 76%, and the neural network has an average accuracy of 78%.
Fig 10: Graph showing sensitivity values of all classes with respect to the NN technique using the different
supportive attributes along with the common attributes.
6. CONCLUSION
This work identifies the main attributes for faster and more accurate assessment of heart
failure severity and disease status in patients. After evaluating selected clinical observations
from patients (i.e. the main attributes), physicians and other health professionals can better
choose the right treatment procedures or preventive actions that reduce the risk of heart
attacks. Three datasets were used, selected to cover a broad range of clinical parameters that
influence heart failure severity assessment. First, we selected the common clinical attributes,
such as age, sex, and heart rate. Three machine learning techniques were then applied to
identify the main supportive attributes. Different ML techniques were used to show that the
identified main attributes are largely independent of the technique; in other words, changing
the classification method (i.e. the ML technique) does not significantly affect the main
supportive attributes. We evaluated the performance of each technique by adding different
clinical attributes individually to the common attributes, and the main clinical attributes were
identified as the important supportive attributes for each technique. With CART, BNP, chest
pain, smoking, ECG parameters, and dyspnea gave the better results. With the SVM, BNP, smoking,
ECG parameters, cholesterol, chest pain, and dyspnea are the important supportive attributes.
With the NN, ECG parameters, cholesterol, chest pain, smoking, and dyspnea are the important
supportive attributes. Among all the methods, CART provided the best results in severity
assessment of heart failure, with an accuracy of 84.4%, while the SVM and the neural network
have average accuracies of 76% and 78%, respectively.
ACKNOWLEDGMENT
The authors would like to thank Dr. Gabriele Guidi, Robert Detrano, and Dr. M.P.P. Bala
Narasimhulu for their support and for providing the databases for this project.
REFERENCES
[1] Gabriele Guidi, Maria Chiara Pettenati, Paolo Melillo, “A Machine Learning System to Improve
Heart Failure Patient Assistance,” IEEE J. biomedical and health informatics, vol. 18, no. 6, pp. 1750-
1756, Nov 2014
[2] Priyanka Pandey, Amita Jain, “A Comparative study of Classification techniques: Support vector
Machine & Decision Trees”, International Conference on Computing for Sustainable Global
Development, pp. 3620-3624, 2016.
[3] Roman Timofeev, “Classification and Regression Trees. Theory and Applications”, Cent. of Appl.
Statis. and Econ., Humboldt Univ., Berlin, Dec. 20, 2004.
[4] s [Online].
[5] P. Melillo, N. De Luca, M. Bracale, and L. Pecchia, “Classification tree for risk assessment in patients
suffering from congestive heart failure via long-term heart rate variability,” IEEE J. Biomed. Health
Inform., vol. 17, pp. 727–733, May 2013.
[6] UCI Machine Learning Repository [homepage on the Internet]. Arlington: The Association; 2006
[updated 1996 Dec 3; cited 2011 Feb 2]. Available from:
http://archive.ics.uci.edu/ml/datasets/Heart+Disease.
[7] M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification
tasks,” Inf. Process. Manage., vol. 45, no. 4, pp. 427–437, Jul. 2009.
[8] Chun-Fu Lin and Sheng-De Wang, “Fuzzy Support Vector Machines”, IEEE Transactions on Neural
Networks, Vol.2, No.2, March 2002.
[9] T.John Peter, K. Somasundaram, “An empirical study on prediction of heart disease using
classification data mining techniques”, IEEE-International Conference On Advances In Engineering,
Science And Management (ICAESM -2012) March 30, pp.514-518, 2012.
[10] Kavetha.BV, Venu Gopala Krishnan.J, “Decide, Detect And Classify Benign And Malignant in
Mammograms using CV-Partitioning Method”, ARPN Journal of Engineering and Applied
Sciences, Vol. 10, No. 10, June 2015.
[11] AH Chen, SY Huang, PS Cheng, EJ Lin, “HDPS: Heart Disease Prediction System”, Computing in
Cardiology, Vol. 38, pp. 557-560, 2011.
[12] Ryusuke Hata, Kazuyuki Muras, “Multi-valued Autoencoders for Multi-Valued Neural Networks”,
IEEE Trans. International Joint Conference on Neural Networks, pp.4412-4417, 2016.
[13] Atul Kumar Pandey, Prabhat Pandey, K.L. Jaiswal, Ashish Kumar Sen, “A Heart Disease Prediction
Model using Decision Tree”, IOSR Journal of Computer Engineering, Vol. 12, Issue 6, pp. 83-86, Aug
2013.
[14] Seyed Saleh Mohaseni Matiar Mohamadyrai, “Heart Arrhythmias Classification via a Sequential
Classifier Using Neural Network, Principal Component Analysis and Heart Rate Variation”, IEEE
Trans.8th International Conference on Intelligent Systems, pp.715-722, 2016.
[15] Udaya Sampath K. Perara Miriya Thanthrige, Jagath Samarabandu Xianbin Wang, “Machine
Learning Techniques for Intrusion Detection on Public Dataset”, IEEE Trans. Canadian Conference
on Electrical and Computer Engineering, 2016.
[16] Fabricio Aparecido Breve, Daniel Carlos Guimaraes Pedronette, “Combined Unsupervised and Semi-
Supervised Learning for Data Classification”, IEEE Trans. International WorkShop on Machine
Learning for Signal Procession, sep.2016.
[17] Amiya Halder, Oyendrila Dobe, “Detection of Tumor in Brain MRI Using Fuzzy Feature Selection
and Support Vector Machine”, International Conference on Advances in Computing,
Communications and Informatics, pp.21-24, sep.2016.
[18] B.Venkatalakshmi, M.V Shivsankar, “Heart Disease Diagnosis Using Predictive Data mining”,
ICIET’14, Volume 3, Issue 3, March 2014.

More Related Content

PDF
IRJET- Heart Disease Prediction System
PDF
E04733639
PDF
Heart Disease Prediction Using Data Mining Techniques
PDF
A Heart Disease Prediction Model using Logistic Regression
PDF
IRJET- A Literature Review on Heart and Alzheimer Disease Prediction
PDF
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase
PDF
The Analysis of Performace Model Tiered Artificial Neural Network for Assessm...
PDF
Automated segmentation and classification technique for brain stroke
IRJET- Heart Disease Prediction System
E04733639
Heart Disease Prediction Using Data Mining Techniques
A Heart Disease Prediction Model using Logistic Regression
IRJET- A Literature Review on Heart and Alzheimer Disease Prediction
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase
The Analysis of Performace Model Tiered Artificial Neural Network for Assessm...
Automated segmentation and classification technique for brain stroke

What's hot (20)

PDF
IRJET- Heart Failure Risk Prediction using Trained Electronic Health Record
PDF
PERFORMANCE ANALYSIS OF MULTICLASS SUPPORT VECTOR MACHINE CLASSIFICATION FOR ...
PDF
Classification of Heart Diseases Patients using Data Mining Techniques
PDF
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
PDF
prediction of heart disease using machine learning algorithms
DOCX
Using AI to Predict Strokes
PDF
IRJET- A System to Detect Heart Failure using Deep Learning Techniques
PDF
Heart Disease Prediction Using Associative Relational Classification Techniq...
PDF
Heart Disease Prediction using Machine Learning Algorithm
PDF
IRJET- Role of Different Data Mining Techniques for Predicting Heart Disease
DOCX
Heart disease prediction system
PDF
A data mining approach for prediction of heart disease using neural networks
PDF
Heart Disease Identification Method Using Machine Learnin in E-healthcare.
PDF
IRJET- Genetic Algorithm for Feature Selection to Improve Heart Disease Predi...
PDF
Heart disease prediction
PDF
IRJET- The Prediction of Heart Disease using Naive Bayes Classifier
PDF
A Heart Disease Prediction Model using Decision Tree
PPTX
Data mining techniques on heart failure diagnosis
PPTX
Stroke Prediction
PDF
A Survey on Heart Disease Prediction Techniques
IRJET- Heart Failure Risk Prediction using Trained Electronic Health Record
PERFORMANCE ANALYSIS OF MULTICLASS SUPPORT VECTOR MACHINE CLASSIFICATION FOR ...
Classification of Heart Diseases Patients using Data Mining Techniques
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
prediction of heart disease using machine learning algorithms
Using AI to Predict Strokes
IRJET- A System to Detect Heart Failure using Deep Learning Techniques
Heart Disease Prediction Using Associative Relational Classification Techniq...
Heart Disease Prediction using Machine Learning Algorithm
IRJET- Role of Different Data Mining Techniques for Predicting Heart Disease
Heart disease prediction system
A data mining approach for prediction of heart disease using neural networks
Heart Disease Identification Method Using Machine Learnin in E-healthcare.
IRJET- Genetic Algorithm for Feature Selection to Improve Heart Disease Predi...
Heart disease prediction
IRJET- The Prediction of Heart Disease using Naive Bayes Classifier
A Heart Disease Prediction Model using Decision Tree
Data mining techniques on heart failure diagnosis
Stroke Prediction
A Survey on Heart Disease Prediction Techniques
Ad

Similar to APPLYING MACHINE LEARNING TECHNIQUES TO FIND IMPORTANT ATTRIBUTES FOR HEART FAILURE SEVERITY ASSESSMENT (20)

PDF
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...
PDF
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DIS...
PDF
A hybrid model for heart disease prediction using recurrent neural network an...
PDF
Prediction of Heart Disease Using Data Mining Techniques- A Review
PDF
Comparing Data Mining Techniques used for Heart Disease Prediction
PDF
Acute coronary-syndrome-prediction-using-data-mining-techniques--an-application
PDF
238_heartdisease (1).pdf
PPTX
Artificial Intelligence And Machine Learning In Healthcare: A Cardiovascular ...
PDF
Heart Attack Prediction System Using Fuzzy C Means Classifier
PDF
Survey on data mining techniques in heart disease prediction
DOCX
DOCUMENT-Effective Heart Disease Prediction Using Hybrid Machine Learning Tec...
PDF
COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI...
PDF
Hybrid CNN and LSTM Network For Heart Disease Prediction
PDF
Heart disease prediction by using novel optimization algorithm_ A supervised ...
PPT
javed_prethesis2608 on predcition of heart disease
PDF
Heart disease classification using optimized Machine learning algorithms.pdf
PDF
An Ill-identified Classification to Predict Cardiac Disease Using Data Cluste...
PDF
Diagnosis of some diseases in medicine via computerized experts system
PDF
A data mining approach for prediction of heart disease using neural networks
PDF
A data mining approach for prediction of heart disease using neural networks
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DISE...
EVALUATING THE ACCURACY OF CLASSIFICATION ALGORITHMS FOR DETECTING HEART DIS...
A hybrid model for heart disease prediction using recurrent neural network an...
Prediction of Heart Disease Using Data Mining Techniques- A Review
Comparing Data Mining Techniques used for Heart Disease Prediction
Acute coronary-syndrome-prediction-using-data-mining-techniques--an-application
238_heartdisease (1).pdf
Artificial Intelligence And Machine Learning In Healthcare: A Cardiovascular ...
Heart Attack Prediction System Using Fuzzy C Means Classifier
Survey on data mining techniques in heart disease prediction
DOCUMENT-Effective Heart Disease Prediction Using Hybrid Machine Learning Tec...
COMPARISON AND EVALUATION DATA MINING TECHNIQUES IN THE DIAGNOSIS OF HEART DI...
Hybrid CNN and LSTM Network For Heart Disease Prediction
Heart disease prediction by using novel optimization algorithm_ A supervised ...
javed_prethesis2608 on predcition of heart disease
Heart disease classification using optimized Machine learning algorithms.pdf
An Ill-identified Classification to Predict Cardiac Disease Using Data Cluste...
Diagnosis of some diseases in medicine via computerized experts system
A data mining approach for prediction of heart disease using neural networks
A data mining approach for prediction of heart disease using neural networks
Ad

Recently uploaded (20)

PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Modernizing your data center with Dell and AMD
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
Unlocking AI with Model Context Protocol (MCP)
PPTX
A Presentation on Artificial Intelligence
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PDF
Encapsulation_ Review paper, used for researhc scholars
PPTX
20250228 LYD VKU AI Blended-Learning.pptx
PDF
cuic standard and advanced reporting.pdf
PDF
KodekX | Application Modernization Development
PPTX
Big Data Technologies - Introduction.pptx
PPTX
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
PDF
CIFDAQ's Market Insight: SEC Turns Pro Crypto
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
Electronic commerce courselecture one. Pdf
Diabetes mellitus diagnosis method based random forest with bat algorithm
Advanced methodologies resolving dimensionality complications for autism neur...
Modernizing your data center with Dell and AMD
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
Unlocking AI with Model Context Protocol (MCP)
A Presentation on Artificial Intelligence
“AI and Expert System Decision Support & Business Intelligence Systems”
Mobile App Security Testing_ A Comprehensive Guide.pdf
Per capita expenditure prediction using model stacking based on satellite ima...
Encapsulation_ Review paper, used for researhc scholars
20250228 LYD VKU AI Blended-Learning.pptx
cuic standard and advanced reporting.pdf
KodekX | Application Modernization Development
Big Data Technologies - Introduction.pptx
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
CIFDAQ's Market Insight: SEC Turns Pro Crypto
Understanding_Digital_Forensics_Presentation.pptx
Electronic commerce courselecture one. Pdf

APPLYING MACHINE LEARNING TECHNIQUES TO FIND IMPORTANT ATTRIBUTES FOR HEART FAILURE SEVERITY ASSESSMENT

  • 1. International Journal of Computer Science, Engineering and Applications (IJCSEA) Vol. 7, No. 5, October 2017 DOI: 10.5121/ijcsea.2017.7501 1 APPLYING MACHINE LEARNING TECHNIQUES TO FIND IMPORTANT ATTRIBUTES FOR HEART FAILURE SEVERITY ASSESSMENT Puram Surya Prudvi 1 and Ershad Sharifahmadian 2 1 Department of Computer Engineering, Masters in Computer Engineering, University of Houston Clear Lake, Houston, Texas, USA 2 Department of Computer Engineering, Visiting Asst Professor, University of Houston Clear Lake, Houston, Texas, USA ABSTRACT The diagnosis of heart disease depends mostly on the combination of clinical and pathological data. It leads to the quality of medical care provided for the patient. In this paper, three machine learning (ML) techniques −Classification and Regression tree (CART), Neural Networks (NN), and Support vector machine (SVM)− are utilized to find the best attributes for estimating the severity of heart failure. The data is collected from three different resources, then each input attribute used for assessing the severity of heart failure is analyzed individually after implementing the machine learning techniques. Finally, the most important supportive attributes are presented in this paper by which medical staffs can identify heart failure severity fast and more accurately. In fact, by screening important attributes, clinicians can make better decision about right treatment procedures or preventive actions that reduce risk of heart attacks. KEYWORDS Congestive Heart Failure (CHF), Decision support system, Heart failure severity, Machine learning, Risk classification 1. INTRODUCTION Heart failure (HF) occurs due to an insufficient supply of blood from the heart. To meet the general body needs a certain amount of blood is necessary. Breathlessness, insufficient sleep, excessive tiredness, swelling of legs are some of the symptoms of the heart failure. 
Heart failure is not the same as a heart attack, which is caused by damage to the heart muscle. Common causes of heart failure include heart attack, high blood pressure, excess alcohol consumption, drug consumption, cigarette smoking, and atrial fibrillation. High output of blood can also cause heart failure: when the amount of blood pumped by the heart is greater than the typical amount and the heart is not able to keep up, high-output heart failure occurs, which can be termed congestive heart failure (CHF). A person affected by CHF usually has substantial symptoms such as shortness of breath and chest pain. Heart failure management includes prolonging life, reducing symptoms, and increasing activity. Ongoing management is very important for a person affected by HF. Heart failure severity assessment and type prediction are important when the patient's condition is analyzed [1]. Thus, for the severity assessment, several symptoms should be observed in patients. The main goal of this study is to find the important attributes by which the severity of heart failure is better assessed. This goal is pursued by classifying data with machine learning techniques. Data with different attributes are given to the ML techniques in different experiments. Then, based on the results of classification (i.e.
  • 2. output of ML techniques) important attributes are selected. The severity assessment is especially helpful, for instance, in remote healthcare monitoring. In this paper, the severity assessment is done using three machine learning techniques: CART (classification and regression tree), SVM (support vector machine), and NN (neural networks). The performance of the ML techniques is then evaluated for each attribute (i.e. symptom) separately. An attribute that helps to assess the HF severity more accurately is considered one of the important supportive attributes.
2. RELATED WORK
Previous works propose several approaches for assessing the severity of heart failure and other diseases, and various machine learning approaches for classification. Gabriele Guidi, Maria Chiara Pettenati, Paolo Melillo et al. [1] proposed a clinical decision support system for assessing severity. In that system, a management interface is built for heart failure type prediction and severity assessment, and its smart functions are implemented with machine learning techniques. P. Melillo, N. De Luca, M. Bracale, and L. Pecchia [5] proposed a classification-tree-based risk assessment that separates higher-risk patients from lower-risk patients using long-term heart rate variability measures. T.John Peter and K. Somasundaram [9] used Naïve Bayes, K-Nearest Neighbour, and Decision Tree classifiers to predict risk in heart failure patients. Kavetha.BV and Venu Gopala Krishnan.J [10] used the CV-partition method to decide, detect, and classify benign and malignant findings in mammograms. Amiya Halder and Oyendrila Dobe [17] described fuzzy feature selection with a support vector machine for detecting tumors in brain MRI.
The present work describes three machine learning approaches for assessing the severity of heart failure, and the important attributes for assessing severity are identified from several insights into the data.
3. METHOD
To analyze the importance of supportive attributes in estimating the severity of heart failure, the outcome of the ML techniques is considered. The ML techniques classify patients by the severity level of the heart failure as i) mild, ii) moderate, and iii) severe. The general block diagram of HF severity assessment is given in Figure 1. Classification and Regression Trees (CART) is a classification method that uses historical data to construct decision trees. The decision trees are then used to classify new data. To use CART, the number of classes must be known a priori [2], [3]. We now discuss the three machine learning techniques and their role in classification:
  • 3. a. Decision trees classify instances by sorting them down the tree from the root to some leaf node, which provides the classification of the instance. Each node in the tree specifies a test of some attribute of the instance, and each branch descending from that node corresponds to one of the possible values of that attribute. An instance is classified by starting at the root node of the tree, testing the attribute specified by this node, and then moving down the tree branch corresponding to the value of the attribute. This process is then repeated for the subtree rooted at the new node [4].
b. Support vector machine (SVM) is a method for classification and regression analysis that, given a set of training data, builds a model that assigns new examples to a category [5]. The SVM model represents the training examples as points in space, mapped so that the examples of two distinct categories are separated by a clear gap.
c. Neural network learning is a computational approach based on a large collection of artificial neural units, loosely modeled on the way the brain solves problems with large clusters of biological neurons [4]. A feed-forward neural network is implemented, with the number of hidden neurons varied from 2 to 8. The best configuration for the severity assessment is 8 neurons.
4. RESULTS
In this paper, we used heart disease data from three sources. Database I is an anonymized database of HF patients with varying degrees of severity, all treated by the Cardiology Department at the St. Maria Nuova Hospital, Florence, Italy, in the period 2001–2008 [1]. This database consists of 136 records from 90 patients, including baseline and follow-up data (when available).
At the time of data collection, the specialist physician provided the HF severity assessment in the three levels, i) mild, ii) moderate, and iii) severe, and it was stored in the database. The 12 variables (i.e. attributes) in this database that are used as input for the machine learning techniques are the following:
1) Anamnestic data: age, gender, and New York Heart Association (NYHA) class.
2) Instrumental data: weight, systolic blood pressure, diastolic blood pressure, EF (ejection fraction), BNP (brain natriuretic peptide), heart rate, and ECG parameters (atrial fibrillation true/false, left bundle branch block true/false, and ventricular tachycardia true/false).
Database II comes from the UCI machine learning repository [6] and was collected at the Cleveland Clinic Foundation. It has 303 instances, of which 164 belong to healthy cases and 139 belong to heart disease cases. 14 clinical features were recorded for each instance. Table I shows the 14 clinical features and their description.
  • 4. Table I: Clinical features and their description
Database III is anonymized data collected from 246 patients with 14 attributes such as age, sex, dyspnea, smoking, dust, respiratory frequency, inhale and exhale time, ECG ST segment, heart rate, peripheral capillary oxygen saturation (SpO2), and systolic and diastolic blood pressure. The data was collected from Siddhartha government medical hospital, Vijayawada, India. It provides an overview of the prevalence of Chronic Obstructive Pulmonary Disease (COPD) in Vijayawada, India, gives insights into the mortality, morbidity, and etiological determination of COPD, and emphasizes understanding the multidimensional nature of the problem. All the data are normalized, eliminating redundant data (for example, the same data stored in more than one table) and ensuring that data dependencies make sense. Using the machine learning techniques, we evaluate the contribution of each attribute to assessing the severity of HF. First, input data is taken from the three sources, and the attributes common to the three datasets are extracted. The common attributes are age, sex, heart rate, and blood pressure. Each supportive attribute is then given to the ML techniques along with the common attributes, and the performance in assessing the HF severity is observed. The three machine learning techniques are applied as follows. First, classification and regression tree (CART) is examined with the common attributes alone. Then each supportive attribute from each
  • 5. dataset is added to the common attributes and the results are studied separately. The support vector machine (SVM) and NN are then examined in the same way. The accuracy, precision, and sensitivity are calculated for each method individually. Each test is done using MATLAB R2016b. The accuracy of the three ML techniques on the three datasets is shown in Figures 2, 3, and 4. For database I, the common attributes are considered first and their role in assessing the severity of heart failure is evaluated. Then, supportive attributes such as BNP, ECG parameters, NYHA class, EF rate, and weight are used together with the common attributes, and the accuracy of each ML technique is calculated. Among all the supportive attributes from database I, BNP, NYHA class, and ECG parameters provide higher accuracy than the other input attributes, as shown in Figure 2.
Fig 2: Graph showing the accuracy of the machine learning techniques for database I.
Fig 3: Graph showing the accuracy of the machine learning techniques for database II.
Fig 4: Graph showing the accuracy of the machine learning techniques for database III.
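The evaluation protocol described above (score the common attributes alone, then add one supportive attribute at a time and score again) can be sketched as follows. This is only an illustrative sketch, not the authors' MATLAB code: a nearest-centroid classifier stands in for CART, SVM, and NN, the attribute names `marker` and `noise` are hypothetical, and the tiny pre-scaled dataset is invented so that the loop can be followed by hand.

```python
from collections import defaultdict

COMMON = ["age", "heart_rate"]      # two of the common attributes named in the paper
SUPPORTIVE = ["marker", "noise"]    # hypothetical supportive attributes (assumption)


def rec(age, heart_rate, marker, noise, severity):
    """Build one patient record; all values are invented and already scaled to [0, 1]."""
    return {"age": age, "heart_rate": heart_rate, "marker": marker,
            "noise": noise, "severity": severity}


TRAIN = [
    rec(0.2, 0.5, 0.10, 0.5, "mild"),     rec(0.8, 0.5, 0.20, 0.5, "mild"),
    rec(0.2, 0.5, 0.50, 0.5, "moderate"), rec(0.8, 0.5, 0.60, 0.5, "moderate"),
    rec(0.2, 0.5, 0.90, 0.5, "severe"),   rec(0.8, 0.5, 1.00, 0.5, "severe"),
]
TEST = [
    rec(0.5, 0.5, 0.12, 0.5, "mild"),
    rec(0.5, 0.5, 0.50, 0.5, "moderate"),
    rec(0.5, 0.5, 0.97, 0.5, "severe"),
]


def select(rows, names):
    """Project each record onto the chosen attribute names."""
    return [[row[name] for name in names] for row in rows]


def fit_centroids(X, y):
    """Stand-in classifier: one mean vector per severity class."""
    groups = defaultdict(list)
    for x, label in zip(X, y):
        groups[label].append(x)
    return {label: [sum(col) / len(col) for col in zip(*points)]
            for label, points in groups.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(train, test, names):
    """Train on the selected attributes and return the fraction of correct test predictions."""
    centroids = fit_centroids(select(train, names), [r["severity"] for r in train])
    hits = sum(predict(centroids, x) == r["severity"]
               for x, r in zip(select(test, names), test))
    return hits / len(test)


# Baseline with common attributes only, then one supportive attribute at a time.
results = {"common only": accuracy(TRAIN, TEST, COMMON)}
for attr in SUPPORTIVE:
    results["common + " + attr] = accuracy(TRAIN, TEST, COMMON + [attr])

for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

In this toy data only `marker` separates the severity classes, so adding it raises the accuracy while adding `noise` leaves the baseline unchanged; an attribute that behaves like `marker` is what the paper calls an important supportive attribute.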
  • 6. Next, the common attributes from database II are considered, followed by supportive attributes such as cholesterol, chest pain, ECG parameters, exercise-induced angina, fasting blood sugar, ST segment, and thalassemia. Among all the supportive attributes, cholesterol, chest pain, ECG parameters, and ST segment provide higher accuracy, as shown in Figure 3. Last, the ML techniques are applied to database III, collected from Siddhartha government medical hospital, Vijayawada, India. The supportive attributes are identified, and the role of each attribute in assessing the severity of heart failure is observed. Of all the supportive attributes, respiratory frequency (RF), SpO2, ECG parameters, and dyspnea provide higher accuracy, as shown in Figure 4.
5. DISCUSSION
The important supportive attributes for heart failure severity assessment are selected based not only on accuracy but also on precision and sensitivity. Precision is observed for each method and evaluated for each level of severity (i.e. mild, moderate, and severe). The precision for each level is tested with different supportive attributes along with the common attributes. Figure 5 shows the variation in precision for each class (i.e. mild, moderate, and severe) when CART is implemented with different supportive attributes.
Fig 5: Graph showing precision values of all classes with respect to the CART technique using the different supportive attributes along with the common attributes.
Figure 6 shows the sensitivity of each class when CART is implemented with different supportive attributes. In medical diagnosis, test sensitivity is the ability of a test to correctly identify those with the disease.
In other words, sensitivity is the extent to which true positives are not overlooked. When we examined the accuracy, precision, and sensitivity obtained with different supportive attributes under CART, we noticed that BNP, chest pain, smoking, ECG parameters, and dyspnea gave the best results. Therefore, we select those as the important attributes for severity assessment of heart failure with CART.
Fig 6: Graph showing sensitivity values of all classes with respect to the CART technique using the different supportive attributes along with the common attributes.
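The per-class precision and sensitivity discussed above can be computed from true and predicted severity labels as in the sketch below. It uses the standard definitions, precision = TP / (TP + FP) and sensitivity = TP / (TP + FN), evaluated one class at a time; the label lists are invented for illustration and are not results from the paper.

```python
CLASSES = ["mild", "moderate", "severe"]


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Return precision and sensitivity for each severity class (one-vs-rest)."""
    metrics = {}
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)   # correctly flagged as c
        fp = sum(t != c and p == c for t, p in pairs)   # wrongly flagged as c
        fn = sum(t == c and p != c for t, p in pairs)   # cases of c that were missed
        metrics[c] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        }
    return metrics


def overall_accuracy(y_true, y_pred):
    """Fraction of records whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented example labels (assumption, for illustration only).
y_true = ["mild", "mild", "mild", "moderate", "moderate", "severe", "severe", "severe"]
y_pred = ["mild", "mild", "moderate", "moderate", "moderate", "severe", "severe", "mild"]

metrics = per_class_metrics(y_true, y_pred)
acc = overall_accuracy(y_true, y_pred)
for c in CLASSES:
    print(c, metrics[c])
print("accuracy:", acc)
```

Here one severe case is predicted as mild, which lowers the sensitivity of the severe class to 2/3 even though its precision stays at 1.0; this is why the discussion considers both measures alongside accuracy.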
  • 7. SVM and NN are used in the same way, and the supportive attributes and the common attributes are evaluated. Figures 7 and 8 show the precision and sensitivity for each level of severity (i.e. class) when the SVM is implemented. After implementing the SVM, we find that BNP, smoking, ECG parameters, cholesterol, chest pain, and dyspnea are important supportive attributes.
Fig 7: Graph showing precision values of all classes with respect to the SVM technique using the different supportive attributes along with the common attributes.
Fig 8: Graph showing sensitivity values of all classes with respect to the SVM technique using the different supportive attributes along with the common attributes.
Figures 9 and 10 show the precision and sensitivity for each class when the NN is implemented. After NN implementation, we conclude that ECG parameters, cholesterol, chest pain, smoking, and dyspnea are the important supportive attributes, as they performed well compared to the other attributes. CART produced satisfactory results in severity assessment when compared with other studies that assess HF severity, such as [1], which classifies HF patients into three groups: mild, moderate, and severe. As shown in the graphs, the accuracy, precision, and sensitivity are calculated for each supportive attribute individually.
Fig 9: Graph showing precision values of all classes with respect to the NN technique using the different supportive attributes along with the common attributes.
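The three attribute lists reported for CART, SVM, and NN can be compared directly to see which attributes every technique selects. The short sketch below only restates the lists given in the text as Python sets; the variable names are illustrative.

```python
# Important supportive attributes reported in the text for each technique.
selected = {
    "CART": {"BNP", "chest pain", "smoking", "ECG parameters", "dyspnea"},
    "SVM": {"BNP", "smoking", "ECG parameters", "cholesterol", "chest pain", "dyspnea"},
    "NN": {"ECG parameters", "cholesterol", "chest pain", "smoking", "dyspnea"},
}

# Attributes chosen by all three techniques.
shared = set.intersection(*selected.values())

# Attributes chosen by at least one technique but not by all of them.
partial = set.union(*selected.values()) - shared

print("selected by all techniques:", sorted(shared))
print("selected by some techniques only:", sorted(partial))
```

Chest pain, smoking, ECG parameters, and dyspnea appear in all three lists, while BNP (CART and SVM) and cholesterol (SVM and NN) appear in two of the three.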
  • 8. Based on the results, CART performs best, with an accuracy of 84.4%. For different machine learning techniques, different supportive attributes are considered. The SVM has an average accuracy of 76%. The neural network has an average accuracy of 78%.
Fig 10: Graph showing sensitivity values of all classes with respect to the NN technique using the different supportive attributes along with the common attributes.
6. CONCLUSION
This work identifies the main attributes for fast and more accurate assessment of heart failure severity in patients and of the status of the disease. After evaluating selected clinical observations from patients (i.e. the main attributes), physicians and other health professionals can better choose the right treatment procedures or preventive actions that reduce the risk of heart attacks. First, we selected the common clinical attributes such as age, sex, and heart rate. Three datasets were used; they were selected to comprehensively evaluate a broad range of clinical parameters that influence heart failure severity assessment. The three machine learning techniques were implemented to identify the main supportive attributes. Different ML techniques were used to show that the identified main attributes are independent of the ML technique; in other words, changing the classification method (i.e. the ML technique) does not significantly affect the main supportive attributes. We then evaluated the performance of each technique by adding different clinical attributes individually to the common attributes. After examining the performance of the ML techniques, the main clinical attributes were identified as important supportive attributes for each technique. After CART implementation, we notice that BNP, chest pain, smoking, ECG parameters, and dyspnea gave the best results.
After implementing the SVM, we find that BNP, smoking, ECG parameters, cholesterol, chest pain, and dyspnea are important supportive attributes. After NN implementation, we conclude that ECG parameters, cholesterol, chest pain, smoking, and dyspnea are the important supportive attributes. The SVM has an average accuracy of 76% and the neural network has an average accuracy of 78%. Among all the methods, CART provided the best results in severity assessment of heart failure, with an accuracy of 84.4%.
ACKNOWLEDGMENT
The authors would like to thank Dr. Gabriele Guidi, Robert Detrano, and Dr. M.P.P. Bala Narasimhulu for their support and for providing the databases for this project.
REFERENCES
[1] Gabriele Guidi, Maria Chiara Pettenati, Paolo Melillo, “A Machine Learning System to Improve Heart Failure Patient Assistance,” IEEE J. Biomedical and Health Informatics, vol. 18, no. 6, pp. 1750-1756, Nov 2014.
  • 9. [2] Priyanka Pandey, Amita Jain, “A Comparative study of Classification techniques: Support vector Machine & Decision Trees”, International Conference on Computing for Sustainable Global Development, pp. 3620-3624, 2016.
[3] Roman Timofeev, “Classification and Regression Trees. Theory and Applications”, Cent. of Appl. Statis. and Econ., Humboldt Univ., Berlin, Dec. 20, 2004.
[4] s [Online].
[5] P. Melillo, N. De Luca, M. Bracale, and L. Pecchia, “Classification tree for risk assessment in patients suffering from congestive heart failure via long-term heart rate variability,” IEEE J. Biomed. Health Inform., vol. 17, pp. 727–733, May 2013.
[6] UCI Machine Learning Repository [homepage on the Internet]. Arlington: The Association; 2006 [updated 1996 Dec 3; cited 2011 Feb 2]. Available from: http://archive.ics.uci.edu/ml/datasets/Heart+Disease.
[7] M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,” Inf. Process. Manage., vol. 45, no. 4, pp. 427–437, Jul. 2009.
[8] Chun-Fu Lin and Sheng-De Wang, “Fuzzy Support Vector Machines”, IEEE Transactions on Neural Networks, Vol. 2, No. 2, March 2002.
[9] T.John Peter, K. Somasundaram, “An empirical study on prediction of heart disease using classification data mining techniques”, IEEE-International Conference On Advances In Engineering, Science And Management (ICAESM-2012), March 30, pp. 514-518, 2012.
[10] Kavetha.BV, Venu Gopala Krishnan.J, “Decide, Detect And Classify Benign And Maligant in Mammorgams using CV-Partitioning Method”, ARPN Journal of Engineering and Applied Sciences, Vol. 10, No. 10, June 2015.
[11] AH Chen, SY Huang, PS Cheng, EJ Lin, “HDPS: Heart Disease Prediction System”, Computing in Cardiology, Vol. 38, pp. 557-560, 2011.
[12] Ryusuke Hata, Kazuyuki Muras, “Multi-valued Autoencoders for Multi-Valued Neural Networks”, IEEE Trans.
International Joint Conference on Neural Networks, pp. 4412-4417, 2016.
[13] Atul Kumar Pandey, Prabhat Pandey, K.L. Jaiswal, Ashish Kumar Sen, “A Heart Disease Prediction Model using Decision Tree”, IOSR Journal of Computer Engineering, Vol. 12, Issue 6, pp. 83-86, Aug 2013.
[14] Seyed Saleh Mohaseni Matiar Mohamadyrai, “Heart Arrhythmias Classification via a Sequential Classifier Using Neural Network, Principal Component Analysis and Heart Rate Variation”, IEEE Trans. 8th International Conference on Intelligent Systems, pp. 715-722, 2016.
[15] Udaya Sampath K. Perara Miriya Thanthrige, Jagath Samarabandu Xianbin Wang, “Machine Learning Techniques for Intrusion Detection on Public Dataset”, IEEE Trans. Canadian Conference on Electrical and Computer Engineering, 2016.
[16] Fabricio Aparecido Breve, Daniel Carlos Guimaraes Pedronette, “Combined Unsupervised and Semi-Supervised Learning for Data Classification”, IEEE Trans. International WorkShop on Machine Learning for Signal Procession, Sep. 2016.
[17] Amiya Halder, Oyendrila Dobe, “Detection of Tumor in Brain MRI Using Fuzzy Feature Selection and Support Vector Machine”, International Conference on Advances in Computing, Communications and Informatics, pp. 21-24, Sep. 2016.
[18] B.Venkatalakshmi, M.V Shivsankar, “Heart Disease Diagnosis Using Predictive Data mining”, ICIET’14, Volume 3, Issue 3, March 2014.