PRESENTED BY
MD. ZABIRUL ISLAM
1507110
MIN CHEN (SENIOR MEMBER, IEEE)
PUBLISHED IN: IEEE JOURNAL & MAGAZINE (VOLUME: 5, PAGES: 8869-8879, APRIL 2017)
Disease Prediction by Machine Learning Over
Big Data From Healthcare Communities
OUTLINE
• Motivation
• Introduction
• Dataset
• Evaluation Methods
• CNN-Based Unimodal Disease Risk Prediction (CNN-UDRP) Algorithm
• Experimental Results
• Conclusion
Department of Computer Science and Engineering 1
Motivation
• In modern society, the incidence of chronic disease is increasing as living standards improve.
• According to a Chinese report in 2015, 86.6% of deaths are caused by chronic disease.
• With the development of big data analytics technology, accurate analysis of medical data can benefit early disease detection.
Introduction
• Existing models use structured data to classify patients as either high risk or low risk.
• But for a complex disease, structured data alone is not a good way to describe the disease.
• We propose a new convolutional neural network (CNN)-based multimodal disease risk prediction algorithm using structured and unstructured data from a hospital.
• In this paper, we mainly focus on the risk prediction of cerebral infarction.
DATASET
• Structured data (S-data): the patient's basic information, such as age, gender and life habits, and laboratory data.
• Unstructured text data (T-data): the patient's narration of his/her illness, and the doctor's interrogation records and diagnosis.
• We use three inputs (S-data, T-data, and S&T-data) to predict whether the patient is at high risk of cerebral infarction.
CNN-BASED UNIMODAL DISEASE RISK PREDICTION (CNN-UDRP) ALGORITHM
• CNN-UDRP: uses only the text data to predict high risk of cerebral infarction.
• CNN-MDRP: uses both structured data and unstructured text data for prediction.
• To process medical text data, we use the CNN-based unimodal disease risk prediction (CNN-UDRP) algorithm, which can be divided into the following five steps.
CONT.
• REPRESENTATION OF TEXT DATA:
Each word in the medical text is represented as a vector, i.e., a word embedding in NLP.
• CONVOLUTION LAYER OF TEXT CNN:
Each time, we take the word vectors of s consecutive words in the text and join them into one row vector as the input representation.
• POOL LAYER OF TEXT CNN:
Select the max value of the n elements of each row, i.e., the feature that plays the key role in the text.
• FULL CONNECTION LAYER OF TEXT CNN
• CNN CLASSIFIER
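The five steps above can be sketched as a single forward pass in plain Python. This is a minimal illustration, not the paper's implementation: the random word vectors and weights, the tanh activation, the softmax classifier, and all names and sizes (`embed`, `convolve`, `max_pool`, `dense`, `softmax`, `predict`, `dim=4`, `n_filters=5`) are assumptions made for the example, and no training is performed.

```python
import math
import random


def embed(tokens, table, dim, rng):
    """Step 1: look up (or create) a dim-sized vector for every word."""
    for tok in tokens:
        if tok not in table:
            # Assumption: random vectors stand in for trained word embeddings.
            table[tok] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return [table[tok] for tok in tokens]


def convolve(vectors, filters, biases, s):
    """Step 2: slide a window of s consecutive word vectors over the text."""
    if len(vectors) < s:
        raise ValueError("text must contain at least s words")
    # Each window is flattened into one row vector of length s * dim.
    windows = [sum(vectors[i:i + s], []) for i in range(len(vectors) - s + 1)]
    # One output row per filter, one element per window position.
    return [
        [math.tanh(sum(w * x for w, x in zip(f, win)) + b) for win in windows]
        for f, b in zip(filters, biases)
    ]


def max_pool(feature_rows):
    """Step 3: keep the largest of the n elements in each row."""
    return [max(row) for row in feature_rows]


def dense(x, weights, biases):
    """Step 4: fully connected layer (one weight row per output unit)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Step 5: turn the two class scores into probabilities."""
    top = max(scores)
    exps = [math.exp(v - top) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(tokens, s=3, dim=4, n_filters=5, seed=0):
    """Run all five steps with untrained, randomly initialised weights."""
    rng = random.Random(seed)
    vectors = embed(tokens, {}, dim, rng)
    filters = [[rng.uniform(-1.0, 1.0) for _ in range(s * dim)]
               for _ in range(n_filters)]
    pooled = max_pool(convolve(vectors, filters, [0.0] * n_filters, s))
    fc_weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_filters)]
                  for _ in range(2)]
    probs = softmax(dense(pooled, fc_weights, [0.0, 0.0]))
    return pooled, probs


# Hypothetical input sentence; the outputs are meaningless without training.
tokens = "patient reports sudden weakness and slurred speech".split()
pooled, probs = predict(tokens)
print(len(pooled), [round(p, 3) for p in probs])
```

The pooled vector has one entry per filter, so the number of filters plays the role of the number of text features extracted by the CNN.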
EXPERIMENTAL RESULTS
• For S-data, we use traditional machine learning algorithms, including decision tree (DT) and naive Bayes (NB), to predict the risk of cerebral infarction.
• DT achieves the highest accuracy (63%) and NB the highest recall (0.80) among the compared algorithms.
• For S-data, NB classification performs best in the experiment.
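The accuracy and recall values quoted in these results are presumably the standard confusion-matrix metrics; the slides do not spell out the definitions, so that is an assumption. A small sketch with made-up labels (1 = high risk, 0 = low risk; the function names are illustrative):

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all patients that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp, fn):
    """Share of truly high-risk patients that were flagged as high risk."""
    return tp / (tp + fn)


# Hypothetical labels, not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, tn, fn), recall(tp, fn))  # 0.75 0.75
```

Recall matters here because a missed high-risk patient (a false negative) is usually the costlier error in screening.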
CONT.
• As the number of iterations increases, the training error rate of CNN-UDRP (T-data) decreases and the test accuracy increases.
• When the number of iterations reaches 70, the training process of the CNN-MDRP (S&T-data) algorithm is already stable.
• We extract 10, 20, ..., 120 features from the text by using the CNN.
• When the number of text features is smaller than 30, the accuracy and recall of the CNN-UDRP (T-data) and CNN-MDRP (S&T-data) algorithms are lower.
CONT.
• The accuracy of CNN-UDRP (T-data) is 0.9420 and the recall is 0.9808.
• The accuracy of CNN-MDRP (S&T-data) is 0.9480 and the recall is 0.99923.
• For S-data alone, the corresponding accuracy is low, at roughly 50%.
• By combining the two types of data, the accuracy reaches 94.80%, giving a better evaluation of the risk of cerebral infarction.
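One simple way to combine the two data types is feature-level fusion: concatenate the structured features with the features extracted from text, then score the joint vector. The sketch below assumes that approach. The slides do not describe the fusion step, and the feature values, weights, bias and the logistic output are hypothetical placeholders rather than trained parameters.

```python
import math


def fuse_features(s_features, t_features):
    """Concatenate structured (S-data) and text-derived (T-data) features."""
    return [float(v) for v in s_features] + [float(v) for v in t_features]


def risk_probability(features, weights, bias):
    """Logistic score over the fused vector (weights are placeholders)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical patient: scaled age, gender flag, smoker flag, plus two
# features assumed to come from the text CNN.
s_data = [0.63, 1.0, 0.0]
t_data = [0.2, 0.9]
fused = fuse_features(s_data, t_data)
p_high_risk = risk_probability(fused, weights=[0.5] * len(fused), bias=-1.0)
print(len(fused), round(p_high_risk, 3))
```

Because fusion happens at the feature level, the same classifier sees both data sources at once instead of combining two separate predictions afterwards.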
CONCLUSION
• For some simple diseases, e.g., hyperlipidemia, only a few features of structured data are enough to describe the disease well.
• But for a complex disease, using only features of structured data is not a good way to describe the disease.
• In this paper, we propose the CNN-MDRP algorithm, which uses structured and unstructured data from a hospital and reaches 94.8% accuracy.