International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 05 | May 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 6937
Disease Prediction Using Machine Learning
Akash C. Jamgade, Prof. S. D. Zade
Student, Dept. of Computer Science and Engineering, Priyadarshini Institute of Engineering & Technology,
Nagpur, Maharashtra, India
Professor, Dept. of Computer Science and Engineering, Priyadarshini Institute of Engineering & Technology,
Nagpur, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Healthcare is one important application area for
machine learning algorithms. Medical facilities need to
advance so that better decisions about patient diagnosis and
treatment options can be made. Machine learning in
healthcare helps people process large and complex medical
datasets and turn them into clinical insights, which
physicians can then use in providing medical care. Machine
learning, when implemented in healthcare, can therefore lead
to increased patient satisfaction. In this paper, we try to
implement machine learning functionality for healthcare in a
single system. Healthcare can be made smarter when, instead
of diagnosis alone, disease prediction is implemented using
machine learning predictive algorithms. In some cases early
diagnosis of a disease is not within reach, and disease
prediction can be applied effectively instead. As the saying
goes, “Prevention is better than cure”: predicting diseases
and epidemic outbreaks would allow a disease to be prevented
early. This paper mainly focuses on the development of a
system, in effect an immediate medical provision, that
incorporates the symptoms collected from multisensory devices
together with other medical data and stores them in a
healthcare dataset. This dataset is then analyzed using the
K-mean machine learning algorithm to deliver results with the
highest possible accuracy.
Key Words: Big Data, healthcare, Machine learning, K-mean
algorithm, etc.
1. INTRODUCTION
Disease prediction from patient treatment history and health
data, using data mining and machine learning techniques, has
been an ongoing challenge for the past decades. Many works
have applied data mining techniques to pathological data or
medical profiles to predict specific diseases. Some of these
approaches tried to predict the recurrence of disease; others
try to predict the control and progression of disease. The
recent success of deep learning in disparate areas of machine
learning has driven a shift towards machine learning models
that can learn rich, hierarchical representations of raw data
with little preprocessing and produce more accurate results.
With the development of big data technology, more attention
has been paid to disease prediction from the perspective of
big data analysis; various studies have selected
characteristics automatically from large amounts of data,
rather than relying on previously selected characteristics,
to improve the accuracy of risk classification.
The main focus is on using machine learning in healthcare to
supplement patient care for better results. Machine learning
has made it easier to identify different diseases and
diagnose them correctly. Predictive analysis with multiple
efficient machine learning algorithms helps to predict
disease more accurately and helps treat patients.
The healthcare industry produces large amounts of healthcare
data daily. Together with treatment history and health data,
it can be used to extract information for predicting diseases
that a patient may develop in the future. This hidden
information in the healthcare data can later be used for
effective decision making about the patient’s health. This
area also needs improvement through the use of informative
healthcare data.
Healthcare is one such field for applying machine learning
algorithms. Medical facilities need to advance so that better
decisions about patient diagnosis and treatment options can
be made. Machine learning in healthcare helps people process
large and complex medical datasets and turn them into
clinical insights, which physicians can then use in providing
medical care. Machine learning, when implemented in
healthcare, can therefore lead to increased patient
satisfaction. The k-mean algorithm is used to predict
diseases using patient treatment history and health data.
2. EXISTING SYSTEM
Prediction using a traditional disease risk model usually
involves a supervised machine learning algorithm, which uses
labeled training data to train the model. Patients in the
test set are then classified into high-risk and low-risk
groups. These models are widely studied, but they are only
valuable in clinical situations. A system for sustainable
health monitoring using smart clothing was presented by Chen
et al. The work thoroughly studied heterogeneous systems and
achieved the best results for cost minimization on the tree
and simple path cases for heterogeneous systems.
Patient statistics, test results, and disease history are
recorded in electronic health records (EHR), which makes it
possible to identify potential data-centric solutions that
reduce the cost of medical case studies. Bates et al. [1]
propose six applications of big data in the healthcare field.
Existing systems can predict diseases but not their subtypes,
and they fail to predict the condition of people. Disease
predictions have so far been non-specific and indefinite.
3. PROPOSED SYSTEM
In this paper, we combine structured and unstructured data
from the healthcare field to assess the risk of disease. A
latent factor model is used to reconstruct the missing data
in medical records collected from the hospital. Using
statistical knowledge, we can determine the major chronic
diseases in a particular region and a particular community.
To handle structured data, we consult hospital experts to
identify useful features.
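The paper does not specify which latent factor model is used, so the
following is only a minimal sketch of the general idea, assuming a plain
matrix factorization fitted by stochastic gradient descent on the observed
entries. The function names, hyperparameters (rank, epochs, lr, reg) and
the tiny example table are hypothetical and chosen for illustration only.

```python
import random


def predict(P, Q, i, j):
    """Estimate entry (i, j) as the dot product of row and column factors."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


def factorize(matrix, rank=2, epochs=2000, lr=0.01, reg=0.01, seed=0):
    """Fit row factors P and column factors Q on the observed entries.

    Missing values are marked with None and are skipped during training.
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    P = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    observed = [
        (i, j, matrix[i][j])
        for i in range(n_rows)
        for j in range(n_cols)
        if matrix[i][j] is not None
    ]
    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j, value in observed:
            error = value - predict(P, Q, i, j)
            for f in range(rank):
                p, q = P[i][f], Q[j][f]
                # Gradient step with L2 regularization on both factors.
                P[i][f] += lr * (error * q - reg * p)
                Q[j][f] += lr * (error * p - reg * q)
    return P, Q


def observed_mse(matrix, P, Q):
    """Mean squared error over the entries that are actually present."""
    errors = [
        (value - predict(P, Q, i, j)) ** 2
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value is not None
    ]
    return sum(errors) / len(errors)


def reconstruct(matrix, P, Q):
    """Keep observed values and fill each missing one with its estimate."""
    return [
        [value if value is not None else predict(P, Q, i, j)
         for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


# Hypothetical record table: rows are patients, columns are measurements.
records = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, None],
    [3.0, None, 9.0],
]
P, Q = factorize(records)
filled = reconstruct(records, P, Q)
```

In this sketch the observed values are left untouched and only the gaps are
filled, so the reconstructed table can be handed on to later steps without
altering what the hospital actually recorded.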
For unstructured text data, we select the features
automatically with the help of the k-mean algorithm. We
propose a k-mean algorithm for both structured and
unstructured data.
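The paper does not describe how the two kinds of data are brought into one
representation. One possible approach, shown here purely as an assumption,
is to scale the structured values, turn each text note into a bag-of-words
count vector, and concatenate the two so that a clustering algorithm can
operate on a single numeric vector per patient. The helper names and the
example records below are hypothetical.

```python
from collections import Counter


def build_vocabulary(notes):
    """Collect the sorted set of lowercase tokens across all notes."""
    return sorted({token for note in notes for token in note.lower().split()})


def text_vector(note, vocabulary):
    """Bag-of-words counts for one note, in vocabulary order."""
    counts = Counter(note.lower().split())
    return [float(counts[token]) for token in vocabulary]


def min_max_scale(rows):
    """Scale every structured column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def combine_features(structured_rows, notes):
    """Concatenate scaled structured values with text counts per patient."""
    vocabulary = build_vocabulary(notes)
    scaled = min_max_scale(structured_rows)
    return [s + text_vector(n, vocabulary) for s, n in zip(scaled, notes)]


# Hypothetical patients: [age, systolic blood pressure] plus a short note.
structured = [[50, 120], [70, 160]]
notes = ["chest pain", "mild cough"]
features = combine_features(structured, notes)
```

Scaling the structured columns first keeps a large-valued measurement from
dominating the distance computation once the vectors are clustered; this is
a design choice of the sketch, not something stated in the paper.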
3.1 The k-means algorithm
The k-means algorithm is a simple iterative method for
partitioning a given dataset into a specified number of
clusters, k. The algorithm has been discovered by several
researchers across different disciplines. It operates on a
set of d-dimensional vectors, D = {x_i | i = 1, ..., N},
where x_i ∈ R^d denotes the i-th data point. The algorithm is
initialized by picking k points in R^d as the initial k
cluster centroids. Techniques for selecting these initial
seeds include sampling at random from the dataset, setting
them as the solution of clustering a small subset of the
data, or perturbing the global mean of the data k times.
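The description above can be sketched in a few lines of Python. This is an
illustrative implementation of the standard k-means iteration (assign each
point to its nearest centroid, then move each centroid to the mean of its
points), not the authors' code. It uses the first seeding technique named
above, random sampling from the dataset, by default; the optional
initial_centroids parameter, the function names and the example points are
assumptions made for this sketch.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two d-dimensional points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, initial_centroids=None, max_iter=100, seed=0):
    """Partition points into k clusters; return (centroids, assignments)."""
    rng = random.Random(seed)
    if initial_centroids is None:
        # Seed by sampling k distinct data points at random.
        initial_centroids = rng.sample(points, k)
    centroids = [list(c) for c in initial_centroids]
    assignments = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the iteration has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, assignments


# Hypothetical 2-dimensional data with two visibly separate groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centroids, labels = kmeans(points, k=2,
                           initial_centroids=[(1.0, 1.0), (8.0, 8.0)])
```

Because the result depends on the initial seeds, it is common to run the
algorithm several times with different seeds and keep the partition with
the smallest total within-cluster squared distance.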
4. SYSTEM ARCHITECTURE
Fig -1: System Architecture
5. CONCLUSION
With the proposed system, higher accuracy can be achieved. We
use not only structured data but also the patient’s text
data, based on the proposed k-mean algorithm. By combining
both kinds of data, the accuracy rate can reach up to 95%.
None of the existing systems and works focuses on using both
data types in the field of medical big data analytics. We
propose a K-Mean clustering algorithm for both structured and
unstructured data. The disease risk model is obtained by
combining both structured and unstructured features.
ACKNOWLEDGEMENT
I express my sincere gratitude to my guide, Prof. S. D. Zade,
for their constant help, encouragement and inspiration
throughout the project work. I would also like to thank the
Head of the Computer Science and Engineering Department, Dr.
P. S. Prasad, whose valuable guidance, ability to motivate me
and willingness to solve difficulties made it possible to
make my project unique and made the task easier. My sincere
thanks to the Principal, Dr. V. M. Nanoti, for providing the
facilities necessary to carry out the work.
REFERENCES
[1] D. W. Bates, S. Saria, L. Ohno-Machado, A. Shah, and G.
Escobar, “Big data in health care: using analytics to
identify and manage high-risk and high-cost patients,”
Health Affairs, vol. 33, no. 7, pp. 1123–1131, 2014.
[2] K. R. Lakshmi, Y. Nagesh, and M. VeeraKrishna,
“Performance comparison of three data mining
techniques for predicting kidney disease survivability,”
International Journal of Advances in Engineering &
Technology, Mar. 2014.
[3] Mr. Chala Beyene, Prof. Pooja Kamat, “Survey on
Prediction and Analysis the Occurrence of Heart Disease
Using Data Mining Techniques,” International Journal of
Pure and Applied Mathematics, 2018.
[4] Boshra Brahmi, Mirsaeid Hosseini Shirvani, “Prediction
and Diagnosis of Heart Disease by Data Mining
Techniques,” Journals of Multidisciplinary Engineering
Science and Technology, vol. 2, 2 February 2015,
pp. 164-168.
[5] A. Singh, G. Nadkarni, O. Gottesman, S. B. Ellis, E. P.
Bottinger, and J. V. Guttag, “Incorporating temporal ehr
data in predictive models for risk stratification of renal
function deterioration,” Journal of biomedical
informatics, vol. 53, pp. 220–228, 2015.
[6] S. Patel and H. Patel, “Survey of data mining techniques
used in healthcare domain,” Int. J. of Inform. Sci. and
Tech., vol. 6, pp. 53-60, March 2016.
