Automating Fetus Health
Monitor Using Machine
Learning
Md. Tamjid Rayhan
Department of Electrical and Electronic Engineering
University of Chittagong
Session: 2012-2013
1
Outline of this seminar
• How did I start my journey in machine learning, and what plan do I suggest for newcomers?
• How did I approach a biomedical engineering problem using machine learning?
• Question and Answer session.
2
Machine learning: Beginner to Professional
• Which tutorials, programming languages, books, and blogs suit beginners best?
• Do I need GPUs, TPUs, or cloud computing resources? If so, can I get them for free?
• Where do I get my data for free?
• How do I know whether I have become a Pro in Machine Learning?
3
Where to Start?
• C / C++ / Java / MATLAB / R / Python?
• There’s a GitHub repo that can help you decide:
• https://github.com/josephmisiti/awesome-machine-learning
4
Advantages of Python
• Easy to learn
• Easy to find learning materials
• Easy to write code in
• An enormous and very supportive community
• There is a Python library or framework for almost any ML task
• Python can also be used for web and GUI development
5
How to start learning Python?
https://python.maateen.me/
6
Why not the official documentation?
https://docs.python.org/3/
7
What about video tutorials?
https://www.py4e.com/
8
Which IDE should you use?
• Here’s a link to the available options. Find one that suits you best:
• https://wiki.python.org/moin/PythonEditors
• I use IPython/Jupyter notebooks for data analysis and model development.
• I currently use VS Code for my Python development.
• I started coding in IDLE, which comes built in with the Python installation.
9
Data structures and algorithms
• Grokking Algorithms
• Introduction to Algorithms
10
Machine learning is all about MATH!!!
• This is the truth, whether you are happy about it or sad about it!
• Do I need to be a math genius to start learning machine learning? Well, the answer is no!
• The prerequisite math for learning machine learning is very simple: matrix multiplication, vector dot product, differentiation, mean, variance, and histograms.
• You need to love math though, because everything you learn about will involve math!
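
The prerequisites listed above are small enough to try out in a few lines. This is a minimal sketch in plain Python, assuming made-up heart-rate numbers and hypothetical helper names (`dot`, `matmul`, `derivative`, `histogram`); in real projects a numerical library would normally do this work.

```python
import statistics
from collections import Counter


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matmul(A, B):
    """Matrix product of A (n x m) and B (m x p), both given as lists of rows."""
    columns_of_B = list(zip(*B))
    return [[dot(row, col) for col in columns_of_B] for row in A]


def derivative(f, x, h=1e-6):
    """Central-difference estimate of the derivative f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of a bin."""
    return dict(Counter((v // bin_width) * bin_width for v in values))


# Made-up heart-rate samples, for illustration only.
heart_rates = [120, 125, 131, 138, 142, 144, 150]

print(dot([1, 2, 3], [4, 5, 6]))                      # 32
print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))     # [[19, 22], [43, 50]]
print(statistics.mean(heart_rates))                   # mean
print(statistics.pvariance(heart_rates))              # population variance
print(round(derivative(lambda x: x ** 2, 3.0), 4))    # 6.0
print(histogram(heart_rates, 10))                     # {120: 2, 130: 2, 140: 2, 150: 1}
```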
11
Get your hands dirty ASAP with Kaggle!
12
Essential Kaggle Micro-courses!
1. Python
2. Pandas
3. Intro to Machine Learning
4. Intermediate Machine Learning
5. Deep Learning
6. Feature engineering
7. Data visualization
8. Intro to SQL
13
Outcome of Kaggle micro-courses
Python library introduction | Machine learning introduction
Pandas | Your first ML project
Scikit-learn | Your first ML competition
TensorFlow | Your first DL project
Matplotlib and Seaborn | Data collection using SQL
14
The signature Machine Learning Course
https://www.coursera.org/learn/machine-learning
15
Outcome of “Machine Learning” course
Statistics | Machine learning algorithms
Linear and logistic regression | Neural networks implementation
Concept of cost function and gradient descent | SVM implementation
Concept of bias and variance | K-Means clustering implementation
Performance metrics of an ML algorithm | Recommender system implementation
16
Do I need a GPU or TPU?
• No. Instead, use the free online runtimes provided by Kaggle and Google Colab.
• In Google Colab you get a runtime with 12 GB of RAM and access to a GPU and TPU. You can also mount your Google Drive as permanent storage.
• In Kaggle you can upload your own dataset and analyze it using their free runtime.
17
Where to get the data from?
• Kaggle competitions and datasets
• UCI Machine Learning Repository
• PhysioNet (physionet.org)
• Google Dataset Search
• Web scraping
• Retrieving data from relational databases using SQL
• Collecting and Building your own dataset
• Collecting data from existing products and services
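
As one illustration of the SQL route in the list above, here is a minimal sketch using Python's built-in `sqlite3` module. The `exams` table, its columns, and the rows are hypothetical and exist only to make the example self-contained; a real project would connect to an existing database instead.

```python
import sqlite3

# In-memory database holding a hypothetical table of exam records.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exams (id INTEGER PRIMARY KEY, baseline_bpm REAL, label TEXT)"
)
conn.executemany(
    "INSERT INTO exams (baseline_bpm, label) VALUES (?, ?)",
    [(132.0, "normal"), (148.0, "normal"), (171.0, "pathologic")],
)

# Parameterized query: fetch only the rows with a given label.
rows = conn.execute(
    "SELECT baseline_bpm, label FROM exams WHERE label = ? ORDER BY baseline_bpm",
    ("normal",),
).fetchall()
conn.close()

print(rows)  # [(132.0, 'normal'), (148.0, 'normal')]
```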
18
What are the best books to learn machine learning?
• Hands-On Machine Learning with Scikit-Learn and TensorFlow, by Aurélien Géron
• Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
• https://www.deeplearningbook.org/
19
Dive into deep learning!
https://www.coursera.org/specializations/deep-learning
20
Miscellaneous Resources
• Google crash course on machine learning
• https://developers.google.com/machine-learning/crash-course/
• A blog for computer vision enthusiasts
• https://www.pyimagesearch.com/
• Subscribe to the deeplearning.ai newsletter
• https://www.deeplearning.ai/thebatch/
• Having difficulty setting up your GPU?
• https://course.fast.ai/#using-a-gpu
21
End-to-end learning vs. multi-stage learning
22
Small data vs Big data
https://www.industryweek.com/technology-and-iiot/digital-tools/article/21122846/making-ai-work-with-small-data
• Unlike consumer Internet companies, which have data from billions of
users to train powerful AI models, collecting massive training sets in
manufacturing is often not feasible.
• In a recent MAPI survey, 58% of research respondents reported that the
most significant barrier to deployment of AI solutions pertained to a lack of
data resources.
• Synthetic data generation, transfer learning, self-supervised learning, few-shot learning, one-shot learning, anomaly detection, human-in-the-loop
23
Solving a problem using Machine Learning
• Why did I choose to work on automating the fetus health monitor?
• Why is machine learning needed to solve this problem?
• What were my results? Did I answer my research questions?
• Is my work ready to be implemented in a hospital yet? If not, why not?
24
Problem: Stillbirth
• A global problem
• Application of technology is inadequate
• SDG of attaining zero stillbirths by 2030
• 28 out of 1000 births are stillbirths in Bangladesh
• 98% of these deaths occur in developing countries
• 2.1 million stillbirths in 2015
25
Solution
• Automate
• Effective decisions in emergencies
• Save time
• Enable tele-monitoring
26
Cardiotocography features
Feature | Definition | Ideal value
Baseline | Heart rate at rest | 110-160 beats per minute
Variability | Fluctuations from the baseline | > 6 beats per minute
Acceleration | Abrupt increase from baseline of 15 beats/min that lasts for 15 seconds | Must be present in a healthy fetus
Deceleration | Decrease from baseline of 15 beats/min that lasts for 15 seconds | Non-reassuring; should not be present in a healthy fetus
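
The ideal values in the table above can be read as four simple checks. The sketch below encodes only those four rules; the `CtgSummary` class and `non_ideal_findings` function are hypothetical names, the input numbers are made up, and this is neither a clinical tool nor the machine learning model discussed in this talk.

```python
from dataclasses import dataclass


@dataclass
class CtgSummary:
    """Hypothetical summary of one cardiotocography recording."""
    baseline_bpm: float      # heart rate at rest
    variability_bpm: float   # fluctuation around the baseline
    accelerations: int       # number of accelerations observed
    decelerations: int       # number of decelerations observed


def non_ideal_findings(s):
    """Return the rules from the table that the recording does not meet."""
    findings = []
    if not 110 <= s.baseline_bpm <= 160:
        findings.append("baseline outside 110-160 bpm")
    if s.variability_bpm <= 6:
        findings.append("variability not above 6 bpm")
    if s.accelerations == 0:
        findings.append("no accelerations")
    if s.decelerations > 0:
        findings.append("decelerations present")
    return findings


print(non_ideal_findings(CtgSummary(140, 10, 2, 0)))  # []
print(non_ideal_findings(CtgSummary(170, 4, 0, 1)))   # all four findings
```

Fixed thresholds like these only look at one feature at a time, which is part of why a learned model over many features is worth exploring.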
27
Cardiotocography data of a normal fetus
28
Cardiotocography data of a pathologic fetus
29
Similar Problems
• Detection of heart disease from ECG data
• Predicting epileptic seizures from EEG data
• Detecting pneumonia from chest X-ray images
30
Objectives
• Research question 1: Is it possible to predict fetus health automatically from cardiotocography data?
• Research question 2: If it is possible to predict fetus health automatically from cardiotocography data, how accurate is that model in predicting fetus health?
• Research question 3: If it is possible to predict fetus health automatically from cardiotocography data, which features of the cardiotocography data are most important in predicting fetus health?
31
Data preprocessing
1. Exploratory data analysis
2. Handling categorical variables
3. Missing value interpolation
4. Outlier detection and removal
5. Feature selection
6. Train/test splitting
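
Two of the preprocessing steps, missing value interpolation and train/test splitting, can be sketched in plain Python. This is an illustration under assumptions, not the code used in this work: the function names, the 20% test fraction, the seed, and the sample values are all hypothetical.

```python
import random


def interpolate_missing(values):
    """Fill None gaps by linear interpolation between known neighbours.

    Gaps at the start or end copy the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            step = (values[right] - values[left]) * (i - left) / (right - left)
            filled[i] = values[left] + step
    return filled


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


print(interpolate_missing([120, None, None, 126, None]))  # [120, 122.0, 124.0, 126, 126]
train, test = train_test_split(range(10))
print(len(train), len(test))  # 8 2
```

Splitting after shuffling keeps the test rows unseen during training, which is what makes the later accuracy figures meaningful.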
32
Model Development
1. Choose algorithm
2. Check algorithm applicability by drawing the learning curve
3. Tune hyperparameters from validation curves
4. Train model
5. Calculate performance metrics, draw confusion matrix and ROC
6. Compare the performance of all built models
33
Learning Curve for SVM
[Figure: learning curve for SVM with linear kernel; learning curve for SVM with Gaussian kernel]
34
Comparison between built models
Model description | Sensitivity (Pathologic) | Sensitivity (Suspected) | Precision (Normal) | Accuracy
Logistic regression with selected features | Good (0.946) | Bad (0.767) | Excellent (0.976) | Good (0.812)
Logistic regression with all features | Excellent (0.973) | Good (0.86) | Excellent (0.988) | Average (0.804)
Random forest with selected features | Good (0.892) | Good (0.837) | Excellent (0.974) | Excellent (0.953)
Random forest with all features | Good (0.892) | Good (0.884) | Excellent (0.98) | Excellent (0.953)
SVM with selected features | Good (0.919) | Good (0.837) | Excellent (0.992) | Good (0.841)
SVM with all features | Good (0.946) | Good (0.814) | Excellent (0.984) | Good (0.833)
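
For readers unfamiliar with the column headings, the sketch below shows how per-class sensitivity, per-class precision, and overall accuracy are computed from a three-class confusion matrix. The counts are made up for illustration and are not the confusion matrices behind the table above; the function names are hypothetical.

```python
def sensitivity(cm, labels, cls):
    """Share of actual `cls` cases that were predicted as `cls` (recall)."""
    i = labels.index(cls)
    return cm[i][i] / sum(cm[i])


def precision(cm, labels, cls):
    """Share of `cls` predictions that were actually `cls`."""
    j = labels.index(cls)
    return cm[j][j] / sum(row[j] for row in cm)


def accuracy(cm):
    """Share of all cases that were predicted correctly."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


labels = ["normal", "suspected", "pathologic"]
# Made-up counts: rows are actual classes, columns are predicted classes.
cm = [
    [90, 8, 2],
    [5, 40, 5],
    [1, 3, 36],
]

print(round(sensitivity(cm, labels, "pathologic"), 3))  # 0.9
print(round(sensitivity(cm, labels, "suspected"), 3))   # 0.8
print(round(precision(cm, labels, "normal"), 3))        # 0.938
print(round(accuracy(cm), 3))                           # 0.874
```

Sensitivity on the pathologic class matters most here, because a missed pathologic case is the costliest error.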
35
Answers to research questions
• Question 1: Is it possible to predict fetus health automatically from cardiotocography data?
• Answer: Yes, it is possible.
• Question 2: Then how accurate is that model in predicting fetus health?
• Answer: Based on the performance of the models, I selected two as the most implementable:
• Random forest with selected features has a high overall accuracy of 95.3%.
• Logistic regression with all features gives up some overall accuracy (80.4%) to obtain excellent sensitivity of 97% on the pathologic class.
36
Answers to research questions
• Question 3: Which features of cardiotocography data are most important in predicting fetus health?
• Answer: According to the random forest model, the five most important features and their importances are:
Feature | Importance
Percentage of time with abnormal short-term variability (ASTV) | 0.15
Percentage of time with abnormal long-term variability (ALTV) | 0.13
Histogram mean | 0.1
Mean value of short-term variability (MSTV) | 0.08
Acceleration (AC) | 0.07
37
How do you know you have become a professional in machine learning?
• Become a Kaggle master in competitions, notebooks, datasets, and discussions. 4x grandmaster Abhishek Thakur explains:
• https://www.youtube.com/watch?v=z15TKkAPNUM
• Publish the results of your research in an impactful journal.
• Get the machine learning engineer job you wanted.
• Get accepted into your favourite PhD program.
• Build a good freelancing portfolio in this field.
38
Yann LeCun’s Advice
• So my advice is, if you want to get into this, make yourself useful.
• So make a contribution to an open source project, for example.
• Or make an implementation of some standard algorithm that you can't find the code of online, but you'd like to make it available to other people.
• So take a paper that you think is important, and then re-implement the algorithm, and then put it in an open source package, or contribute to one of those open source packages.
• And if the stuff you write is interesting and useful, you'll get noticed.
• Maybe you'll get a nice job at a company you really wanted a job at, or maybe you'll get accepted in your favorite PhD program or things like this.
39
A business example! Digital Marketing!
• A company runs passenger ships from Tokyo to Hokkaido.
• The company also has a dataset of Internet usage data for its potential customers (potential tourists from Tokyo to Hokkaido).
• Can this company deliver custom advertisements based on each user’s behavior and characteristics, so that the user is most likely to buy a ticket to Hokkaido from this company?
40
K-Means clustering to group similar
customers
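
To make the idea concrete, here is a minimal K-Means sketch in plain Python. The two features per customer (for example, weekly hours on culture sites and on travel sites), the data points, and the function names are all hypothetical; this is not the pipeline my team used, only the bare algorithm: assign each point to its nearest centroid, then move each centroid to the mean of its points, and repeat.

```python
import random


def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, k, iterations=20, seed=0):
    """Return (centroids, labels) after a fixed number of K-Means iterations."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical customers: two clearly separated groups of interests.
customers = [(1, 1), (1, 2), (2, 1), (9, 9), (9, 10), (10, 9)]
centroids, labels = kmeans(customers, k=2)
print(centroids)
print(labels)
```

Each resulting cluster can then be inspected to see what its customers have in common, which is the question the next slide asks.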
41
What is this group of customers interested in?
42
My team’s solution for this cluster!
• There will be a fortune telling corner on the ship!
• There will be Kabuki shows on the journey!
43
Our Advertisement for this cluster!
Omikuji | Kabuki art performance
44
45
Thank you!
46
