AWS & GOOGLE MACHINE LEARNING SERVICES
11.10.2017
Max Pagels, Machine Learning Specialist
@maxpagels, linkedin.com/in/maxpagels/
ABOUT ME
• BSc, MSc in Computer Science (University of Helsinki)
• Currently doing applied machine learning at SC5, a
consultancy based in Helsinki
• Former CS researcher
• Current favourite ML algorithm: LSTM neural networks
• Favourite programming languages: JavaScript for full-stack,
Python for ML
• Other hats I wear: full-stack developer, technical interviewer
AGENDA
Part 1: Overview of AWS & Google machine learning services
〜 〜 〜
Part 2: Classification demo using Google ML Engine and AWS ML
〜 〜 〜
Part 3 (time permitting): Free-form Q&A
IF YOU HAVE ANY QUESTIONS, AT ANY TIME, DO ASK!
MACHINE LEARNING
MACHINE LEARNING LEARNS FROM
DATA TO SOLVE COMPLEX TASKS
In contrast to traditional programming, machine learning
learns from data and automatically produces a program to
solve a task.
That makes it suitable for tasks that are impossible or
infeasible to code by hand, as well as automation of things
that are tedious but (until now) required human
supervision. ML is the foundation of modern artificial
intelligence.
TRADITIONAL PROGRAMMING
Input → Algorithm → Output
MACHINE LEARNING
Input & output → Learning algorithm → Program
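A minimal sketch of the contrast, assuming a hypothetical one-feature task (deciding whether a flower counts as "large" from its petal length). The data, the function names and the midpoint rule are illustrative assumptions, not something shown in the talk.

```python
# Traditional programming: a human writes the rule (the "algorithm").
def is_large_handwritten(petal_length_cm):
    return petal_length_cm > 3.0  # threshold picked by the programmer


# Machine learning: a learning algorithm derives the rule from
# example inputs paired with their known outputs.
def learn_threshold(examples):
    """Return a function (the learned "program") from (input, label) pairs."""
    small = [x for x, label in examples if not label]
    large = [x for x, label in examples if label]
    # Place the decision boundary halfway between the two groups.
    threshold = (max(small) + min(large)) / 2
    return lambda x: x > threshold


# Hypothetical labelled data: (petal length in cm, is it "large"?)
training_data = [(1.4, False), (1.7, False), (4.5, True), (5.1, True)]
is_large_learned = learn_threshold(training_data)

print(is_large_handwritten(4.0))  # True
print(is_large_learned(4.0))      # True
```

The hand-written version needs a person to pick the threshold; the learned version only needs labelled examples.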
THERE ARE 3 MAIN TYPES OF MACHINE
LEARNING
SUPERVISED LEARNING
Learn from labelled data to build
a model for predicting a number
(regression) or a discrete class
(classification) on new data
UNSUPERVISED LEARNING
Find structure in unlabelled data
to provide insights
REINFORCEMENT LEARNING
Take actions in the world, receive
rewards, and learn to maximise
reward over time
MACHINE LEARNING IS RESOURCE-
INTENSIVE
ML models learn from data using numerical
optimisation and linear algebra. It’s not uncommon
for each pass over training data to require
thousands or even millions of mathematical
operations. Inference (predicting/classifying new
examples) can also be expensive for large models.
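A rough back-of-the-envelope sketch of where those operation counts come from, assuming a simple linear model that computes one weighted sum per class for every example. The function name and the larger problem size are hypothetical, and the count ignores the extra work needed to compute gradients.

```python
# Rough count of multiply-add operations for one pass over the training
# data with a linear model. All figures are illustrative assumptions.
def multiply_adds_per_pass(n_examples, n_features, n_classes):
    # One weighted sum per class per example, each over all features.
    return n_examples * n_features * n_classes


# A small problem the size of the Iris dataset used later in the talk:
# 150 examples, 4 features, 3 classes.
print(multiply_adds_per_pass(150, 4, 3))             # 1800

# A hypothetical larger problem: 1M examples, 1,000 features, 10 classes.
print(multiply_adds_per_pass(1_000_000, 1_000, 10))  # 10000000000
```

Training usually needs many passes, so the total cost is this figure multiplied by the number of passes.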
THE CLOUD OFFERS CPU/GPU COMPUTE
ON-DEMAND
In addition to infrastructure-as-a-service, the big
cloud vendors (Azure, Google, AWS) also offer a
number of managed and hybrid ML & AI solutions.
LET’S TAKE A CLOSER LOOK AT THE
SERVICES PROVIDED BY GOOGLE AND
AMAZON (AWS)
GOOGLE CLOUD MACHINE LEARNING
From their website: “Google Cloud's AI provides modern machine learning
services, with pre-trained models and a service to generate your own tailored
models. Our neural net-based ML service has better training performance and
increased accuracy compared to other large scale deep learning systems. Our
services are fast, scalable and easy to use. Major Google applications use Cloud
machine learning, including Photos (image search), the Google app (voice search),
Translate, and Inbox (Smart Reply). Our platform is now available as a cloud
service to bring unmatched scale and speed to your business applications.”
BREAKDOWN OF GOOGLE AI SERVICES
FULLY MANAGED APIS
• Cloud Jobs: machine learning-powered
job search engine
• Cloud Video Intelligence: extract
metadata, identify key nouns, and
automatically annotate the content of
videos using a REST API
• Cloud Vision: image classification, object
recognition and OCR as-a-service
• Cloud Speech: audio-to-text
• Natural Language: extract information
about people, places, events etc.
Sentiment analysis supported
• Cloud Translation: language translation à
la Google Translate
HYBRID/BYO
• Machine Learning Engine: general-
purpose machine learning training and
inference engine
• Implement your own learning
algorithms in TensorFlow, provide
training data, train in the cloud without
worrying about servers
• Deploy trained models in the cloud for
a scalable serverless prediction API
• Can also do training and/or inference
on your local machine
AWS MACHINE LEARNING SERVICES
From their website: “Within AWS, we’re focused on bringing that knowledge and
capability to you through three layers of the AI stack: Frameworks and
Infrastructure with tools like Apache MXNet and TensorFlow, API-driven
Services to quickly add intelligence to applications, and Machine Learning
Platforms for data scientists.”
BREAKDOWN OF AWS AI SERVICES
FULLY MANAGED APIS
• Amazon Lex: natural language
understanding and speech recognition,
powered by the same AIs used in Alexa
• Amazon Polly: text-to-speech as-a-
service
• Amazon Rekognition: ready-made image
recognition, object recognition, and OCR
FULLY MANAGED SERVICES
• Amazon Machine Learning: linear and
logistic regression as-a-service
• Provide data, choose learning
algorithm, train in the cloud
• Deploy a trained model as a
prediction API in the cloud
BYO/PLATFORM SERVICES
• Amazon EMR: Managed Hadoop/Spark
environment, implement your
algorithms/training/inference yourself
• Amazon Deep Learning AMIs: spin up
instances on EC2 preinstalled with
TensorFlow, MXNet, Theano, Caffe, CNTK,
Torch etc. and handle the rest yourself
• Amazon EC2: spin up instances with the
CPU/GPU power you require and install
whatever you like
A NOTE ON PRICING
Cloud services are typically pay-as-you-go/pay for what you use. For
machine learning, that usually means you pay for time/resources needed to
train, and time needed to do predictions. Unless you use a service that
explicitly spins up hardware and keeps it running, you typically don’t pay
anything if you aren’t doing training/inference.
PRICING EXAMPLE (CLOUD INFRASTRUCTURE)
Example: on AWS EC2, a p2.8xlarge instance has:
• 32 vCPUs
• 488 GiB RAM
• 8 NVIDIA K80 GPUs, each with 2,496 parallel
processing cores and 12 GiB of GPU memory
PRICING EXAMPLE (CLOUD INFRASTRUCTURE)
Cost of buying one K80 yourself: 5,000 €
Cost of buying the equivalent hardware yourself: 50,000 €
Cost of running the instance in AWS: about 8 € per hour
50,000 € equals 260 consecutive days of p2.8xlarge use
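The break-even figure can be checked with a few lines of arithmetic. This is a sketch using only the slide's approximate numbers; the variable names are illustrative, and it ignores electricity, hosting and resale value.

```python
# Figures from the slide; the hourly rate is approximate.
hardware_cost_eur = 50_000
instance_cost_eur_per_hour = 8

# Hours of cloud use that cost as much as buying the hardware.
break_even_hours = hardware_cost_eur / instance_cost_eur_per_hour
break_even_days = break_even_hours / 24

print(break_even_hours)        # 6250.0
print(round(break_even_days))  # 260
```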
QUESTIONS SO FAR?
MULTI-CLASS CLASSIFICATION USING GOOGLE ML
ENGINE & AWS MACHINE LEARNING
THE IRIS DATASET
There are lots of freely available ML datasets online (for
example on Kaggle). One of them is the legendary Iris flower
dataset.
It includes three iris species with 50 samples each as well as
some properties (features) about each flower: sepal length,
sepal width, petal length & petal width (in cm).
With Google ML Engine and AWS Machine Learning, we are
going to train an ML model on the iris data to build a
classifier that can correctly classify new examples as one of
three iris species (classes).
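To make the shape of the data concrete, here is a small sketch that parses three sample rows into feature vectors and labels. The column order and the label strings are assumptions based on the commonly distributed CSV version of the dataset; the file used in the demos may be laid out differently. The function name is hypothetical.

```python
import csv
import io

# Three sample rows, one per species. Assumed column order:
# sepal length, sepal width, petal length, petal width (cm), species.
SAMPLE_CSV = """\
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""


def parse_iris(text):
    """Split CSV text into a list of feature vectors and a list of labels."""
    features, labels = [], []
    for row in csv.reader(io.StringIO(text)):
        features.append([float(value) for value in row[:4]])
        labels.append(row[4])
    return features, labels


X, y = parse_iris(SAMPLE_CSV)
print(X[0], y[0])  # [5.1, 3.5, 1.4, 0.2] Iris-setosa
```

Each row becomes four numbers (the features) plus one class label, which is the input format a classifier learns from.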
LEARNING ALGORITHM: LOGISTIC
REGRESSION
Logistic regression is a simple learning algorithm. It’s similar
to linear regression, but meant for classification problems.
LOGISTIC REGRESSION OVERVIEW
1. Compute a weighted sum of the features (a linear model):
L = w₁ * sepal_width + w₂ * sepal_length + w₃ * petal_width + w₄
* petal_length
2. Use a sigmoid function to convert the result to a probability of
belonging to a class:
H(L) = sigmoid(L) = 1 / (1 + e^(-L))
3. Build 1) & 2) for each possible class
4. Iterate over our dataset, compute H(L) for each example, and check
how far we were from the correct class (we have the correct answers in
our labelled dataset)
5. Adjust weights w₁, w₂, w₃ and w₄ so that, on average, we get things
less wrong next time (note: use partial derivatives)
6. Repeat 4)-5) to improve accuracy (i.e. classify as many
examples correctly as possible)
7. Stop iterating when accuracy is “good enough” or after some
predetermined number of iterations
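The steps above can be sketched in plain Python. This is a minimal illustration, not the implementation used by either cloud service: it assumes batch gradient descent on the log loss, adds a bias term that the slide omits, stops after a fixed number of iterations, and uses a hypothetical two-feature toy dataset instead of the iris data. All function names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_one_vs_rest(X, y, classes, learning_rate=0.1, iterations=5000):
    """Train one logistic-regression model per class (step 3)."""
    n_features = len(X[0])
    models = {}
    for cls in classes:
        w = [0.0] * n_features
        b = 0.0  # bias term (an addition, not on the slide)
        targets = [1.0 if label == cls else 0.0 for label in y]
        for _ in range(iterations):  # steps 6-7: fixed iteration budget
            grad_w = [0.0] * n_features
            grad_b = 0.0
            for features, target in zip(X, targets):  # step 4
                # Step 1: weighted sum of the features.
                L = sum(wi * xi for wi, xi in zip(w, features)) + b
                # Step 2 + step 4: probability minus the correct answer.
                error = sigmoid(L) - target
                for j, xj in enumerate(features):
                    grad_w[j] += error * xj  # partial derivative w.r.t. w_j
                grad_b += error
            # Step 5: move the weights against the average gradient.
            w = [wi - learning_rate * g / len(X) for wi, g in zip(w, grad_w)]
            b -= learning_rate * grad_b / len(X)
        models[cls] = (w, b)
    return models


def predict(models, features):
    """Return the class whose model assigns the highest probability."""
    def probability(cls):
        w, b = models[cls]
        return sigmoid(sum(wi * xi for wi, xi in zip(w, features)) + b)
    return max(models, key=probability)


# Hypothetical toy data: three well-separated clusters, two features each.
X = [[0.2, 0.1], [0.5, 0.4], [0.1, 0.6],
     [4.0, 0.3], [4.5, 0.1], [3.8, 0.6],
     [0.3, 4.1], [0.2, 4.6], [0.6, 3.9]]
y = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

models = train_one_vs_rest(X, y, classes=["a", "b", "c"])
predictions = [predict(models, features) for features in X]
accuracy = sum(p == label for p, label in zip(predictions, y)) / len(y)
print(accuracy)
```

Training one sigmoid model per class and picking the most confident one is often called one-vs-rest; libraries frequently use a softmax formulation instead.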
REMEMBER: G-I-G-O
GIGO stands for “Garbage in, garbage out”. Without clean,
high-quality source data, machine learning won’t work well.
Some estimates say that feature engineering & data cleaning
account for 80% of data scientists’ work.
WALKTHROUGH: AWS MACHINE LEARNING
WALKTHROUGH: GOOGLE CLOUD ML ENGINE
Feature | Google Cloud Machine Learning Engine | AWS Machine Learning
--- | --- | ---
Service type | Hybrid | Fully managed
Supported algorithms | Linear and non-linear learners (DNNs, linear & logistic regression, Bayesian learners etc.) | Only linear learners (linear & logistic regression)
Algorithm implementation | BYO: build using TensorFlow (low-level API or Estimators) or Keras (tf.contrib.keras) | Pre-defined (linear & logistic multi-class regression)
Accepted data sources | Google Cloud Storage, Bigtable & other Google Cloud Platform storage services | S3 (CSV-formatted data), Redshift
Built-in data transformation tools | Full control (TensorFlow functionality + packaging of Python modules as dependencies) | Limited, using “Recipes” (editable in the console UI)
Model training | In the cloud or locally | In the cloud
GPU support for training | Yes | No?
Hyperparameter tuning | Full control + automatic tuning | Limited manual tuning (regularisation, epochs)
Cross-validation | Yes, configurable | Yes (configurable train/test sets but no K-fold)
Model versioning | Explicit | Implicit
Underlying computation engine | TensorFlow | AWS EMR (Spark MLlib?)
Real-time predictions | Yes, using the Cloud ML Engine prediction API | Yes (built-in)
Batch predictions | Yes, using the Cloud ML Engine prediction API | Yes (built-in)
Monitoring | Yes (Training jobs console, TensorBoard) | Yes (CloudWatch and AWS ML UI)
WHAT WE SAW WAS TWO SIMPLE DEMOS…
BUT THE POSSIBILITIES ARE ENDLESS
THANK YOU!
QUESTIONS?
