© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Julien Simon
Principal Technical Evangelist, AI & Machine Learning, AWS
@julsimon
An Introduction
to Reinforcement Learning
Types of Machine Learning
Supervised learning
Run an algorithm on a labelled data set, i.e. a data set containing samples
and answers. Gradually, the model learns to predict the right answer.
Regression and classification are examples of supervised learning.
Unsupervised learning
Run an algorithm on an unlabelled data set, i.e. a data set containing
samples only. Here, the model progressively learns patterns in data and
organizes samples accordingly. Clustering and topic modeling are examples
of unsupervised learning.
[Figure: Types of Machine Learning. Chart of sophistication of ML models against amount of training data required, with supervised learning, unsupervised learning, and reinforcement learning (RL) placed on it.]
Remember when you first learned this?
Or this?
We didn’t have an extensive labelled data
set back then
And yet we learned
How?
Defining Reinforcement Learning
An algorithm (aka an agent) interacts with its
environment.
The agent receives a positive or negative reward
for actions that it takes: rewards are computed by
a user-defined function which outputs a numeric
representation of the actions that should be
incentivized.
By trying to maximize the accumulation of
rewards, the agent learns an optimal strategy (aka
policy) for decision making.
Source: Wikipedia
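The definition above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-dimensional corridor, the `reward_function` values, and the two hand-written policies are hypothetical and do not come from the slides. It only shows how a user-defined reward function turns actions into numbers, and how the accumulated reward separates a good policy from a poor one.

```python
import random

# Hypothetical toy environment: the agent walks along positions 0..4
# and should reach position 4. All names and values are illustrative.
GOAL = 4
ACTIONS = (-1, +1)  # step left, step right


def reward_function(position, next_position):
    """User-defined reward: a numeric score for the behaviour we want."""
    if next_position == GOAL:
        return 10.0   # reaching the goal is strongly rewarded
    if next_position > position:
        return 1.0    # moving toward the goal is mildly rewarded
    return -1.0       # moving away, or bumping into the left wall, is penalized


def run_episode(policy, max_steps=20):
    """Let the agent interact with the environment and sum up its rewards."""
    position, total_reward = 0, 0.0
    for _ in range(max_steps):
        action = policy(position)
        next_position = min(max(position + action, 0), GOAL)
        total_reward += reward_function(position, next_position)
        position = next_position
        if position == GOAL:
            break
    return total_reward


def random_policy(position):
    """Ignore the state and pick any action."""
    return random.choice(ACTIONS)


def always_right_policy(position):
    """Always step toward the goal."""
    return +1


if __name__ == "__main__":
    print("random policy:", run_episode(random_policy))
    print("always right: ", run_episode(always_right_policy))
```

An RL algorithm automates the search for the better policy instead of having it written by hand; the sketch only makes the notion of accumulated reward concrete.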
Use cases
• Large complex problems
• Uncertain, dynamic environments
• Continuous learning
• Supply chain management
• HVAC systems
• Industrial robotics
• Autonomous vehicles
• Portfolio management
• Oil exploration
• etc.
Caterpillar: 250-ton autonomous mining trucks
https://diginomica.com/2017/04/17/sending-disruption-mines/
https://www.cat.com/en_US/articles/customer-stories/built-for-it/thefutureisnow-driverless.html
Example: navigating a maze
• Imagine an agent learning to navigate a maze. It can move in certain directions but is
blocked from going through walls.
• The agent discovers its environment (the current maze) one step at a time, receiving a
reward each time: stepping into a dead end is a negative reward, moving one step closer
to the exit is a positive reward.
• After a certain number of steps (or once the agent finds the exit), the current episode ends.
• After a certain number of episodes, the agent uses the action/reward data points to
train a model, in order to make better decisions next time around.
• One critical thing to understand is that the RL model isn’t trained on a predefined set of
labelled mazes (that would be supervised learning).
• This cycle of exploring and training is central to RL: given enough mazes and enough
training time, the agent would eventually learn how to navigate any maze.
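The explore-and-train cycle described above can be sketched with tabular Q-learning. This is one simple RL algorithm swapped in for illustration; the slides do not name a specific one. The 4x4 maze layout, the reward values, and the hyperparameters are all assumptions, and the sketch updates its estimates after every step rather than after a batch of episodes.

```python
import random

# Hypothetical 4x4 maze: '#' is a wall, 'S' the start, 'E' the exit.
MAZE = ["S.#.",
        ".##.",
        "...#",
        "#..E"]
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply one move and return (next_state, reward, done)."""
    row, col = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    blocked = not (0 <= row < 4 and 0 <= col < 4) or MAZE[row][col] == "#"
    if blocked:
        return state, -1.0, False       # bumping into a wall is penalized
    if MAZE[row][col] == "E":
        return (row, col), 10.0, True   # reaching the exit ends the episode
    return (row, col), -0.1, False      # small cost per step favors short paths


def train(episodes=500, max_steps=50, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values from experience only; no labelled mazes are used."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to the estimated long-term reward
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(4)                                   # explore
            else:
                action = max(range(4), key=lambda a: q.get((state, a), 0.0))  # exploit
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in range(4))
            old = q.get((state, action), 0.0)
            # Move the estimate toward reward + discounted best future value
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the best known action from the start and record visited cells."""
    state, path = (0, 0), [(0, 0)]
    for _ in range(max_steps):
        action = max(range(4), key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

The table `q` plays the role of the model here. A deep RL agent would replace it with a neural network, which is what lets the learned behaviour carry over to mazes that were not seen during training.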
Environment
• The space in which the RL model operates.
• This can be either a real-world environment
or a simulator.
• If you train a physical autonomous vehicle
on a physical road, that would be a
real-world environment.
• If you train a computer program that
models an autonomous vehicle driving on a
road, that would be a simulator… probably
a much safer option!
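One way to picture the distinction is a shared interface that both a real vehicle and a simulator could implement, so the training loop does not need to know which one it is talking to. The sketch below is an assumption throughout: the `Environment` base class, the `LaneSimulator` toy physics, and the reward values are hypothetical and only loosely follow the reset/step convention popularized by OpenAI Gym.

```python
from abc import ABC, abstractmethod


class Environment(ABC):
    """Hypothetical minimal interface shared by real and simulated environments."""

    @abstractmethod
    def reset(self):
        """Start a new episode and return the initial state."""

    @abstractmethod
    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""


class LaneSimulator(Environment):
    """Toy simulator: a vehicle moving sideways on a road of half-width 1.0."""

    def reset(self):
        self.offset = 0.5   # lateral distance from the lane center
        self.steps = 0
        return self.offset

    def step(self, action):
        # The action is a steering correction, clipped to [-0.2, 0.2] per step
        self.offset += max(-0.2, min(0.2, action))
        self.steps += 1
        off_road = abs(self.offset) > 1.0
        reward = -10.0 if off_road else 1.0 - abs(self.offset)
        done = off_road or self.steps >= 10
        return self.offset, reward, done


def drive(env, controller):
    """Run one episode; works with any Environment implementation."""
    state, total, done = env.reset(), 0.0, False
    while not done:
        state, reward, done = env.step(controller(state))
        total += reward
    return total


if __name__ == "__main__":
    print("steer toward center:", drive(LaneSimulator(), lambda offset: -offset))
    print("steer off the road: ", drive(LaneSimulator(), lambda offset: 0.2))
```

Crashing `LaneSimulator` costs nothing, which is the practical argument for training in simulation before moving to a physical vehicle.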
Exploitation vs Exploration
• Selecting the next action is a balance
between exploitation (‘using what you’ve
learned’) and exploration (‘taking a chance
to learn new things’)
• If you favor exploitation, you may never
reach high-value rewards.
• If you favor exploration, you’ll probably run
into trouble very often!
• Initially, the agent will explore at random
for a fixed number of episodes (aka heatup
phase): this generates data for the first
round of training.
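A common way to implement this balance is an epsilon-greedy rule preceded by a heatup phase. The sketch below is a minimal illustration under assumptions: the function name `select_action`, the default of 10 heatup episodes, and the exploration rate of 0.1 are hypothetical choices, not values from the slides.

```python
import random


def select_action(q_values, episode, heatup_episodes=10, epsilon=0.1, rng=random):
    """Pick an action index from a list of estimated action values.

    During the heatup phase every action is random. Afterwards the agent
    explores with probability epsilon and exploits otherwise.
    """
    if episode < heatup_episodes or rng.random() < epsilon:
        return rng.randrange(len(q_values))                       # exploration
    return max(range(len(q_values)), key=q_values.__getitem__)    # exploitation


if __name__ == "__main__":
    estimates = [0.1, 0.9, 0.3]   # hypothetical value estimates for 3 actions
    rng = random.Random(0)
    heatup = [select_action(estimates, episode=0, rng=rng) for _ in range(1000)]
    later = [select_action(estimates, episode=50, rng=rng) for _ in range(1000)]
    print("share of best action during heatup:", heatup.count(1) / 1000)
    print("share of best action afterwards:   ", later.count(1) / 1000)
```

Raising `epsilon` shifts the agent toward exploration, lowering it shifts toward exploitation. Many implementations decay it over time so that the agent explores widely at first and settles later.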
Training an RL model
1. Formulate the problem: goal, environment, state, actions, reward
2. Define the environment: real-world or simulator?
3. Define the presets
4. Write the training code and the value function
5. Train the model
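The first three steps amount to writing down a handful of decisions before any training code exists. The sketch below records them for the maze example in a plain data structure. It is purely illustrative: the `ProblemFormulation` class, its field names, and the preset values are hypothetical, and "presets" is assumed here to mean a bundle of training hyperparameters.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ProblemFormulation:
    """Step 1: what has to be decided before writing training code."""
    goal: str
    environment: str                      # step 2: "real-world" or "simulator"
    state: list                           # what the agent observes
    actions: list                         # what the agent can do
    reward: Callable[[bool], float]       # how behaviour is scored
    presets: dict = field(default_factory=dict)   # step 3: hyperparameters


# Hypothetical formulation of the maze example
maze = ProblemFormulation(
    goal="reach the exit in as few steps as possible",
    environment="simulator",
    state=["row", "column"],
    actions=["up", "down", "left", "right"],
    reward=lambda reached_exit: 10.0 if reached_exit else -0.1,
    presets={"heatup_episodes": 10, "episodes": 500, "learning_rate": 0.5},
)

if __name__ == "__main__":
    print(maze.goal)
    print("reward for reaching the exit:", maze.reward(True))
```

Steps 4 and 5 then consume such a description: the training code reads the actions, reward, and presets, and runs episodes against the chosen environment.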
Amazon SageMaker RL
Reinforcement learning for every developer and data scientist
Key features
• Fully managed
• Example notebooks and tutorials
• Broad support for frameworks: TensorFlow, Apache MXNet, Intel Coach, and Ray RL support
• Broad support for simulation environments including Simulink and MATLAB: 2D & 3D physics environments and OpenAI Gym support
• Supports Amazon Sumerian and Amazon RoboMaker
How can we get developers rolling
with reinforcement learning?
Introducing AWS DeepRacer
Fully autonomous 1/18th scale race car, driven by reinforcement learning
https://youtu.be/X-6v4RZy-TE
• HD video camera
• Dual-core Intel processor
• Four-wheel drive
• Dual power for compute and drive
• Accelerometer
• Gyroscope
AWS DeepRacer
AWS DeepRacer League
Competitive racing league for AWS DeepRacer
• Compete virtually online
• Train models with RL
• Race in trials
• Final at AWS re:Invent
Getting started
http://aws.amazon.com/free
https://ml.aws
https://aws.amazon.com/sagemaker
https://aws.amazon.com/deepracer/
https://github.com/aws/sagemaker-python-sdk
https://github.com/awslabs/amazon-sagemaker-examples
https://medium.com/@julsimon
https://gitlab.com/juliensimon/dlnotebooks
Thank you!
Julien Simon
Principal Technical Evangelist, AI & Machine Learning, AWS
@julsimon

An Introduction to Reinforcement Learning (December 2018)


Editor's Notes

  • #4: 1/The type of datasets Ground Truth typically helps create can be used to create extremely sophisticated models using a method called ‘supervised’ learning; this is common with computer vision, speech, and language. 2/It’s how we train Rekognition - our computer vision service is trained on tens of millions of labeled images, Polly’s lifelike voices come from hundreds of hours of scripted voice recordings, and so forth. 3/The sheer volume of the data, combined with deep learning neural networks, allows us to train models with human-like capabilities based on that data. 4/At the other end of this spectrum is ‘unsupervised’ learning, where algorithms don’t need large volumes of labeled data. 5/These approaches are commonly used for use cases such as anomaly detection; where the algorithm is only looking for statistical outliers in, say, a stream of data from an IoT temperature sensor. When it detects that the temperature is changing in a meaningful way, the model can send a signal and take action (open a window, for example). 6/These models are no less useful - in fact they are complementary to supervised methods - but they don’t attempt to mimic human level intelligence in the same way.
  • #5: 1/ In the bottom right, we have a no man’s land where for the obvious reasons of not wanting to invest a lot for little gain, there’s no meaningful research happening. 2/ But, there’s fertile ground in the upper left
  • #21: 1/ There are a lot of demands placed on organizations when dealing with documents. What they typically want to be able to do sounds straightforward… 2/ They want to be able to identify documents in any format; 3/ and then extract text from those documents, accurately. 4/ But there are a whole ton of challenges which make this difficult, such as the variety of forms and formats, and the quality. 5/ The way customers try to overcome this complexity today is either by manual review (which is accurate, but time-consuming and expensive), or 6/ with simple OCR and/or 7/ template-based data extraction (which is fast, but tends not to be accurate enough, so they end up sending the documents to manual review or verification anyway). TRANSITION: we think there is a better way, and that instead of manual reviews, simplistic OCR, and templates, we can replace that heavy lifting with smart, cheap, powerful machine learning…
  • #23: 1/ DeepRacer is a physical device, about the size of a shoe box, which is packed full of everything you need to learn about reinforcement learning through autonomous driving. 2/ It has an HD video camera mounted high up, so it can get a good view of the road ahead. 3/ To make it work, you access a fully configured 3D physics simulator available in the cloud, with a track and a virtual car ready to start training. 4/ All you need to do is provide a simple - or complex - scoring function, using simple Python code, and with a single click, we’ll train the model in the simulator using reinforcement learning in SageMaker - you can watch in real time if you wish to see how the learning is going. 5/ Then just take your model, load it onto DeepRacer, and watch it go… We think this is a really interesting and fun way to get started with reinforcement learning, and as we started to experiment with this internally, a funny thing happened… The teams started racing against each other, continually tweaking and adjusting their reward functions for speed around a virtual track. Factions sprang up, it got pretty competitive, and developers’ knowledge and experience with RL grew almost exponentially… In fact, we had so much fun that we wanted to bring this to our customers, and so today, I’m also announcing…
  • #24: Here’s how the league will work… 1/ Anyone can build an RL model in SageMaker (or develop one on their own and bring it to SageMaker) 2/ At our 20 or so AWS Summits in 2019 we’ll hold a DeepRacer League Race; you can compete in as many of these as you like. 3/ The winner of each DRL Race and the top 10 points getters qualify for the DRL Championship Cup held at re:Invent 2019 here in Vegas. 4/ We’ll also have virtual events and tournaments throughout the year, likely about 20, where we will take the winners and top 10 points getters to the Championship Cup at re:Invent. 5/ While there will be individual prizes for each race, the big prize is the Championship Cup at re:Invent 6/ This year, for 2018, because we don’t have as much lead time, we’re doing an accelerated version for our first Championship Cup.
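
Note #23 mentions providing a scoring function in simple Python code. A minimal sketch of what such a function might look like is shown below. It assumes that the simulator calls a function named `reward_function` with a dictionary of measurements, and that the keys `all_wheels_on_track`, `distance_from_center`, and `track_width` are available. These names should be checked against the current AWS DeepRacer documentation, and the scoring logic itself is only an illustration.

```python
def reward_function(params):
    """Reward staying on the track and close to the center line.

    The dictionary keys used below are assumptions; verify them against
    the DeepRacer documentation before relying on this sketch.
    """
    if not params["all_wheels_on_track"]:
        return 1e-3                       # near-zero reward when off the track

    # Distance from the center line as a fraction of half the track width
    half_width = 0.5 * params["track_width"]
    centering = 1.0 - min(params["distance_from_center"] / half_width, 1.0)

    # Keep the reward strictly positive so every on-track step counts a little
    return float(max(centering, 1e-3))


if __name__ == "__main__":
    sample = {"all_wheels_on_track": True,
              "distance_from_center": 0.25,
              "track_width": 1.0}
    print(reward_function(sample))
```

Tweaking a function like this, for example by also rewarding speed, is the kind of adjustment the racing teams described in the note would have been making.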