Use of Formal Methods at Amazon Web Services
(Chris Newcombe, Tim Rath, Fan Zhang, Bogdan Munteanu, Marc Brooker, Michael Deardeuff)
ASAD RIAZ (021)
MALIK FARHAN (028)
HASSNAIN SHAH (086)
What is AWS?
o Cloud services
o Database storage
o Networking
o Pay-as-you-go pricing
AWS Services
o S3
o Launch a virtual machine
o Build a web app
o Machine learning (Rekognition)
o Databases (DynamoDB)
o Analytics
o AR & VR
AWS Business Growth & Cost-efficient Infrastructure
o S3 grew to store 1 trillion objects. Less than a year later it had
grown to 2 trillion objects, and was regularly handling 1.1 million
requests per second.
o Fault tolerance
o Replication
o Consistency
o Concurrency
o Load balancing
Complexity
High complexity increases the probability of human error in design,
code & operations.
What have we tried?
o Deep design reviews
o Standard verification techniques
o Code reviews
o Fault-injection testing
Subtle bugs and failures still slip through. The reason? Complexity.
Solution?
o TLA+ (Temporal Logic of Actions), a formal specification language.
o TLA+ is based on simple discrete math, i.e. basic set theory and predicates, with which all
engineers are familiar.
o A TLA+ specification describes the set of all possible legal behaviors of a system.
o TLA+ describes both the correctness properties (the ‘what’) and the design of the system (the ‘how’).
o Designs are checked with conventional mathematical reasoning and the TLC model checker.
What is TLC?
A tool that takes a TLA+ specification and exhaustively checks the desired correctness properties.
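To make "exhaustively checks" concrete, here is a minimal Python sketch of the general idea behind explicit-state model checking: visit every reachable state and test an invariant in each one. This is not TLC and not TLA+; the function names and the toy counter system are hypothetical and chosen only for illustration.

```python
from collections import deque


def check_invariant(initial_states, next_states, invariant):
    """Breadth-first search over every reachable state.

    Returns None if the invariant holds in all reachable states,
    otherwise the shortest trace (list of states) ending in a violation.
    """
    frontier = deque((s, (s,)) for s in initial_states)
    seen = set(initial_states)
    while frontier:
        state, trace = frontier.popleft()
        if not invariant(state):
            return list(trace)
        for nxt in next_states(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, trace + (nxt,)))
    return None


# Toy system (hypothetical): a counter that may step by 1 or 2, modulo 6.
def counter_next(n):
    return [(n + 1) % 6, (n + 2) % 6]


holds = check_invariant([0], counter_next, lambda n: n < 6)
violation = check_invariant([0], counter_next, lambda n: n != 5)
print(holds)       # None: no reachable state breaks the invariant
print(violation)   # a shortest trace from 0 to the violating state 5
```

Because the search is breadth-first, any counterexample it returns is as short as possible, which is what makes such traces practical to read and debug.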
TLA+ (Temporal Logic of Actions)
PlusCal (similar to a C-style programming language)
PlusCal is automatically translated to TLA+ with a single keystroke.
System | Components | Line count (excl. comments) | Benefit
S3 | Fault-tolerant low-level network algorithm | 804 PlusCal | Found 2 bugs. Found further bugs in proposed optimizations.
S3 | Background redistribution of data | 645 PlusCal | Found 1 bug, and found a bug in the first proposed fix.
DynamoDB | Replication & group-membership system | 939 TLA+ | Found 3 bugs, some requiring traces of 35 steps.
EBS | Volume management | 102 PlusCal | Found 3 bugs.
Internal distributed lock manager | Lock-free data structure | 223 PlusCal | Improved confidence. Failed to find a liveness bug as we did not check liveness.
Internal distributed lock manager | Fault-tolerant replication and reconfiguration algorithm | 318 TLA+ | Found 1 bug. Verified an aggressive optimization.
Starting Steps of Formal Specification
1. Safety properties: “what the system is allowed to do”
Example: at all times, all committed data is present and correct.
2. Liveness properties: “what the system must eventually do”
Example: whenever the system receives a request, it must
eventually respond to that request.
3. Next step: ask “what must go right?”
4. Confirming the design: the goal is to confirm that the design
correctly handles all of the dynamic events in the environment.
What to confirm?
o Network errors & repairs
o Disk errors
o Crashes & restarts
o Data center failures and repairs
o Actions by human operators
5. Use the model checker to verify that the specification of the system in
its environment implements the chosen correctness properties.
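The steps above can be sketched in Python: model the system's actions together with an environment event, then search every reachable state for a violation of a safety property. Everything in this sketch is a hypothetical illustration rather than an AWS design: two replicas, at most one disk error, and the safety property "once a write is acknowledged as committed, at least one intact copy exists".

```python
from collections import deque

REPLICAS = 2          # hypothetical cluster size
MAX_DISK_ERRORS = 1   # the environment may destroy at most this many copies


def next_states(state, quorum):
    """Yield every state reachable in one step (system or environment)."""
    copies, committed, errors_left = state
    for i, copy in enumerate(copies):
        if copy == "empty":                       # system action: write a copy
            yield (copies[:i] + ("data",) + copies[i + 1:], committed, errors_left)
        if copy == "data" and errors_left > 0:    # environment event: disk error
            yield (copies[:i] + ("lost",) + copies[i + 1:], committed, errors_left - 1)
    if not committed and copies.count("data") >= quorum:
        yield (copies, True, errors_left)         # system action: acknowledge commit


def safe(state):
    """Safety property: committed data is still present on some replica."""
    copies, committed, _ = state
    return (not committed) or "data" in copies


def find_violation(quorum):
    """Return a shortest trace to an unsafe state, or None if all states are safe."""
    init = (("empty",) * REPLICAS, False, MAX_DISK_ERRORS)
    frontier, seen = deque([(init, [init])]), {init}
    while frontier:
        state, trace = frontier.popleft()
        if not safe(state):
            return trace
        for nxt in next_states(state, quorum):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, trace + [nxt]))
    return None


print(find_violation(quorum=1))  # a trace: write, commit, then the only copy is lost
print(find_violation(quorum=2))  # None: one disk error cannot destroy both copies
```

The design with a commit quorum of 1 looks fine in the "happy case" and only fails once the disk-error event is part of the model, which is the point of specifying the environment alongside the system.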
TLA+ & PlusCal Example
The problem
You’re writing software for a bank. You have Alice and Bob as clients,
each with a certain amount of money in their accounts. Alice wants
to send some money to Bob. How do you model this? Assume all you
care about is their bank accounts.
Step One
Assertions & Sets
Can Alice’s account go negative? Assertions in TLA+ are used for debugging.
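The slide's PlusCal listing is not reproduced in this text. As a rough Python analogue of the idea, the sketch below draws the transfer amount from a set and checks an assertion for every member of that set, the way a model checker would. The opening balances (10 each) and the range of amounts (1 to 20) are assumptions made for illustration.

```python
ALICE_START, BOB_START = 10, 10   # assumed opening balances
AMOUNTS = range(1, 21)            # assumed set of possible transfer amounts


def transfer(alice, bob, money):
    """Unguarded transfer: withdraw from Alice, then deposit to Bob."""
    alice -= money
    bob += money
    return alice, bob


def failing_amounts():
    """Try every amount in the set and collect those that break the assertion."""
    bad = []
    for money in AMOUNTS:
        alice, _ = transfer(ALICE_START, BOB_START, money)
        if not alice >= 0:   # the assertion: Alice's balance never goes negative
            bad.append(money)
    return bad


print(failing_amounts())
```

Under these assumptions, every amount larger than Alice's opening balance is reported, so the answer to the slide's question is yes.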
Step Two
We are going to get an error at this stage. Why? And how are we
going to fix it?
Fixing the issue
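The slide's own fix is not reproduced in this text. Assuming the issue is the overdraft raised earlier (Alice's balance going negative), one plausible fix is to guard the transfer with a balance check. The Python sketch below re-checks every amount under the same assumed balances (10 each) and amounts (1 to 20); all names are hypothetical.

```python
ALICE_START, BOB_START = 10, 10   # assumed opening balances
AMOUNTS = range(1, 21)            # assumed set of possible transfer amounts


def guarded_transfer(alice, bob, money):
    """Move money only if Alice can cover it; otherwise change nothing."""
    if alice >= money:
        return alice - money, bob + money
    return alice, bob


def check_all_amounts():
    """Return True if both properties hold for every amount in AMOUNTS."""
    total = ALICE_START + BOB_START
    for money in AMOUNTS:
        alice, bob = guarded_transfer(ALICE_START, BOB_START, money)
        # Properties: no negative balance, and no money created or destroyed.
        if alice < 0 or alice + bob != total:
            return False
    return True


print(check_all_amounts())
```

This sketch covers a single transfer only; it says nothing about concurrent transfers, where a check followed by a separate update can still interleave badly.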
Conclusion
At AWS, formal methods have been a big success. They have helped
us prevent subtle, serious bugs from reaching production, bugs that
we would not have found via any other technique.
In simple words, AWS would not be where it is today without
formal methods.
