DevOps Guru for the Serverless applications
Vadym Kazulkin, ip.labs, AWS Community Day NL, October 3 2022
Contact
Vadym Kazulkin
ip.labs GmbH Bonn, Germany
Co-Organizer of the Java User Group Bonn
v.kazulkin@gmail.com
@VKazulkin
https://www.linkedin.com/in/vadymkazulkin
https://www.iplabs.de/
ip.labs
https://www.iplabs.de/
AWS DevOps Guru
What is AWS DevOps Guru
Amazon DevOps Guru is a service powered by machine learning
(ML) that is designed to make it easy to improve an application’s
operational performance and availability.
DevOps Guru helps detect behaviors that deviate from normal
operating patterns so you can identify operational issues long
before they impact your customers, for example:
• increased latency
• error rates (timeouts, throttles)
• resource constraints (exceeding AWS account limits)
https://aws.amazon.com/devops-guru
Benefits of DevOps Guru
https://aws.amazon.com/devops-guru
How DevOps Guru works
https://aws.amazon.com/devops-guru
Automated reasoning's scientific frontiers
https://www.amazon.science/blog/automated-reasonings-scientific-frontiers
DevOps Guru Example Application
https://github.com/Vadym79/DevOpsGuruWorkshopDemo (inspired by https://github.com/aws-samples/serverless-java-frameworks-samples)
DevOps Guru Live Demo of Your Choice
DevOps Guru Set Up
DevOps Guru Dashboard
DevOps Guru Reactive Insights
DevOps Guru Examples
• Warm up the application (takes between 1 and 24 hours) to create a baseline
• Design a test experiment to provoke errors and increased latency
• Reduce the service quota of the AWS service (API Gateway, Lambda,
DynamoDB)
• Set very low service quotas for the sake of reducing AWS costs
• Add latency artificially
• Stress test with the hey tool to run into the operational issues
• Check whether DevOps Guru recognized the operational issues
• Remediate the operational issues by increasing the service quota, removing the
artificial latency or stopping the stress test
• Check whether DevOps Guru closes the incident when it’s resolved
https://github.com/rakyll/hey
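The "check whether DevOps Guru recognized the operational issues" step is done in the console in this talk. As a rough sketch, the same check could be scripted against the DevOps Guru `ListInsights` API. The helper name `ongoing_reactive_insights` is made up for illustration, and the function only assumes an object that behaves like `boto3.client("devops-guru")`.

```python
from typing import Any, Dict, List


def ongoing_reactive_insights(client: Any) -> List[Dict[str, str]]:
    """Return id, name and severity of every ongoing reactive insight.

    `client` is expected to behave like boto3.client("devops-guru").
    """
    insights: List[Dict[str, str]] = []
    kwargs: Dict[str, Any] = {"StatusFilter": {"Ongoing": {"Type": "REACTIVE"}}}
    while True:
        page = client.list_insights(**kwargs)
        for item in page.get("ReactiveInsights", []):
            insights.append(
                {"id": item["Id"], "name": item["Name"], "severity": item["Severity"]}
            )
        token = page.get("NextToken")
        if not token:
            return insights
        # Follow pagination until the service stops returning a token.
        kwargs["NextToken"] = token


if __name__ == "__main__":
    import boto3  # needs AWS credentials and a region where DevOps Guru is enabled

    for insight in ongoing_reactive_insights(boto3.client("devops-guru")):
        print(insight)
```

Running it during the stress test and again after remediation gives a simple way to see whether an insight was opened and later closed.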
DevOps Guru: Recognize operational issues
in DynamoDB
DevOps Guru Examples: DynamoDB Throttling
Stress test and empty burst credits
hey -q 20 -z 15m -c 20 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/1
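To get a feeling for how much load such a command can generate: in hey, `-q` is the rate limit per worker in queries per second, `-c` is the number of concurrent workers, and `-z` is the test duration. The small sketch below (the function name `hey_load` is invented for this example) computes the resulting upper bound. The real rate can be lower, because each worker sends its requests sequentially and slow responses hold it back.

```python
def hey_load(q_per_worker: float, workers: int, duration_minutes: float) -> dict:
    """Upper bound on the load produced by `hey -q Q -c C -z <D>m`."""
    requests_per_second = q_per_worker * workers
    total_requests = requests_per_second * duration_minutes * 60
    return {
        "requests_per_second": requests_per_second,
        "total_requests": total_requests,
    }


if __name__ == "__main__":
    # hey -q 20 -z 15m -c 20 (the DynamoDB throttling test above)
    print(hey_load(20, 20, 15))
    # prints {'requests_per_second': 400, 'total_requests': 360000}
```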
DevOps Guru: Recognize operational issues
in API Gateway
DevOps Guru Examples: API Gateway HTTP 429 "Too Many Requests" Error
DevOps Guru Examples: API Gateway HTTP 404 "Not Found" Error
Query for a non-existent product id, e.g. 200
hey -q 1 -z 15m -c 1 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/200
DevOps Guru Examples: API Gateway 4XX Error
DevOps Guru: Recognize operational issues
in Lambda
DevOps Guru Examples: Lambda Throttling 1
hey -q 5 -z 15m -c 5 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/2
DevOps Guru Examples: Lambda Throttling 2
hey -q 1 -z 15m -c 5000 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/2
or reach the Lambda concurrent execution account limit (without reaching the function concurrency limit)
hey -q 1 -z 15m -c 950 https://AAA.execute-api.eu-central-1.amazonaws.com/prod/func1
... until
hey -q 1 -z 15m -c 950 https://BBB.execute-api.eu-central-1.amazonaws.com/prod/func10
hey -q 1 -z 15m -c 950 -H "X-API-Key: XXXa6XXXX" https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/2
DevOps Guru Examples: Lambda Timeout Error
Add 31 seconds of latency in the
code of the Lambda function
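The demo application is written in Java and its exact code is not shown here, so the following is only a Python sketch of the idea of adding latency artificially. The environment variable name `ARTIFICIAL_LATENCY_SECONDS` is an assumption made for this example. A delay longer than the function's configured timeout makes the invocation fail with a timeout, while a shorter delay only increases the measured duration.

```python
import json
import os
import time


def handler(event, context):
    """Toy Lambda handler that sleeps before answering.

    The delay is read from a hypothetical environment variable so it can be
    switched off again without redeploying the code.
    """
    delay = float(os.environ.get("ARTIFICIAL_LATENCY_SECONDS", "0"))
    if delay > 0:
        time.sleep(delay)  # simulated slowness, e.g. 31 to provoke a timeout
    return {
        "statusCode": 200,
        "body": json.dumps({"delayed_by_seconds": delay}),
    }
```

Keeping the delay in configuration makes the remediation step of the experiment a simple settings change.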
Java runtime requires 256 MB of
memory to start and execute
this function
DevOps Guru Examples: Lambda Increased
Latency
Temporarily add 28 seconds of
latency in the code of
the Lambda function
DevOps Guru: Recognize operational issues
in SQS
DevOps Guru Examples: Operational Issues in
SQS
Temporarily add 26 seconds of
latency in the code of
the Lambda function
DevOps Guru Proactive Insights
DevOps Guru Proactive Examples: Lambda
timeout exceeds recommended SQS visibility
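As background for this insight: the AWS Lambda documentation recommends setting the visibility timeout of an SQS source queue to at least six times the timeout of the function that consumes it. The sketch below encodes that rule of thumb. The function names are invented for this example, and whether DevOps Guru applies exactly this factor is an assumption.

```python
def recommended_visibility_timeout(function_timeout_s: int, factor: int = 6) -> int:
    """Smallest queue visibility timeout (seconds) that satisfies the rule of thumb."""
    return factor * function_timeout_s


def visibility_timeout_ok(
    function_timeout_s: int, queue_visibility_timeout_s: int, factor: int = 6
) -> bool:
    """True if the queue visibility timeout is at least `factor` times the Lambda timeout."""
    return queue_visibility_timeout_s >= recommended_visibility_timeout(
        function_timeout_s, factor
    )


if __name__ == "__main__":
    # Illustrative values: a 30 s function timeout with the SQS default
    # visibility timeout of 30 s does not satisfy the recommendation.
    print(visibility_timeout_ok(30, 30))       # prints False
    print(recommended_visibility_timeout(30))  # prints 180
```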
DevOps Guru Proactive Examples: SQS
triggered Lambda does not have a DLQ
DevOps Guru Proactive Examples: Lambda
function has concurrency spillover
hey -q 1 -z 30m -c 9 -m DELETE -H "X-API-Key: XXXa6XXXX" -H "Content-Type: application/json;charset=utf-8"
https://XXX.execute-api.eu-central-1.amazonaws.com/prod/products/11
Other operational issues and the proactive
insights 1/2
• Table-level or account-level read/write capacity consumption for
DynamoDB reaching the account limit
• Triggered when the consumed capacity is
approaching table-level or account-level limits over a period of time
https://aws.amazon.com/de/blogs/aws/automatically-detect-operational-issues-in-lambda-functions-with-amazon-devops-guru-for-serverless/
Other operational issues and the proactive
insights 2/2
• DynamoDB table consumed capacity reaching the AutoScaling maximum parameter limit
• Triggered when the table consumed capacity is reaching the AutoScaling maximum parameter limit over a
period of time
https://aws.amazon.com/de/blogs/aws/automatically-detect-operational-issues-in-lambda-functions-with-amazon-devops-guru-for-serverless/
DevOps Guru integration in incident
management tools
https://aws.amazon.com/devops-guru
• OpsCenter
• PagerDuty
• Atlassian Opsgenie
DevOps Guru Integration Settings
DevOps Guru Examples: DynamoDB Throttling
DevOps Guru Integration with OpsCenter
DevOps Guru Integration with OpsCenter
OpsItem View
DevOps Guru Integration with OpsCenter
Incident Manager View
DevOps Guru Integration with OpsCenter
Incident Configuration
DevOps Guru Integration with OpsCenter
Incident Contacts
DevOps Guru Integration with PagerDuty
https://www.pagerduty.com/docs/guides/amazon-devops-guru-integration-guide/
DevOps Guru Integration with PagerDuty
Enter the "Integration URL" generated by
PagerDuty
DevOps Guru PagerDuty Incidents
DevOps Guru Supported Services
https://aws.amazon.com/de/devops-guru/pricing/
DevOps Guru Cost Estimator
https://aws.amazon.com/de/devops-guru/pricing/
DevOps Guru Conclusions, Observations,
Suggestions 1/4
• All errors have been correctly recognized so far
• It took several (at least 7) minutes to create an incident after
the anomaly appeared
• Correctly, no insights were created for temporary incidents
• Short-lived Lambda, DynamoDB and API Gateway throttling
• Recommendations for the insight reason could be more precise
• No differentiation between Lambda throttling caused by
reaching the individual function concurrency limit and the total
AWS account concurrency limit
DevOps Guru Conclusions,
Observations, Suggestions 2/4
• HTTP 4XX errors
• Time to create a DevOps Guru insight is relatively long (more
than 10 minutes), maybe because of the medium severity
• Are not split at a fine-grained level between different error codes
(404, 429), which have totally different meanings and causes
• No reference to which Lambda function behind the API
Gateway causes these errors (important for the 404 error)
DevOps Guru Conclusions,
Observations, Suggestions 3/4
• Lambda duration anomalous insights (Duration p90)
• Took a long time to create (sometimes more than 30
minutes), maybe because of the medium severity
• DevOps Guru Proactive Insights
• Stay ongoing for a long time even if the issue occurred only once
• Do not always expire quickly after being fixed
• Missed some important ones, like Lambda
Provisioned Concurrency that is not used for a long period of time
DevOps Guru Conclusions,
Observations, Suggestions 4/4
• Log groups haven’t always been displayed within the DevOps Guru
insight
• Missing link to tracing (e.g. AWS X-Ray)
• Easier integration into AWS Systems Manager Incident Manager would be desirable
DevOps Guru for RDS
https://aws.amazon.com/devops-guru/features/devops-guru-for-rds/
https://aws.amazon.com/de/blogs/devops/leverage-devops-guru-for-rds-to-detect-anomalies-and-resolve-operational-issues/
Amazon DevOps Guru for the Serverless Applications at AWS Community Day Benelux 2022
www.iplabs.de
Accelerate Your Photo Business
Get in Touch

