1 #Dynatrace
APM in Continuous Delivery and DevOps
(R)Evolutionize APM
2 #Dynatrace
Martin Etmajer
Senior Technology Strategist at Dynatrace
martin.etmajer@dynatrace.com
@metmajer
3 #Dynatrace
Do we have a Software Crisis?
4 #Dynatrace
» projects running over-budget
» projects running over-time
» software was very inefficient
» software was of low quality
» software often did not meet requirements
» code was complex and difficult to maintain
» software was often never delivered
The “Software Crisis” as of 1968
5 #Dynatrace
» projects running over-budget
» projects running over-time
» software was very inefficient
» software was of low quality
» software often did not meet requirements
» code was complex and difficult to maintain
» software was often never delivered
The “Software Crisis” as of 1968 today?
6 #Dynatrace
Status Quo: The CHAOS Manifesto 2013
7 #Dynatrace
Status Quo: The CHAOS Manifesto 2013
8 #Dynatrace
Why can’t making Software
be more like building Bridges?
9 #Dynatrace
How to ignore an Undesirable Situation
Hear no failure
See no failure
Speak no failure
10 #Dynatrace
Houston, we have a Problem! My Problem?
11 #Dynatrace
“We need to create a culture that reinforces the value of taking risks and
learning from failure and the need for repetition and practice to create
mastery.” Gene Kim, The Phoenix Project
A key-principle of DevOps
12 #Dynatrace
Because you don’t want this...
13 #Dynatrace
Unless you work
for the competition :)
14 #Dynatrace
15 #Dynatrace
16 #Dynatrace
...and certainly not this...
17 #Dynatrace
18 #Dynatrace
How to Improve?
19 #Dynatrace
Agile and Lean Practices to the Rescue
Continuous Delivery
DevOps
Source: Google Trends
20 #Dynatrace
Agile and Lean Practices to the Rescue
Continuous Delivery
DevOps
Source: Google Trends
SCRUM
21 #Dynatrace
(Very) Recommended Readings
Reliable Software Releases through
Build, Test and Deployment Automation
22 #Dynatrace
The Utmost Goal: Minimize Lead Time
feature lead time time
Customer Users
23 #Dynatrace
The Utmost Goal: Minimize Lead Time
feature lead time time
Customer minimize Users
24 #Dynatrace
The Utmost Goal: Minimize Lead Time
feature lead time time
Customer
You
This is when you
create value!
minimize
25 #Dynatrace
Use Case I:
Uncover Issues (Pro)Actively
Before they affect your Users
26 #Dynatrace
Rate of Diminishing Returns of Fixing Bugs
Developers should
not spend time here!
Low yield!
Concentrate on these!
27 #Dynatrace
A Project Little Helper: Kanban Board
Shows WIP
Tasks
28 #Dynatrace
Implement and Test...
29 #Dynatrace
Dynatrace in Automated Testing
           Test Framework Results     Architectural Data
Build #    Test Case      Status      # SQL   # Exceptions   CPU
Build 17   testPurchase   OK             12              0   120ms
           testSearch     OK              3              1    68ms
Build 18   testPurchase   FAILED         12              5    60ms
           testSearch     OK              3              1    68ms
Build 19   testPurchase   OK             75              0   230ms
           testSearch     OK              3              1    68ms
Build 20   testPurchase   OK             12              0   120ms
           testSearch     OK              3              1    68ms
Callouts: Regression! / Problem solved!
Build 18: Exceptions probably reason for failed tests
Build 19: Problem fixed but now we have an architectural regression!
Build 20: Now we have the functional and architectural confidence
Let’s look behind the scenes
30 #Dynatrace
31 #Dynatrace
#1: Analyze each Test
#2: Metrics for each Test
#3: Regression Detection
based on Metric
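The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Dynatrace implementation: the names `MetricSnapshot` and `find_regressions` and the 20% tolerance are assumptions made for this example, and the numbers are the testPurchase values as read from the earlier table slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricSnapshot:
    """Architectural metrics recorded for one test case in one build (hypothetical type)."""
    sql_count: int
    exception_count: int
    cpu_ms: int


def find_regressions(baseline, current, tolerance=0.2):
    """Return the names of metrics that exceed the baseline by more than `tolerance`."""
    regressions = []
    for name in ("sql_count", "exception_count", "cpu_ms"):
        before = getattr(baseline, name)
        after = getattr(current, name)
        # With a zero baseline the limit is zero, so any increase is flagged.
        if after > before * (1 + tolerance):
            regressions.append(name)
    return regressions


# testPurchase metrics per build, as read from the table slide.
build_17 = MetricSnapshot(sql_count=12, exception_count=0, cpu_ms=120)
build_18 = MetricSnapshot(sql_count=12, exception_count=5, cpu_ms=60)
build_19 = MetricSnapshot(sql_count=75, exception_count=0, cpu_ms=230)
build_20 = MetricSnapshot(sql_count=12, exception_count=0, cpu_ms=120)

for label, snapshot in (("18", build_18), ("19", build_19), ("20", build_20)):
    print("Build", label, find_regressions(build_17, snapshot))
```

With Build 17 as the baseline, Build 18 is flagged for its exceptions, Build 19 for SQL count and CPU time even though its functional tests pass, and Build 20 is clean. A build server could treat a non-empty result as a reason to fail the build.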
32 #Dynatrace
Allow Metrics
to fail builds?
Dynatrace Test Automation Plugin for
Jenkins
33 #Dynatrace
High-level KPIs
per Build
Trending
Test Results
@Project Level
Dynatrace Test Automation Plugin for Jenkins
34 #Dynatrace
Use Case II:
Uncover Issues (Re)Actively
After they affected your Users
35 #Dynatrace
Do we still need War Rooms?
36 #Dynatrace
“I’ve muddled over the same log files for weeks sometimes
to extrapolate the relationships between different systems
[...] before having my eureka moment.”
RecklessKelly (Operator) on reddit
37 #Dynatrace
Can we do Better?
38 #Dynatrace
39 #Dynatrace
Host Health?
40 #Dynatrace
41 #Dynatrace
42 #Dynatrace
Transactions Health?
43 #Dynatrace
Relevance?
44 #Dynatrace
45 #Dynatrace
Dynatrace Session File
46 #Dynatrace
Get Everyone into a War Room?
47 #Dynatrace
Get Everyone into a War Room?
NO!
48 #Dynatrace
Instead?
49 #Dynatrace
Takeaways?
50 #Dynatrace
Takeaways?
51 #Dynatrace
Takeaways?
52 #Dynatrace
Takeaways?
53 #Dynatrace
Awesome! We have...
» identified whether it’s been the host, process or transactions
» identified which critical business functionality was affected
» been able to prioritize the failure and secure evidence
» gotten the right people at the same table
» taken minutes, not weeks!
54 #Dynatrace
Thank you!
55 #Dynatrace


(R)Evolutionize APM - APM in Continuous Delivery and DevOps

Editor's Notes

  • #4: I would like to start my talk with a somewhat controversial question, and I promise to give you an answer to it in a bit. Do we have a software crisis?
  • #5: The term “software crisis” was coined by members of the NATO Software Engineering Conference in 1968. They perceived a huge gap between what was theoretically possible at that time and what could actually be achieved in software development. At that time, the USA was in the middle of the Cold War with the Soviet Union and feared a loss of expertise in software development.
    Notes:
    » inefficient: in terms of resource consumption (software should run in less time and/or with less memory)
    » low quality: in terms of reliability, stability, maintainability, etc.
    » complex and difficult to maintain: spaghetti code, where code dependencies are twisted and tangled like a bowl of spaghetti; if you pull on one end, something will move on the other end
  • #6: To me, these points still seem valid today, even if not as severe. Why not have a look at some numbers...
  • #7: The CHAOS Manifesto by the Standish Group presents the results of research based on roughly 50,000 software development projects around the globe and across verticals from 2002 to 2012. The companies surveyed are:
    » 50% Fortune 1000-type companies (large)
    » 30% mid-range
    » 20% small-range
    The outcome categories are defined as follows:
    » Success := delivered on time, on budget, with required features and functions
    » Challenged := late, over budget, and/or with less than required features and functions
    » Failed := cancelled prior to completion OR delivered and never used (unusable)
    If you think that 39% in 2012 is bad, let’s have a look at 2004...
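    The three outcome categories can be written down as a small decision function. This is only a sketch of the definitions quoted above; the function name `classify` and its boolean parameters are assumptions made for this example and are not part of the Standish Group's methodology.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    CHALLENGED = "challenged"
    FAILED = "failed"


def classify(cancelled, used, on_time, on_budget, full_scope):
    """Map project facts to an outcome, following the definitions in the note above."""
    # Failed: cancelled prior to completion OR delivered and never used.
    if cancelled or not used:
        return Outcome.FAILED
    # Success: on time, on budget, with required features and functions.
    if on_time and on_budget and full_scope:
        return Outcome.SUCCESS
    # Challenged: delivered and used, but late, over budget and/or reduced in scope.
    return Outcome.CHALLENGED


print(classify(cancelled=False, used=True, on_time=False, on_budget=True, full_scope=True))
```

    Note that a project only counts as a success if all three conditions hold at once, which is part of why the success rates are so low.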
  • #8: In 2004, only 29% of all researched software projects were successful, which makes 2004 the worst of all years since 2001. However, since 2004 there has been a constant increase in success rates. The authors of the manifesto revealed that this increase was due to much better (agile) project management and the use of agile software development practices, such as test-driven development, pair programming, etc.
  • #9: At the end of the 1980s, Alfred Spector, now a VP of Research at Google, co-authored a scientific paper in which he tried to answer the following question: “Why are we able to build bridges which finish on time and on budget (and typically do not collapse), but fail to do this when it comes to writing software?” Two answers were almost obvious:
    » Bridges have been built for more than 3,000 years, software only for a few decades
    » Bridge building relies on the laws of mathematics and statics, where there is little room for flexibility, whereas software is not subject to such strict laws
    Most surprising was the following discovery: it has something to do with how we deal with mistakes. When a bridge falls down, the incident is thoroughly investigated and reported so that future bridge builders can learn from previous mistakes. Not so with software: failures are often covered up, ignored or rationalised (“it’s not a bug, it’s a feature”), with the result that we are unable to learn from our mistakes.
  • #10: How do you best ignore an undesirable situation?
  • #11: Is it my problem? Probably not. Finger-pointing and blame games do not solve the problem and cost precious time and money. But a problem is always an unpleasant situation, is it not? My point is that we must establish a culture that accepts errors as part of our daily work. The ability to resolve these errors quickly and efficiently allows us to learn from our mistakes and to get better day by day, which in turn allows us to outperform those who don’t and to be successful in the long term. And I am certainly not alone in this point of view...
  • #13: Why is this important?
  • #20: Companies use Agile and Lean software development practices and mindsets to gain better control over the outcome of the software development and delivery process.
  • #21: Agile project management frameworks, such as SCRUM, allow us to react better and more dynamically to customer requirements and to build quality into our products.
  • #25: It’s essentially about getting features into your users’ hands quickly!
  • #26: I would now like to present two use cases that show how Dynatrace helps you deal with errors both effectively and efficiently. In this first use case, I will show you how you can proactively uncover issues in your software before they affect your users.
  • #27: Clearly, the focus should be on fixing bugs in Development and Test, rather than in Operations. Nothing is more inefficient than fixing a bug only after it has hit your customers and the developer no longer remembers his or her code.
  • #35: In this second use case, I’d like to show you how you can identify root causes efficiently when a problem occurs in your production system.
  • #36: "Do we still need War Rooms?" I claim that war rooms should really be a thing of the past. The term “war room” is used in fire-fighting scenarios when subject matter experts are summoned into a room to fix a critical problem. Usually, the people in this room know about the symptoms, but they don’t know much about the root cause or who should actually be involved. If you have ever had to gather insights by manually correlating piles of distributed log files, you’ll know as well as I do that this can be quite daunting. Instead, you will want to involve only those people who are really related to the problem and have all the others keep their talents focused on business-critical development, testing and operations.
  • #37: Recently, I participated in a discussion on the usefulness of logging on reddit and one Operator came up with this astonishing insight: he admitted it would sometimes take him weeks to figure out what was going on in his systems based on looking at log files. Hmm. I wonder what the deployment rate looked like? Every 2 weeks? Every 4 weeks? Quarterly? If they deployed into production every two weeks, his insights were most probably outdated at the time he had his “eureka” moment.
  • #38: Can we do any better? I say, Yes We Can!
  • #39: Looking at an application through Dynatrace allows you to see the global health status for all transactions, 24/7 no matter the degree of distribution over runtimes and physical or virtual machines. Right here we see that our application is affected by a failure on the Business Backend Server, which is indicated by the red circular segment. We can immediately observe that the failure does not originate from a problem in the infrastructure – however, if it did, we could dig down deeper here...
  • #41: ...and that it does not originate from the Backend Server’s process either. Here, too, we could dig deeper and look at the CPU activity, memory consumption, the number of threads over time, the impact of the garbage collector, etc.
  • #42: The root cause, however, can be found in the transactions passing through the Business Backend Server, and as an Operator you may want to show this to a developer now. Still, you want to know whether any business-critical transactions are affected, such as anything related to a login, a search, a newsletter registration or a purchase.
  • #43: What we see here is: it’s the logins! They have had a 100% failure rate since the deployment you made 10 minutes ago.
  • #44: What about the relevance? How do you assign priority to an issue? Here you observe that more than 100 login attempts by more than 60 users have failed.
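    The prioritization described here boils down to two numbers per business transaction: the failure rate and the number of distinct users affected. The following sketch shows one way to compute them from raw request records. The record layout, the function names and the sample data are assumptions made for this example and do not reflect how Dynatrace stores or exposes this data.

```python
from collections import defaultdict


def summarize_failures(records):
    """Aggregate (transaction_name, user_id, succeeded) tuples per transaction."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    affected_users = defaultdict(set)
    for name, user_id, succeeded in records:
        totals[name] += 1
        if not succeeded:
            failures[name] += 1
            affected_users[name].add(user_id)
    return {
        name: {
            "failure_rate": failures[name] / total,
            "failed_requests": failures[name],
            "affected_users": len(affected_users[name]),
        }
        for name, total in totals.items()
    }


def most_urgent(summary):
    """Pick the transaction with the most affected users, then the highest failure rate."""
    return max(
        summary,
        key=lambda name: (summary[name]["affected_users"], summary[name]["failure_rate"]),
    )


# Hypothetical sample data: every login fails, one search fails.
records = [
    ("login", "alice", False),
    ("login", "bob", False),
    ("login", "alice", False),
    ("search", "carol", True),
    ("search", "dave", False),
    ("search", "carol", True),
]
summary = summarize_failures(records)
print(most_urgent(summary), summary["login"])
```

    Counting distinct users separately from failed requests matters because a single user retrying many times should not outweigh many users failing once.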
  • #45: Ok, you decide not to wait any longer and go talk to a developer. Which one? A backend developer in this case.
  • #46: Before you leave, you right-click, create a Session File in Dynatrace and save it to your disk. The Dynatrace Session File allows you to secure all evidence and share it with your peers for offline analysis. Think of it as a common language for Dev, Test and Ops.
  • #49: Instead we get a Backend Developer, a Tester and the Operator at the same table and have them look at the evidence in Dynatrace. So what would be the takeaways for the particular roles?
  • #50: The Developer can identify the root-cause by looking at actual method invocations and contextual information in all failed transactions...
  • #51: What the developer looks at here is a PurePath. A PurePath shows all data Dynatrace has recorded on behalf of a particular transaction. It is a combined tree that shows method invocations across runtime and machine boundaries, from the landing of the web request all the way down to the database and everything in between – no more manual correlation of distributed log files. This is root-cause analysis in minutes, not weeks. What you see combined on this dashboard are...
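    To make the idea of a combined call tree concrete, here is a small sketch of such a structure and of a search for the deepest failing call. It is a simplified stand-in, not the PurePath data model: the `CallNode` type, its fields and all method and host names in the sample trace are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CallNode:
    """One method invocation in a transaction trace (hypothetical structure)."""
    method: str
    host: str
    duration_ms: float
    error: Optional[str] = None
    children: List["CallNode"] = field(default_factory=list)


def deepest_failure(node, path=()):
    """Return the path of nodes leading to the deepest failing call, or None."""
    here = path + (node,)
    # Prefer a failing descendant: errors usually propagate upwards from their origin.
    for child in node.children:
        found = deepest_failure(child, here)
        if found is not None:
            return found
    return here if node.error else None


# A made-up login transaction crossing from the web tier to the backend.
trace = CallNode("POST /login", "web-01", 42.0, error="HTTP 500", children=[
    CallNode("SessionFilter.check", "web-01", 1.5),
    CallNode("LoginService.authenticate", "backend-01", 35.0, error="LoginException", children=[
        CallNode("UserDao.findUser", "backend-01", 30.0, error="SQLException"),
    ]),
])

failure_path = deepest_failure(trace)
print(" -> ".join(f"{n.host}:{n.method}" for n in failure_path))
```

    Because every node carries its host, the path shows where the transaction crossed a machine boundary, which is the information that otherwise has to be pieced together from separate log files.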
  • #52: The Tester can design new or rework existing test scenarios (whether manual or automated ones) and incorporate method arguments, HTTP parameters, and any other data captured by Dynatrace.
  • #53: And the Operator can configure alerts so that he and the two guys next to him get alerted should this failure ever come back again.