Verification Metrics
Dave Williamson, CPU Verification and Modeling Manager
Austin Design Center, June 2006
Verification Metrics: Why do we care?
• Predicting functional closure of a design is hard
• Design verification is typically the critical path
• CPU design projects rarely complete on schedule
• The cost of failing to predict design closure is significant
Two key types of metrics
• Verification test-plan-based metrics
  - Number of directed tests completed
  - Amount of random testing completed
  - Number of assertions written
  - Amount of functional coverage written and hit
  - Verification reviews completed
• Health-of-the-design metrics
  - Simulation passing rates
  - Bug rate
  - Code stability
  - Design reviews completed
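To make the two families concrete, here is a minimal Python sketch of what a weekly metrics snapshot combining them might look like. The field names and numbers are invented for illustration; they are not the tracking system described in the deck.

```python
# Hypothetical weekly snapshot combining the two metric families on this slide.
# All field names and values are illustrative, not the deck's actual data.
from dataclasses import dataclass

@dataclass
class WeeklySnapshot:
    week: str                      # e.g. "2006-W23"
    # Test-plan-based metrics (owned by the DV team)
    directed_tests_done: int
    directed_tests_planned: int
    assertions_written: int
    coverage_points_hit: int
    coverage_points_defined: int
    # Health-of-the-design metrics (trailing indicators)
    sim_pass_rate: float           # fraction of regression runs passing
    bugs_opened: int               # new bugs filed this week

    def test_plan_progress(self) -> float:
        """Fraction of the current plan that is complete. Best case only:
        the plan itself grows as testing continues (see the next slide)."""
        done = self.directed_tests_done + self.coverage_points_hit
        total = self.directed_tests_planned + self.coverage_points_defined
        return done / total if total else 0.0

snap = WeeklySnapshot(
    week="2006-W23",
    directed_tests_done=420, directed_tests_planned=500,
    assertions_written=1300,
    coverage_points_hit=8700, coverage_points_defined=9600,
    sim_pass_rate=0.97, bugs_opened=12,
)
print(f"{snap.week}: plan {snap.test_plan_progress():.0%} complete, "
      f"pass rate {snap.sim_pass_rate:.0%}, {snap.bugs_opened} new bugs")
```

Keeping both families in one record makes it easy to plot them against each other, which is exactly the cross-check the next slide argues for.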
Challenges and limitations
• Limitations of test-plan-based metrics
  - Will give a best-case answer for the completion date
  - The plan will grow as testing continues
• Limitations of health-of-the-design metrics
  - Can give false impressions if used independently of test-plan metrics
  - Require good historical data from similar projects for proper interpretation
• General concerns to be aware of for all metrics
  - What you measure will affect what you do
  - Gathering metrics is not free
  - Historical data can be misleading
  - Don't be a slave to the metrics: they are a great tool, but not the complete answer
Bug rate example (knee in the curve)
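The editor's notes for this slide mention that the plot uses a 4-week rolling average of the weekly bug count. A minimal Python sketch of that view, with invented weekly counts and a crude heuristic for spotting the knee, might look like this:

```python
# Illustrative data only, not the project's real numbers: cumulative bugs plus
# a 4-week rolling average of the weekly bug rate, with a crude estimate of the
# "knee" where the smoothed rate falls well below its peak.
weekly_bugs = [5, 10, 14, 16, 15, 16, 14, 6, 4, 4, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1]

def rolling_avg(xs, window=4):
    """Trailing rolling average; the window is shorter at the start of the project."""
    return [sum(xs[max(0, i - window + 1): i + 1]) / min(i + 1, window)
            for i in range(len(xs))]

smoothed = rolling_avg(weekly_bugs)
cumulative = [sum(weekly_bugs[: i + 1]) for i in range(len(weekly_bugs))]

# Crude knee heuristic: first week after the peak where the smoothed rate
# drops below half of its peak value.
peak = max(smoothed)
peak_week = smoothed.index(peak)
knee_week = next((i for i in range(peak_week, len(smoothed))
                  if smoothed[i] < 0.5 * peak), None)

for i, (cum, avg) in enumerate(zip(cumulative, smoothed)):
    mark = "  <-- knee" if i == knee_week else ""
    print(f"week {i + 1:2d}: total bugs {cum:3d}, 4-week avg {avg:5.2f}{mark}")
```

As the notes point out, the raw weekly counts are sporadic; the rolling average is what makes the drop-off at the knee visible.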
Bug rate by unit example
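A small sketch of the per-unit breakdown, using hypothetical bug records, could be as simple as tallying bugs by unit and week so that early-stabilizing units stand out:

```python
# Hypothetical bug records, invented for illustration: tally bugs by unit and
# week so units that stabilize early (like the SIMD unit mentioned in the
# editor's notes) stand out from units still finding bugs late.
from collections import defaultdict

bugs = [  # (week_found, unit)
    (1, "simd"), (1, "lsu"), (2, "simd"), (2, "ifu"), (3, "lsu"),
    (3, "lsu"), (4, "ifu"), (5, "lsu"), (6, "ifu"), (6, "lsu"),
]

per_unit = defaultdict(lambda: defaultdict(int))
for week, unit in bugs:
    per_unit[unit][week] += 1

last_week = max(week for week, _ in bugs)
for unit in sorted(per_unit):
    counts = [per_unit[unit].get(w, 0) for w in range(1, last_week + 1)]
    last_bug = max(w for w, u in bugs if u == unit)
    print(f"{unit:>5}: {counts}  (last bug in week {last_bug})")
```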
Functional coverage closure example (new coverage points added)
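The closure curve on this slide tracks the fraction of coverage points hit while the total number of points keeps growing. A short Python sketch with invented numbers shows why adding new points produces dips in the curve even as absolute progress continues:

```python
# Invented numbers for illustration: percent of functional coverage points hit
# per week. Because unplanned points keep being added as new corner cases are
# found, the denominator grows and the percentage can dip even while the
# number of points hit keeps increasing.
weeks = [
    # (points_defined, points_hit)
    (4000, 1200), (4000, 2500), (4200, 3400), (4200, 3850),
    (4600, 3900),  # new points added -> visible dip
    (4600, 4200), (4700, 4450), (4700, 4600), (4700, 4660),
]

for i, (defined, hit) in enumerate(weeks, start=1):
    pct = 100.0 * hit / defined
    print(f"week {i}: {hit}/{defined} points hit = {pct:5.1f}%")
```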



Editor's Notes

  • #3: 1. More so than in other areas of processor design, visibility of completion is still fairly low at the end of the project (the dreaded "when will we find the last bug?" question). 2. Verification complexity increases non-linearly with design complexity. 3. Empirical evidence shows that projects are almost always delayed; at best they hit the externally published schedule, but usually that is the 2nd or 3rd internal schedule. 4. Conservative estimates mean lost design-win opportunities; optimistic estimates mean slipped schedules or buggy silicon.
  • #4: 1. Test-plan-based metrics are under the control of the DV team; health-of-the-design metrics are somewhat outside the DV team's control. 2. All of these metrics can be applied at the full-chip or unit level of the design.
  • #5: The test plan only covers what you know you need to do, not what you don't yet know you need to do. The plan is non-exhaustive, and when you find bugs in the design, new corner cases are exposed; this will happen all the way to the end of the project (historical data can help). Health of the design can look better or worse than it really is depending on what is currently happening on the testing side. Most health-of-the-design metrics are trailing indicators, so you really need good historical data from similar projects to make full use of them. Be careful to avoid meeting the letter of the law but not the intent: for example, if you have hard metrics on cycles run per week or tests written per week, test and cycle quality might go down. Think up front about how you want to use metrics so that you track the right things, and account for the time to build the required infrastructure. Historical data is very useful, but every project is different, and generally speaking future projects are more complex than previous ones, so it needs to be taken with a grain of salt. Metrics won't replace the subjective gut feel that comes from experience: if the gut feel is that the design is not ready for tapeout, it probably isn't. Take metric results with a grain of salt; this applies to the final "when are we done" call as well as to determining critical priorities throughout the project.
  • #6: The total bug graph is fairly linear, with one pronounced knee at about the 75% point. Bugs per week are fairly sporadic until the rate drops off at the knee. This is a 4-week rolling average; the results are even more sporadic if the raw weekly count is used.
  • #7: Breaking the bug rate down by unit can be useful to indicate early stability of certain units (or to point to a deficit in testing). The relative number of bugs found per area is roughly consistent with expectations based on the complexity of each unit. The SIMD unit was an early focus and became stable before the rest of the design.
  • #8: Getting up to the low 90% range happens fairly quickly, and most of the time is spent closing the final 5% of the points. Expect a few dips along the way as new coverage that wasn't originally planned is added. Tracking may improve in the future: break out crosses vs. single points, and add some way to indicate the priority of points.