Four Years of Quantum Simulation

Insights and Lessons Learned
Verifying the QoS Engine of an Innovative Network Processor


2/3/2009    Copyright © 2009, Achilles Test Systems, Inc.
About the Author
     • Co-Founder: EDA Software (2006-present)
            – DV Notebook Series, Achilles Test Systems
     • Technical Leader: Arch, Micro-Arch, & Verification
            – Packet Scheduler of Quantum Flow Processor, Cisco
     • Principal HW Engineer: RTL
            – Packet Scheduler on forwarding ASIC, Hammerhead
     • Principal HW Engineer: Architectural modeling & RTL
            – Bandwidth Aware Memory Controller, C-port
     • Senior SW Engineer: EDA Software
            – RTL Compilation, ASIC Emulator, Meta Systems
     • Senior HW Engineer: RTL & Embedded Software
            – Ethernet and ATM Switching, UB Networks
Lessons Apply to Many SoCs

 • Features, complexity, ship date
     –   Innovative enough to represent some risk
     –   Leap-frog current (next) generation
     –   Tested at many layers of abstraction: unit, chip, architectural
     –   Diverse IP blocks

 • Typical SoC team demographics
     – Multi-site, Multi-discipline, Multi-group

 • Challenges
     – How to leverage corporate resources
     – How to track, prioritize, and close issues
     – How to maintain momentum and focus
Take-home Message

• Use more CPUs, not more bodies
   –   100’s of CPU years of simulation
   –   1000’s of semi-directed test cases
   –   1M’s of test executions, stored results
   –   Methodical closure of issues and exceptions

• Requires investment in environment
   – Increase productivity by 2 orders of magnitude
   – Find ways to keep the work load manageable
   – This talk presents a few ideas…




Related DV Club Presentations

 • Kerstin Eder: University of Bristol
      – Genetic Programming in Automated Test Code Generation
        for a Multi-Threaded Microprocessor
      – DV Club: Bristol – Q4 2008
 • Wilson Snyder: SiCortex
      – Test E.R. Our hope to triage a million tests a day
      – DV Club: Boston – Q4 2007
 • Don Steiss: Cisco
      – Layered Self-checking Test Generation
      – DV Club: Dallas – Q2 2007
 • Shahram Salamian: Intel
      – CPU Verification Metrics
      – DV Club: Austin – Q2 2006

Hardware Testing Schedules

 [Figure: testing schedules, theory vs. reality]
 • Theory: Arch. Testing, RTL Testing, QA Lab Testing
 • Reality ($$$ SoC): Arch. Testing, RTL Testing, Emulation, QA Lab Testing
 • Reality ($ FPGA): Arch. Testing, RTL Testing, QA Lab Testing

Planning for First-Pass Silicon

 [Figure: SoC schedule: Arch. Testing, RTL Testing, QA Lab Testing]

 • Assume a typical SoC flow
      – New architecture, next-gen features
      – Mitigate risk, build confidence

 • Leverage corporate resources
      – How many CPUs? (5, 50, 500, more)
      – Assume 1-2 hours per test
      – How many test runs prior to tape-out?
            • 10 CPUs, 120 runs/day, 44K runs per year
            • 200 CPUs, 2400 runs/day, 876K runs per year

 • Constraints / Obstacles
      – Small team, often new to problem domain
      – No pre-existing tests for next-gen features
      – Millions of cycles needed per test
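
The run counts above follow from simple arithmetic. A minimal sketch, assuming every CPU runs tests back to back at 2 hours per test (the upper end of the stated 1-2 hour range); the function name is illustrative only:

```python
def runs_per_year(cpus: int, hours_per_test: float, days: int = 365) -> int:
    """Test runs completed in a year if every CPU runs tests back to back."""
    runs_per_day = cpus * 24 / hours_per_test
    return round(runs_per_day * days)

for cpus in (10, 200):
    per_day = round(cpus * 24 / 2.0)
    print(f"{cpus} CPUs: {per_day} runs/day, {runs_per_year(cpus, 2.0)} runs/year")
```

With 10 CPUs this gives 120 runs/day and 43,800 (about 44K) runs per year; with 200 CPUs, 2400 runs/day and 876,000 runs per year.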

Getting Help on Test Creation

[Figure: layered stack. Labels, top to bottom: ASIC Testing OR Platform Software; Config API; reg-R/W, C; RTL OR Ref Model; ASIC]

• End-user test descriptions
   – e.g. Simplified Command Line Interface (CLI)
   – Architects, marketing, FAEs, SW, and QA contribute to testing
   – Jump starts testing with 50-100 important configurations

• Have S/W team review testing configurations
   – Architectural simulation pre-dates register definition
   – Create a Config API, higher level than registers

• Lessons learned on past projects…
   – Simple CLI is not good for all tasks
         • Hard to create certain low-level cases
         • Missing some chassis-level details
   – Rigid implementation of test language
         • Lex and Yacc require expert S/W design
Leveraging a High Speed Model

 • Emulation: Ideal
      – Fast and cycle accurate, but expensive

 • Otherwise: Fast simulation (with performance)
      – C model for requirements testing: > 1 MHz
      – C & RTL cosim for comparison: < 1 kHz
      – Discrete-event when possible: T = f(pps)
      – Cycle-accurate when necessary: T = f(Hz)

 • Look for transitive correctness
      – IF   (C satisfies requirements)
      – AND  (C == RTL)
      – THEN RTL satisfies requirements
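
The transitive argument can be sketched as a two-step check per stimulus. This is a toy illustration, not the project's harness: `c_model`, `rtl_good`, `rtl_buggy`, and `in_order` are hypothetical stand-ins (a "scheduler" that must emit packets in priority order), and the conclusion only holds for the stimuli actually exercised.

```python
def transitive_check(stimuli, c_model, rtl_model, meets_requirements):
    """Return a list of (stimulus, reason) failures; empty means the chain held."""
    failures = []
    for s in stimuli:
        c_out = c_model(s)
        if not meets_requirements(s, c_out):
            failures.append((s, "C model violates requirement"))
        elif rtl_model(s) != c_out:
            failures.append((s, "C/RTL mismatch"))
    return failures

# Hypothetical stand-ins for the fast C model and the slow RTL simulation.
c_model = lambda pkts: sorted(pkts)
rtl_good = lambda pkts: sorted(pkts)
rtl_buggy = lambda pkts: sorted(pkts)[::-1] if len(pkts) > 2 else sorted(pkts)
in_order = lambda _s, out: all(a <= b for a, b in zip(out, out[1:]))

stimuli = [(3, 1), (2, 2), (5, 4, 9)]
print(transitive_check(stimuli, c_model, rtl_good, in_order))
print(transitive_check(stimuli, c_model, rtl_buggy, in_order))
```

The requirement check runs only against the fast model; the slow RTL run only needs an equality comparison.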

Create a Balanced Testing Pipeline


 [Figure: three-stage pipeline: (1) Automatic Test Generation, (2) CPU Farm Intelligent Launch, (3) Results Scoring & Browsing]

 • ~100 CPUs: ~1000 tests a day (~85% existing, ~15% new)

 • Detect, Debug, Diagnose, Report, Repair, Rerun
      – Assume 6-8 tests debugged per day
      – Keep team busy without drowning them


Need a Game-Changing Technology


 • Consider using an SQL database
      – Store test results in DB
            • Track failures, history, fixes, complement bug DB
      – Store test lists in DB
            • Tune run lists to maximize value
      – Store actual tests in DB
            • Treat a test as a data structure


 • But…
      – Hand-coded C↔SQL interface can be rigid
            • Impedes the creation of diverse test types and result formats
      – Need a light-weight way to upload any test description
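
One light-weight way to treat a test as data is to store the description as a JSON document next to its results. A minimal sketch using Python's built-in `sqlite3`; the table layout, the test name `shaper_basic`, and its fields are assumptions for illustration, not the schema used on the project:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tests   (id INTEGER PRIMARY KEY, name TEXT UNIQUE, body TEXT);
    CREATE TABLE results (test_id INTEGER REFERENCES tests(id),
                          seed INTEGER, model TEXT, status TEXT);
""")

def upload_test(name, description):
    """Store any JSON-serializable test description without a per-type schema."""
    cur = conn.execute("INSERT INTO tests (name, body) VALUES (?, ?)",
                       (name, json.dumps(description, sort_keys=True)))
    return cur.lastrowid

def record_result(test_id, seed, model, status):
    conn.execute("INSERT INTO results VALUES (?, ?, ?, ?)",
                 (test_id, seed, model, status))

def failing_pairs():
    """Every failing <test, seed> pair, ready to be re-run."""
    return conn.execute(
        "SELECT t.name, r.seed FROM results r JOIN tests t ON t.id = r.test_id "
        "WHERE r.status = 'fail' ORDER BY t.name, r.seed").fetchall()

tid = upload_test("shaper_basic", {"queues": 8, "rate_mbps": 100})
record_result(tid, 7, "cosim", "fail")
record_result(tid, 8, "cosim", "pass")
print(failing_pairs())
```

Because the test body is opaque JSON, new test types and result formats do not require changes to a hand-coded C-to-SQL layer.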
Automated Test Generation

 • Plan the test population
      – A certain number of constrained-random tests
      – e.g. 1,000 - 10,000 directed configs
             10 - 100 input scenarios per config
             • Yields 10K-100K unique tests

 • Upload directed tests to DB
      – Small programs perform parameter sweeps
      – Heuristic coverage mutations of bin-neighbors

 • Lesson learned by analyzing generated tests…
      –     Found some unintended common factors
      –     Snow-ball effects: too much in-breeding
      –     Try to limit the influence of any one test
      –     Only possible when functional coverage is persistent
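
A parameter sweep of the kind mentioned above can be a very small program. A sketch, assuming hypothetical parameter names (`num_queues`, `rate_mbps`, `packet_bytes`) and scenario labels; each generated test is plain data that could be uploaded to the DB:

```python
import itertools

def sweep(axes):
    """Cartesian product of all parameter values, one dict per directed config."""
    names = sorted(axes)
    return [dict(zip(names, values))
            for values in itertools.product(*(axes[n] for n in names))]

configs = sweep({
    "num_queues": [1, 8, 64],        # hypothetical QoS parameters
    "rate_mbps": [10, 100, 1000],
    "packet_bytes": [64, 1500],
})
scenarios = ["steady", "bursty"]     # hypothetical input scenarios
tests = [{"config": c, "scenario": s} for c in configs for s in scenarios]
print(len(configs), "configs,", len(tests), "tests")
```

Here 3 × 3 × 2 values give 18 configs, and 2 scenarios per config give 36 tests; the same multiplication scales to the populations planned above.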

Test Generation w/ Feedback

 • Testing for numeric error vs. pass/fail
      – Performance (QoS) divergence from a theoretical model
      – Also useful in Analog/Mixed-Signal electrical compliance

 • Optimization: Particle Swarms or Genetic Algorithms
      –     Apply stimulus
      –     Measure error in test result
      –     Create modified tests to amplify result
      –     Only possible if both test and result are data

 • But…
      – Optimizations only amplify problems
      – They do not provide coverage closure
      – Most effective when used with coverage sweeps
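
The apply, measure, modify loop can be sketched with a very simple evolutionary search (mutation plus selection). It is neither a full genetic algorithm nor a particle swarm. `measure_error` is a hypothetical stand-in for running a simulation and scoring QoS divergence, and the `load` and `burst` parameters are invented for illustration:

```python
import random

def measure_error(test):
    """Stand-in for a simulation run; toy error surface peaking at load=0.9, burst=32."""
    return 1.0 / (1.0 + (test["load"] - 0.9) ** 2 * 100
                  + (test["burst"] - 32) ** 2 / 64)

def mutate(test, rng):
    """Create a modified test by nudging each parameter within its legal range."""
    return {
        "load": min(1.0, max(0.0, test["load"] + rng.uniform(-0.05, 0.05))),
        "burst": min(64, max(1, test["burst"] + rng.choice([-2, -1, 1, 2]))),
    }

def amplify(seed_tests, generations=30, children=4, keep=4, seed=1):
    """Keep the tests with the largest measured error; parents survive (elitism)."""
    rng = random.Random(seed)
    population = list(seed_tests)
    for _ in range(generations):
        offspring = [mutate(t, rng) for t in population for _ in range(children)]
        population = sorted(population + offspring,
                            key=measure_error, reverse=True)[:keep]
    return population

start = [{"load": 0.5, "burst": 8}, {"load": 0.2, "burst": 50}]
best = amplify(start)[0]
print(best, measure_error(best))
```

Because both the test and its result are data, the loop can modify tests without human involvement; it amplifies an existing divergence but says nothing about coverage.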

Regression Launch Functions

[Figure: DB feeding the CPU farm; speech bubble: "If only I had run a regression!"]

• Linux farm management
   – Simple user-managed queuing
   – Lighten the load on corporate job queuing
   – Custom queuing policies: e.g. time-of-day

• Rerunning <test, seed> pairs
   – Every failing <test, seed> pair is a mini bug report
   – Aggressive re-run of failing <test, seed> pairs
   – Automatically isolate the check-in that causes a new failure
        • n-ary search using the same test and seed

• Interesting challenges
   – Fairness and perception in a Linux farm
   – C-only simulations vs. licensed apps
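
Isolating the offending check-in can be sketched as an n-ary search over the revision history. This sketch assumes the oldest revision passes, the newest fails, and a failure persists once introduced; `passes` is a hypothetical callback that re-runs the same <test, seed> pair at a given revision:

```python
def first_bad_checkin(revisions, passes, probes=1):
    """revisions is ordered oldest to newest; returns the first failing revision."""
    lo, hi = 0, len(revisions) - 1          # invariant: lo passes, hi fails
    while hi - lo > 1:
        step = max(1, (hi - lo) // (probes + 1))
        points = list(range(lo + step, hi, step))[:probes]
        for p in points:                    # on a farm these would run in parallel
            if passes(revisions[p]):
                lo = p
            else:
                hi = p
                break
    return revisions[hi]

revisions = list(range(100, 120))           # hypothetical check-in numbers
broke_at = 113
print(first_bad_checkin(revisions, lambda r: r < broke_at, probes=3))
```

With `probes=1` this reduces to ordinary binary search; more probes per round trade extra CPU time for fewer sequential rounds.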
Coverage Tables and Test Lists

• Functional coverage tables
   – Capture coverage assertions from all model types
   – Assertions in C & RTL, special regs in FPGAs

• When results, coverage, & test lists are data…
   –   compute expected time per test list
   –   compute expected coverage per test list
   –   optimize out tests that don’t add coverage
   –   swap out one test for a newer test w/ same coverage
   –   select groups of tests to run for any reason
   –   rotate among test lists w/ similar coverage

• But…
   – Creating test lists should not require SQL knowledge
   – Coverage tables grow large, performance can suffer
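
Once coverage and run time are data, pruning a test list becomes a small computation. A sketch using a greedy heuristic (most new coverage bins per hour first), which is a common approximation and not necessarily optimal; the test names, bins, and run times are made up:

```python
def prune_test_list(coverage, runtime):
    """coverage: test -> set of bins hit; runtime: test -> expected hours."""
    remaining = set().union(*coverage.values())
    chosen = []
    while remaining:
        best = max(sorted(coverage),
                   key=lambda t: len(coverage[t] & remaining) / runtime[t])
        gained = coverage[best] & remaining
        if not gained:
            break
        chosen.append(best)
        remaining -= gained
    return chosen

coverage = {"t1": {"a", "b"}, "t2": {"b"},
            "t3": {"c", "d", "e"}, "t4": {"a", "b", "c"}}
runtime = {"t1": 1.0, "t2": 1.0, "t3": 2.0, "t4": 1.0}

run_list = prune_test_list(coverage, runtime)
print(run_list, sum(runtime[t] for t in run_list), "hours")
```

In this toy data, `t4` and `t3` together hit every bin in 3 hours, so `t1` and `t2` can be dropped or rotated into another list.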

Checking and Result Management

• To monitor output from 1000 tests a day...
   – Create automated results checkers
   – Flag feature and performance issues      (end of simulation)
   – Flag C/RTL mismatch or assertion failure (within 10 cycles)

• But…
   – Limitations of a checker can impose limits on test stimulus
   – Test harness can become too narrowly focused

• Lessons learned
   – Need to create diverse drivers and checkers
   – Not every stimulus should pass every checker
   – Emphasis on flexibility, agility

Leverage (the Right) Reusable Pieces

 • Open source web packages
     –   Sort results by error type and severity
     –   Drill down: result, history, coverage, code version
     –   Small status pages for each regression
     –   Display useful pre-made queries/graphs via web site

 • Create web/DB mirrors for each model type
     – Models: C, Cosim, and RTL-only
     – Results and coverage per model type
     – Higher level management views for group status

 • Lessons learned…
     – Hand coded CGI can be laborious
     – Other teams could not directly re-use our work
     – Should not require users to know SQL
Lessons Learned

• Drive to SoC tape-out with high confidence
   –   Methodical closure of feature issues and exceptions
   –   Exceed 10K tests & 100K captured results per engineer
   –   Over 1M CPU hours of simulation (125 CPU years)
   –   Keep team fed with work and information
   –   Small team, Very high productivity

• Would definitely do it again, but…
   – Make it easy enough to be useful on smaller projects
        • 1-5 engineers, 5-10 sims per day
   –   Leverage recent Web 2.0 advances
   –   Canned database-backed platforms
   –   Wikis
   –   Social collaboration and tagging

Thank You

                                   Questions?
                chris.kappler@achillestest.com


