The Validation Attitude
Bob Colwell
April 2010
Attitude
I could talk about techniques, tools, FV environments, algorithms, machinery, languages, suites, training…
but I think attitude is more important than any of those.
No Perfect Designs
     Nothing is perfect, everything has bugs
           – Shortcomings, compromises, defects, design errata, gaffes, goofs,
             fumbles, errors, boneheaded mistakes, bobbles, bungles, boo-boos
           – But not all bugs are equal!
     Can’t test to saturation: schedule matters too
     Why is everything always so darned buggy?
           –   Software…need say no more…
           –   Why did Titanic not have waterproof compartments?
           –   Why did Ford Pinto have gas tank in back?
           –   Why did Challenger fly with leaky O-rings?
           –   Why did torpedoes not explode in WWII?



Entropy has a preferred direction
   Only genius could paint Mona Lisa,
      but any small child can destroy it quickly
   1000 ways to do things wrong, 1 or 2 that work
Prescription: SW visualization, tools to localize
bugs, diagnose problems, and instrument behavior
Accidents Are Inevitable
             – It's the nature of engineering
               to push designs to edge of
               failure (schedule, reliability,
               thermals, materials, tools,
               judgment of unknowns)
             – P(accident) = ε , for ε ≠ 0
             – World rewards this behavior
                    Cool new features + first to
                      market often preferred to
                      dependability
                    Other markets (life-support)
                      make (or should make) this
                      trade-off differently!

Isn’t that just Murphy’s Law?

             Close. But Murphy is not
                quite right.

                  1. #Near-misses >> #disasters
                  2. Competent design/test finds
                     simple errors
                  3. Complex sequences & unlikely
                     event cascades survive to prod’n
Failures Getting Worse
   Mechanical          things usually fail predictably due to physics
        – Wings bend, bridges groan, engines rattle, knees ache
        – By contrast, computer-based things fail “all over the place”


Helpful Engineering Attitude:
   1.     Nature does not want your
          engineered system to work; will
          actively work against you
   2.     Your design will do only what
          you’ve constrained it to do, only
          as long as it has to
   3.     Watch out for…
Normalization of deviance
   (Challenger O-rings, Apollo
   1 fire)

The Steely-Eyed Missile Validator
 Apollo 12
    2nd try to land on moon, launched 11/14/69
    36 seconds after liftoff, spacecraft struck by lightning => power surge
      –   All telemetry went haywire; book said to abort liftoff
      –   Both spacecraft pilot and mission controller were furiously considering that option
      –   But John Aaron was on shift, and thought he’d seen this malfunction before
 During testing 1 year earlier, Aaron observed test that went off into weeds
      –   Aaron took it on himself to investigate – this led him to obscure SCE subsystem
 In critical “abort or not” few seconds, with lives on line, Aaron made one of
  most famous calls in NASA history
      –   “Flight, try SCE to ‘Aux’”
      –   Neither Flight nor spacecraft pilot Conrad knew what that even meant, but Alan Bean tried it
      –   Telemetry came right back, vaulted Aaron into validation stardom
 He could have blown off earlier test, but he didn’t
 His inner validator wanted to know “what just happened?”
 (Overlaid across the slide: Isaac Asimov once said the 3 most important words in science are “What was THAT?”)
Complexity Implies Surprises
 …and surprises are bad
 Chaos effects in complex µ P’s
   – Decomposability is a fundamental tenet of
     complex system design
   – Butterfly wings ruin decomposability
   – “Improve design, get slower performance” not
      at all uncommon
We must stop designing large systems as though small ones simply scale up
   – lesson from comm engineers: assume errors
Thinking about validation
Ability to think in analogies is highest
 form of intelligence
  – IQ tests like “a:b :: c:d”
  – Hofstadter's book: numerical sequences
Analogies may illuminate a subject in
 a way that direct introspection cannot
  – They drive our minds to their creative limits



Listen to Your Inner Validator
0, 1, 2, …?


You    knew it wouldn’t be 3, didn’t you?
  – You sensed something’s not quite as it seems
Answer: 0, 1, 2, 720!, …
  = 0, 1, 2, 6!!
  = 0, 1!, 2!!, 3!!!, …
                          D. Hofstadter, Fluid Concepts and Creative Analogies
That was the voice of your inner
 validator that you were hearing
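
To see why the fourth term explodes, here is a minimal sketch in Python (not from the slides; the helper name iterated_factorial is my own) that computes the n-th term as n with the factorial applied n times:

    from math import factorial

    def iterated_factorial(n, depth):
        # Apply the factorial function `depth` times:
        # iterated_factorial(3, 3) = ((3!)!)! = (6!)! = 720!
        result = n
        for _ in range(depth):
            result = factorial(result)
        return result

    # n-th term of the puzzle sequence: n with the factorial applied n times
    terms = [iterated_factorial(n, n) for n in range(4)]
    print(terms[:3])           # [0, 1, 2] -- looks like it should continue with 3...
    print(len(str(terms[3])))  # ...but the fourth term, 3!!! = 720!, has over 1,700 digits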
Lesson: Trust Nothing
             Hyatt  Regency
               hotel, Kansas City,
               Missouri, 1981
             Catwalks on rods

             40’ threaded rods
              with nuts halfway
             Killed 114,
              injured 200
What Happened?
            Spec  was marginal
            40’ threaded rods
             “too hard”, changed
             to 2x20’ by contractor
            No simulation, no test



            Who goofed?
            Engineer, contractor,
            inspector…everyone
Therac-25
            Medical  particle
             accelerator
            Electrons,
              X-rays
            Six fatalities
              from poor system/SW design
               – And blind naïve faith in computers!
Question Everything
Test   assumptions as well as design
 – If assumptions are broken, design surely is too
 – Try to “catch the field goals”




Fight Urge to Relax Requirements
  Challenger
    – Not ok to slip design assumptions (launch temp,
      # of unburnt O-rings) to suit desires
  Airbus
    – Blaming pilot not reasonable explanation; pilot
      is part of system design
  Runway       “incursions” up 71% since ‘93
    – Near-misses are trying to tell us something

    Diane Vaughan, The Challenger Launch Decision, University of Chicago Press, 1996;
      Nancy Leveson, Safeware, Addison-Wesley, 1995
If You Didn’t Test It,
              It Doesn’t Work




Mir:   fire extinguishers bolted to wall
  – Still had strong metal launch straps
  – Had never been needed before, so never tested
  – Discovered with a roaring fire several feet away
Complexity Makes Everything Worse
   Some        things must be complicated to do their job
          – Our brains, for example
   But      complex sequences are root of most disasters
          – Challenger, Bhopal, Chernobyl, FDIV, Exxon Valdez
   Where       does complexity come from? Why does it
         keep increasing? Where are the limits?
          – Pentium 4
   “in  the small” vs “in the large” design (micros vs
    comm systems)
   What to do? Vigilance, testing, awareness…we are
    all validators
What To Do
   Get the spec right
   Design for correctness but…
   design knowing perfection is unattainable
   Users are part of the system
   Formal methods
   Pre-production testing and validation
   Post-production testing and verification
   Education of the public
Roles
Engineers must stand
 their ground
  – There are always doubts,
    incomplete data; don’t let
    ‘em use those against you
       Judgment is crucially
         needed -- YOURS
   – Remember the Challenger   (cartoon: mgt / HR / engineer)
      “My God, Thiokol, when do you want me to launch? Next April?”
  –Be careful with “data”
      “Risk assessment data is like a captured spy; if you torture it long enough, it will tell you
      anything you want to know…” (Wm. Ruckelshaus)
  –Crushing, conflicting demands are norm
       Design must push the envelope w/o ceding responsibility
       Validation establishes whether they've pushed it too far
       Management must beware overriding tech judgment
       Public must understand limits of human design process
  All players must value roles of others!
Roles cont.
   Management
      – wants to assume a product is safe
      – knows nothing’s ever perfect,
          comes a time to “shoot the engineers” or they’ll never
            stop tinkering

   Validators
      – want to prove a product is safe
      – assume it is not by default
      – only informed arbiters of when product is ready

 Don’t fall for “might as well sign, we’re…”
Future Directions:
           Public Expectations
Andy Grove’s FDIV epiphany
      Paradoxically, the more high tech, the more public expects of product
      Users caused Chernobyl, TMI by going “off book”, but prevented many
      other disasters with real-time creativity…lessons are subtle


Takes exquisite understanding & judgment to tell accidents born of
  reasonable risk-taking from bonehead errors or incompetence
      This is what a jury must do.
      How?



Can’t keep trending this way
Future of Validation
Multiple Culture Changes Needed
 Public needs to stop expecting perfection
 Design teams must explicitly limit complexity
     and avoid auto-scale-up assumptions
 Companies must mature past point of viewing
 validation as an unpleasant overhead
      does your company have “Validation Fellows?”


Validation is a profession of its own.
 Cultivate the Validation Attitude!
The End




