Integrating Model Checking and Procedural Languages
David Owen, July 19, 2004
Overview
  • Background: verification and search tools, criteria for when to use which tool, combining different strategies.
  • Experiments: flight guidance system, leader election protocol, dining philosophers, resource arbiter.
  • Implementation: Lurch, our random simulation tool for finite-state models.
  • Lean: Lurch + machine learning.
  • Lean experiment: chemical factory optimization.
A Continuum of Testing and Verification Tools
  • A range of tools exists, from traditional software testing to automated verification.
  • In between are simulation tools that approximate full verification but work on more complex models, and sophisticated testing tools capable of detecting more complex errors.
  • [Diagram: a continuum between Traditional Software Testing and Automated Verification. Labels: Real Languages, Complex Models, Simple Errors, More Sophisticated Testing Tools, Tools to Approximate Full Verification, Simple Models, Complex Errors, Model Checking.]
Changing Expectations of a Software Analyst
  • Cobleigh et al. describe three modes of analysis:
      • Exploratory mode: quick feedback needed to learn how the system works and refine properties.
      • Fault-finding mode: short and clear error traces needed for debugging.
      • Maintenance mode: completeness and scalability needed to verify the overall system.
  • Different tools have different strengths:
      • Simulation tools are good for exploratory mode.
      • Symbolic model checking is good for short error traces.
      • Explicit-state model checking is good for speed and scalability.
Combining Complementary Strategies
  • Different tools have different strengths and weaknesses.
  • Cobleigh et al. suggest “The Right Algorithm at the Right Time” (ICSE 2001).
  • We’ve had some success with a different approach: combining complementary strategies, regardless of the analyst’s mode.
  • Start with a quick, incomplete tool; if it finds no errors after a few seconds, use a model checker (complete verification).
  • [Flowchart: Quick, Incomplete Search; if errors found, done; if no errors found, Model Checker.]
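The two-stage strategy on this slide can be sketched in a few lines. This is a minimal illustration, not the tooling used in the experiments: `quick_search` and `model_check` are hypothetical callables standing in for a time-limited random simulation and an exhaustive model checker, and the 5-second default is taken from the cutoff used in the experiments that follow.

```python
from typing import Callable, Optional, Tuple

# Hypothetical stand-ins: these are not the real interfaces of Lurch, SPIN or NuSMV.
QuickSearch = Callable[[float], Optional[str]]  # time budget (s) -> error trace, or None
ModelCheck = Callable[[], Optional[str]]        # exhaustive check -> error trace, or None


def combined_check(quick_search: QuickSearch,
                   model_check: ModelCheck,
                   budget_seconds: float = 5.0) -> Tuple[str, Optional[str]]:
    """Try the quick, incomplete search first; fall back to the complete checker."""
    trace = quick_search(budget_seconds)
    if trace is not None:
        # An error found by the incomplete tool is still a real error: stop here.
        return ("quick search", trace)
    # No error found proves nothing, so the complete tool has the final word.
    return ("model checker", model_check())
```

The result stays complete because a "no errors" verdict is only ever issued by the model checker; the quick search can only shorten the time to a counterexample.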
Random Simulation of Concurrent System Models
  • Randomized algorithms are known to be simple, fast and effective in many domains.
  • West used random simulation to detect errors in concurrent system models. This approach was surprisingly successful.
  • Its success was attributed to the fact that most errors detected are much less complex than the overall system.
  • We have implemented a similar random simulation in a tool called Lurch, adding early stopping heuristics and allowing C code to be included in the model.
Flight Guidance System Experiment
  • Work with Mats Heimdahl and Jimin Gao (University of Minnesota).
  • Ran Lurch and NuSMV on a model representing mode logic from a Rockwell-Collins flight guidance system.
  • Seeded faults based on the developers’ revision history.
  • Used NuSMV to (exhaustively) determine which properties were violated by the faulty specifications.
  • Tried to find the violations with Lurch (random simulation of the model).
  • Put the Lurch and NuSMV results together to evaluate the combined strategy.
Flight Guidance System Experiment (2)
Time in seconds to verify or find an error, given as average / median / max. Columns group the cases by Lurch's result relative to the 5-second cutoff; "Lurch ?" marks property violations not detected by Lurch.

| Tool | Lurch < 5 | Lurch > 5 | Lurch ? | Overall |
|---|---|---|---|---|
| Lurch | 1.49 / 1.03 / 4.43 | 553 / 40.1 / 5,400 | - | - |
| NuSMV | 4,380 / 3,290 / 17,500 | 12,200 / 3,890 / 141,000 | 14,000 / 14,000 / 27,600 | 8,200 / 3,540 / 141,000 |
| Combined | 1.49 / 1.03 / 4.43 | 12,200 / 3,890 / 141,000 | 14,000 / 14,000 / 27,600 | 5,910 / 3.92 / 141,000 |

  • Combined = Lurch for 5 sec., then NuSMV if no property violations found by Lurch.
  • The combined strategy improves the overall average by over ½ hour (5,910 s versus 8,200 s for NuSMV alone, a difference of about 38 minutes).
Leader Election Protocol Experiment
  • Protocol published as an example for SPIN (Holzmann 1997 TSE article).
  • N processes communicating via message queues interact to choose one leader process.
  • Checked for the liveness property always(eventually(one “leader” chosen)).
  • Ran the Lurch + SPIN combination strategy on the original and two fault-seeded versions of the model.
  • Seeded faults: where a process is sending out a message, the wrong message type was used.
  • Two different fault-seeded versions were created: one that turned out easy, another that turned out harder.
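For readers used to temporal-logic notation, the property above has the standard "always eventually" shape. In the sketch below, the proposition name p is our own shorthand for "one leader has been chosen"; it does not appear in the slides.

```latex
\Box \Diamond\, p
\qquad\text{(also written } \mathbf{G}\,\mathbf{F}\, p \text{, or \texttt{[] <> p} in SPIN's LTL syntax)}
```

A violation is an infinite run on which p is false from some point onward; in a finite-state model that means a path ending in a cycle, which is why a single global state is not enough to witness it.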
Leader Election Protocol Experiment (2)
Time in seconds to verify or find an error, given as average / median / max.

| Tool | Correct | Fault 1 | Fault 2 | Overall |
|---|---|---|---|---|
| Lurch | - | 0.137 / 0.128 / 0.173 | 1.60 / 0.183 / 7.19 | - |
| SPIN | 49.2 / 4.67 / 244 | 0.059 / 0.055 / 0.08 | 31.2 / 3.21 / 190 | 23.4 / 0.125 / 244 |
| Combined | 54.2 / 9.67 / 249 | 0.137 / 0.128 / 0.173 | 20.4 / 0.183 / 195 | 20.4 / 0.173 / 249 |

  • Combined = Lurch for 5 sec., then SPIN if no property violations found by Lurch.
  • Although SPIN alone is better on the correct and first fault-seeded versions, the average for the combined strategy is still better overall.
Leader Election Protocol Experiment (3)
  • This plot shows the time required for Lurch and SPIN running on a model with both of the seeded faults described previously.
  • Instances with an odd number of processes are much more difficult for SPIN, but not for Lurch.
  • This demonstrates a well-known benefit of some randomized algorithms: less sensitivity to (apparently) minor changes in the input.
Dining Philosophers Experiment
  • Two different versions of the problem:
      • Normal: n philosophers seated around a table; each repeatedly tries to acquire left and right forks, eat, and then set down the forks.
      • No loop: same as the normal version, except philosophers only try to eat once.
  • Both versions of the problem contain two deadlocks at depth n.
  • We ran Lurch, SPIN and NuSMV until the shortest path to a deadlock was found.
  • The normal version was harder for NuSMV and Lurch; the no-loop version was harder for SPIN.
Dining Philosophers Experiment (2)
Time in seconds to find the shortest path to a deadlock, given as average / median / max.

| Tool | Normal | No Loop | Overall |
|---|---|---|---|
| Lurch | 1.33 / 0.223 / 6.83 | 0.281 / 0.063 / 1.19 | 0.806 / 0.135 / 6.83 |
| SPIN | 4.99 / 0.47 / 29.9 | 34 / 0.741 / 236 | 19.5 / 0.49 / 236 |
| Combined (SPIN) | 4.83 / 0.223 / 34.9 | 0.281 / 0.063 / 1.19 | 2.56 / 0.135 / 34.9 |
| NuSMV | 87.5 / 5.15 / 550 | 4.99 / 2.12 / 19.4 | 46.3 / 3.07 / 550 |
| Combined (NuSMV) | 69.8 / 0.223 / 555 | 0.281 / 0.063 / 1.19 | 35 / 0.135 / 555 |

  • Combined = Lurch for 5 sec., then SPIN or NuSMV if no property violations found by Lurch.
  • In both cases, the combined strategy (Lurch + SPIN or Lurch + NuSMV) saves time.
Lurch Input Models: C Code + Finite-State Machines
  • Lurch transitions may refer to arbitrary C code. For example, we could use a C variable for the turn variable in our producer-consumer model:

    enum {P,C} turn = P;
    %%
    pr_wait;  (turn==P);  -;  produce;
    produce;  -;  {turn=C;};  pr_wait;

    cs_wait;  (turn==C);  -;  consume;
    consume;  -;  {turn=P;};  cs_wait;

  • Parentheses and curly brackets within transitions mark references to C expressions and statements.
  • %% separates the C code from the finite-state machines.
  • Each finite-state machine is a list of transitions.
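To make the transition format concrete, here is a small sketch that splits one machine's transition list into records. It is not Lurch's parser. It assumes, from the producer-consumer example alone, that each transition has four semicolon-terminated fields (current state; input; output; next state), and that semicolons inside curly brackets belong to the embedded C statement. The names `Transition`, `split_fields` and `parse_machine` are ours.

```python
from typing import List, NamedTuple


class Transition(NamedTuple):
    src: str  # current state
    inp: str  # "-" or a parenthesized C expression
    out: str  # "-" or a C statement block in curly brackets
    dst: str  # next state


def split_fields(text: str) -> List[str]:
    """Split on ';' except inside {...}, so embedded C statements stay whole."""
    fields, depth, current = [], 0, ""
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == ";" and depth == 0:
            fields.append(current.strip())
            current = ""
        else:
            current += ch
    return fields


def parse_machine(text: str) -> List[Transition]:
    """Group the fields of one finite-state machine into four-field transitions."""
    fields = split_fields(text)
    if len(fields) % 4 != 0:
        raise ValueError("expected four fields per transition")
    return [Transition(*fields[i:i + 4]) for i in range(0, len(fields), 4)]


producer = parse_machine(
    "pr_wait; (turn==P); -; produce; produce; -; {turn=C;}; pr_wait;"
)
```

Here `producer[1].out` keeps the whole C statement `{turn=C;}` even though it contains a semicolon.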
RA-RRE Model
  • Work with John Powell (NASA JPL).
  • Resource arbitration (RA) system on board a robotic remote exploration (RRE) vehicle.
  • User processes make requests for RRE resources through a message queue.
  • User processes run concurrently with an arbiter process, which responds to requests in the queue.
  • The arbiter will Grant, Deny, Pend, Rescind, or Deny and Rescind a resource request.
  • The arbiter filters out nonsense messages and ignores them.
RA-RRE Model (2)
  • Large Stateflow® model: C code embedded inside states to represent complex internal system behaviors.
  • JPL’s HiVy translator was used to generate Promela (SPIN’s input language) with embedded C code.
  • The model was also translated from Stateflow® to Lurch, with C code references in transitions.
  • While it can be very difficult to correctly use Promela’s C code embedding features, Powell reports that it was not difficult to use C code in Lurch models, even after just 15 hours of informal training.
  • Lurch results matched SPIN’s, finding deadlocks in six different versions of the model.
  • The different versions were created by running the HiVy translator with or without various optimizations, and by running models with minor fixes put into the code.
RA-RRE Model (3)

| | Lurch | SPIN |
|---|---|---|
| Finding errors: deadlock | Found deadlock. | Found deadlock. |
| Finding errors: property violation | Found multiple variations on deadlock over properties. | Model too large to verify properties. |
| Embedded C code | Easily accomplished with minimal training. | Steep learning curve. |
| Diagnosis of error causes | Easily instrumented to provide visibility into embedded C code errors. This led to discovery of an error relating to fundamental system specification conflicts. | Masked errors in embedded C code as syntactic / semantic problems embedding C into Promela. |

  • Powell’s conclusion: compared to SPIN, Lurch is easy to use for models with embedded C code; Lurch found the same errors consistently.
Lurch Implementation
  • Lurch’s partial, random search procedure:
      • Partial: there is no guarantee that all behavior will be explored.
      • Random: the choice of which behavior to explore is nondeterministic.
  • The basic search procedure, with step and check repeated at each time tick:

    search(iterations, depth)
      for (i in iterations)
        for (m in machines) state[m] = 0
        for (d in depth)
          for (tr in transitions)
            if (check_inputs(tr)) random_push(Q, tr)
          step(Q, state)
          check(state)

    step(Q, state)
      while (Q not empty)
        tr := pop(Q)
        exec_outputs(tr, state)
        for (tr' in same machine as tr) del(Q, tr')

    check(state)
      fault_check(state)
      deadlock_check(state)
      cycle_check(state)

  • Each iteration explores one global state path through the behavior of the system.
  • A path is divided into “time ticks.” At each time tick a state vector (with a value for each machine) is updated.
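The pseudocode on this slide can be turned into a small runnable sketch. This is an illustration of the procedure as described, not Lurch's source: transitions are reduced to (machine, source state, destination state) triples with no C guards or outputs, shuffling the queue stands in for random_push, and only a fault check is performed. The names `Transition`, `step` and `search` mirror the slide, while `is_fault` and the toy model are our own.

```python
import random
from typing import Callable, List, NamedTuple, Optional, Tuple


class Transition(NamedTuple):
    machine: int  # index into the state vector
    src: int      # local state in which the transition is enabled
    dst: int      # local state after the transition fires


def step(queue: List[Transition], state: List[int]) -> None:
    """Pop transitions and execute them, at most one per machine."""
    while queue:
        tr = queue.pop()
        state[tr.machine] = tr.dst  # exec_outputs
        # del(Q, tr'): discard the other queued transitions of the same machine
        queue[:] = [t for t in queue if t.machine != tr.machine]


def search(transitions: List[Transition], n_machines: int,
           is_fault: Callable[[List[int]], bool],
           iterations: int, depth: int,
           rng: random.Random) -> Optional[List[Tuple[int, ...]]]:
    """Explore one random global state path per iteration; return a path to a fault."""
    for _ in range(iterations):
        state = [0] * n_machines
        path = [tuple(state)]
        for _ in range(depth):
            # check_inputs: a transition is enabled when its machine is in src
            queue = [tr for tr in transitions if state[tr.machine] == tr.src]
            if not queue:
                break  # end of this global state path
            rng.shuffle(queue)  # plays the role of random_push
            step(queue, state)
            path.append(tuple(state))
            if is_fault(state):
                return path
    return None  # partial search: no fault found is not a proof of absence


# Toy model: machine 0 moves from 0 to either 1 or 2; machine 1 moves from 0 to 1.
toy = [Transition(0, 0, 1), Transition(0, 0, 2), Transition(1, 0, 1)]
trace = search(toy, 2, lambda s: s == [2, 1], iterations=50, depth=4,
               rng=random.Random(0))
```

Because each iteration restarts from the initial state and makes fresh random choices, repeated iterations cover different paths without storing visited states, which keeps memory use low at the cost of completeness.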
Lurch Implementation (2)
  • The step function is called at each time tick along a global state path.
  • Its input is a queue of transitions whose inputs are satisfied, along with the state vector.
  • Transitions are popped from the queue, and their outputs are executed.
  • The effect of the transitions executed is stored in the state vector.
  • Only one transition from each machine can be executed at each time step; the others are discarded from the queue.
Lurch Implementation (3)
  • With the step function as-is (as described in the previous slide), Lurch simulates synchronous execution of finite-state machines: at each time step, every machine is given a chance to move forward.
  • If the step function is modified so that only one transition (one out of all the machines) is executed at each time step, Lurch simulates asynchronous execution of the system: all interleavings of machine behaviors are considered.
  • [Diagram, synchronous: state = < 0, 0, 0 >, then state = < 1, 1, 1 >.]
  • [Diagram, asynchronous: state = < 0, 0, 0 >, then < 1, 0, 0 >, then < 1, 1, 0 >, then < 1, 1, 1 >.]
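The difference between the two execution modes comes down to how many queued transitions a tick may fire. The sketch below shows both variants on three machines that each move from state 0 to state 1, which reproduces the state vectors in the diagram. It is a simplified illustration with hypothetical names (`tick_synchronous`, `tick_asynchronous`) and no C guards, not Lurch's implementation.

```python
import random
from typing import List, NamedTuple


class Transition(NamedTuple):
    machine: int
    src: int
    dst: int


def enabled(transitions: List[Transition], state: List[int]) -> List[Transition]:
    """Transitions whose machine is currently in their source state."""
    return [tr for tr in transitions if state[tr.machine] == tr.src]


def tick_synchronous(transitions: List[Transition], state: List[int],
                     rng: random.Random) -> None:
    """Every machine with an enabled transition fires exactly one of them."""
    queue = enabled(transitions, state)
    rng.shuffle(queue)
    fired = set()
    for tr in queue:
        if tr.machine not in fired:
            state[tr.machine] = tr.dst
            fired.add(tr.machine)


def tick_asynchronous(transitions: List[Transition], state: List[int],
                      rng: random.Random) -> None:
    """Exactly one enabled transition, from any machine, fires."""
    queue = enabled(transitions, state)
    if queue:
        tr = rng.choice(queue)
        state[tr.machine] = tr.dst


machines = [Transition(m, 0, 1) for m in range(3)]
rng = random.Random(0)

sync_state = [0, 0, 0]
tick_synchronous(machines, sync_state, rng)       # one tick: all three machines move

async_state = [0, 0, 0]
tick_asynchronous(machines, async_state, rng)     # one tick: a single machine moves
after_one_tick = list(async_state)
tick_asynchronous(machines, async_state, rng)
tick_asynchronous(machines, async_state, rng)     # three ticks to reach the same state
```

In the asynchronous variant the order in which machines move is chosen at random, so different iterations of the search sample different interleavings.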
Lurch Implementation (4)
  • At each time tick along a path Lurch checks for local-state faults, deadlocks and cycles.
  • Local-state faults can be found directly from the state vector: if one of the machines is in a state corresponding to a fault, Lurch reports that the fault was reached.
  • A deadlock occurs when Lurch reaches the end of a global state path (a state for which no new transition’s inputs are satisfied) but not all machines are in a state identified as a legal end state.
  • Deadlocks are found by looping through the state vector to make sure all local states are legal end states (this is done only when Lurch is at the end of a global state path).
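The fault and deadlock checks described on this slide are simple scans of the state vector. The following sketch shows one way to write them, under simplifying assumptions: transitions are (machine, source state, destination state) triples without C guards, and the fault states and legal end states are supplied as per-machine sets. The function names follow the slide's pseudocode, but the data layout is our own.

```python
from typing import Dict, List, Sequence, Set, Tuple

Transition = Tuple[int, int, int]  # (machine, source state, destination state)


def fault_check(state: Sequence[int], fault_states: Dict[int, Set[int]]) -> List[int]:
    """Return the machines whose current local state is marked as a fault."""
    return [m for m, local in enumerate(state)
            if local in fault_states.get(m, set())]


def at_end_of_path(state: Sequence[int], transitions: List[Transition]) -> bool:
    """True when no transition's inputs are satisfied in this global state."""
    return not any(state[m] == src for m, src, _dst in transitions)


def deadlock_check(state: Sequence[int], transitions: List[Transition],
                   legal_end_states: Dict[int, Set[int]]) -> bool:
    """Deadlock: the path has ended but some machine is not in a legal end state."""
    if not at_end_of_path(state, transitions):
        return False  # only checked at the end of a global state path
    return any(local not in legal_end_states.get(m, set())
               for m, local in enumerate(state))
```

For two machines that each move from 0 to 1, the state [1, 1] is a normal termination when state 1 is a legal end state for both, and a deadlock when it is not legal for one of them.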
Other Applications for Lurch’s Random Simulation
  • Game playing experiments: n-queens, tic-tac-toe.
  • Lurch is really a fast generator of consistent temporal sequences, so what else can we use it for?
  • If we generate a score for each temporal sequence, we can use a machine learner to suggest what makes some sequences better than others.
  • Lurch + Machine Learning = “Lean,” a randomized heuristic search tool for finite-state models (with optional C code).
Lean: Combining “Test” and “Task”
  • Traditional view: specialized devices for different tasks (diagnosis, configuration, testing...).
  • Alternative: instead of different devices for “test” and “task,” one environment where both are implemented together:
      • Write down what is known about a domain.
      • Add an oracle to score a single run (i.e., score the temporal sequences generated by Lurch).
  • “Lean” = Lurch + learn:
      • Run Lurch on a sample space of options.
      • Learn: apply machine learning to find “nudges,” which are suggestions for which transitions lead to runs with higher scores.
      • Apply the “nudges” in the form of transition probabilities, and run Lurch again, expecting better scores.
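The run, learn, nudge loop can be sketched as follows. This is only an illustration of the idea: the slides do not say which machine learner Lean uses, so `learn_nudges` below substitutes a deliberately simple stand-in (counting which options occur in the best-scoring runs), and a "run" is reduced to one option picked at each named choice point. All names and the toy oracle are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Tuple

Run = Dict[str, int]              # choice point name -> index of the option taken
Weights = Dict[str, List[float]]  # choice point name -> one weight per option


def simulate(weights: Weights, rng: random.Random) -> Run:
    """One random run: pick an option at every choice point, biased by the weights."""
    return {name: rng.choices(range(len(w)), weights=w)[0]
            for name, w in weights.items()}


def learn_nudges(scored: List[Tuple[float, Run]], weights: Weights,
                 top_fraction: float = 0.2) -> Weights:
    """Stand-in learner: favor the options that occur in the best-scoring runs."""
    best = sorted(scored, key=lambda pair: pair[0], reverse=True)
    best = best[:max(1, int(len(best) * top_fraction))]
    nudged = {}
    for name, w in weights.items():
        counts = Counter(run[name] for _score, run in best)
        # Adding 1 keeps every option possible, so the search stays randomized.
        nudged[name] = [counts[i] + 1.0 for i in range(len(w))]
    return nudged


def lean(weights: Weights, oracle: Callable[[Run], float], rounds: int,
         runs_per_round: int, rng: random.Random) -> Weights:
    """Repeat <simulate, learn>: score random runs, then nudge the probabilities."""
    for _ in range(rounds):
        runs = [simulate(weights, rng) for _ in range(runs_per_round)]
        scored = [(oracle(run), run) for run in runs]
        weights = learn_nudges(scored, weights)
    return weights


# Toy oracle: option 1 is better than option 0 at each of three choice points.
start = {name: [1.0, 1.0] for name in ("x", "y", "z")}
final = lean(start, lambda run: float(sum(run.values())), rounds=3,
             runs_per_round=300, rng=random.Random(0))
```

After a few rounds the weights in `final` lean toward option 1 at every choice point, so later runs are expected to score higher, which is the effect the slide describes.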
Chemical Factory (Lean)
  • Work with Tom Burkleau, Portland State University.
  • Finite-state machine model of a commercial vodka distillery plant.
  • Multiple machines representing the space of options, the model of the production facility, and the relation between production parts.
  • [Figures: Nominal Model (composite); Faulty Model (composite).]
Optimizing Nominal Model
  • [Figure: the nominal model after 7 scored runs of Lurch, plus machine learning to find “nudges.”]
Optimizing Faulty Model
  • 26 repeats of <LURCH,learn>.
  • Change learning classes:
      • Class1: fixed
      • Class2: movable
  • Learn selectors for class2.
  • Negate them (removes the bug).
  • 1 more repeat of <LURCH,learn>.
  • [Figure annotations: “Fixed, refuses to budge”; “Gone!”]
  • Question: is this simulation or optimization or parameter tuning or fault localization or diagnosis or configuration?
  • Answer: all of the above.
Conclusion
  • The combination of random simulation (Lurch) and model checking (SPIN or NuSMV) can be faster and more efficient than model checking alone, without sacrificing completeness.
      • FGS (Heimdahl, Gao at UMN), leader election protocol, dining philosophers experiments.
  • Lurch allows (easy-to-use) references to arbitrary C code.
      • RA-RRE model experiments (Powell at JPL).
  • Lurch uses a simple random search procedure, plus early stopping heuristics and modifications for asynchronous models, hierarchical models, etc.
  • Lean = Lurch + machine learning.
      • Chemical factory optimization experiment (Burkleau at PSU).

Integrating Model Checking and Procedural Languages

  • 1. Integrating Model Checking and Procedural Languages David Owen July 19, 2004
  • 2. Overview Background: verification / search tools, criteria for when to use which tool, combining different strategies. Experiments: flight guidance system, leader election protocol, dining philosophers, resource arbiter. Implementation: Lurch, our random simulation tool for finite-state models. Lean: Lurch + machine learning. Lean experiment: Chemical factory optimization.
  • 3. A Continuum of Testing and Verification Tools A range of tools exists, from traditional software testing to automated verification. Simulation tools that approximate full verification but work on more complex models. Sophisticated testing tools capable of detecting more complex errors. Automated Verification Traditional Software Testing Complex Models Simple Errors Simple Models Complex Errors Tools to Approximate Full Verification More Sophisticated Testing Tools Real Languages Model Checking
  • 4. Changing Expectations of a Software Analyst Cobleigh et.al. idea—three modes of analysis. Exploratory mode: quick feedback needed to learn how the system works and refine properties. Fault-finding mode: short and clear error traces needed for debugging. Maintenance mode: completeness, scalability needed to verify overall system. Different tools have different strengths. Simulation tools good for exploratory mode. Symbolic model checking good for short error traces. Explicit-state model checking good for speed and scalability.
  • 5. Combining Complimentary Strategies Different tools have different strengths and weaknesses. Cobleigh et.al. suggest “The Right Algorithm at the Right Time” (ICSE 2001). We’ve had some success with a different approach, combining complimentary strategies (regardless of analyst’s mode). Start with a quick, incomplete tool; if no errors found after a few seconds use a model checker (complete verification). Quick, Incomplete Search Model Checker No Errors Found Done Errors Found
  • 6. Random Simulation of Concurrent System Models Randomized algorithms known to be simple, fast and effective in many domains. West used random simulation to detect errors in concurrent system models. This approach was surprisingly successful. Success was attributed to the fact that most errors detected are much less complex than the overall system. We have implemented a similar random simulation in a tool called Lurch. Added early stopping heuristics. C code can be included in the model.
  • 7. Flight Guidance System Experiment Work with Mats Heimdahl and Jimin Gao (University of Minnesota). Ran Lurch, NuSMV on model representing mode logic from a Rockwell-Collins flight guidance system. Seeded faults based on developers’ revision history. Used NuSMV to (exhaustively) determine what properties were violated by faulty specifications. Tried to find the violations with Lurch (random simulation of the model). Put Lurch and NuSMV results together to evaluate combined strategy.
  • 8. Flight Guidance System Experiment (2) 5,910 3.92 141,000 14,000 14,000 27,600 12,200 3,890 141,000 1.49 1.03 4.43 Combined average median max 8,200 3,540 141,000 14,000 14,000 27,600 12,200 3,890 141,000 4,380 3,290 17,500 NuSMV average median max 553 40.1 5,400 1.49 1.03 4.43 Lurch average median max Overall Lurch ? Lurch > 5 Lurch < 5 Property violations not detected by Lurch Combined strategy improves average by over ½ hour. Time (seconds) to verify or find error plotted; combined = Lurch for 5 sec., then SPIN if no property violations found by Lurch.
  • 9. Leader Election Protocol Experiment Protocol published as an example for SPIN (Holzmann 1997 TSE article). N processes communicating via message queues interact to choose one leader process. Checked for liveness property always(eventually(one “leader” chosen)) . Ran Lurch + SPIN combination strategy on original and two fault-seeded versions of the model. Seeded faults: where a process is sending out a message, the wrong message type was used. Two different fault-seeded versions created: one that turned out easy, another that turned out harder.
  • 10. Leader Election Protocol Experiment (2) 20.4 0.173 249 20.4 0.183 195 0.137 0.128 0.173 54.2 9.67 249 Combined average median max 23.4 0.125 244 31.2 3.21 190 0.059 0.055 0.08 49.2 4.67 244 SPIN average median max 1.60 0.183 7.19 0.137 0.128 0.173 Lurch average median max Overall Fault 2 Fault 1 Correct Although SPIN alone is better on the correct and first fault-seeded versions, average for combined strategy is still better overall. Time (seconds) to verify or find error plotted; combined = Lurch for 5 sec., then SPIN if no property violations found by Lurch.
  • 11. Leader Election Protocol Experiment (3) This plot shows the time required for Lurch and SPIN running on a model with both of the seeded faults described previously. Instances with an odd number of processes are much more difficult for SPIN, but not for Lurch. This demonstrates a well-known benefit of some randomized algorithms: less sensitivity to (apparently) minor changes in the input.
  • 12. Dining Philosophers Experiment Two different versions of the problem: Normal: n philosophers seated around a table; each repeatedly tries to acquire left and right forks, eat, and then set down the forks. No loop: same as normal version, except philosophers only try to eat once. Both versions of the problem contain two deadlocks at depth n . We ran Lurch, SPIN and NuSMV, until the shortest path to a deadlock was found. The normal version was harder for NuSMV and Lurch; the no-loop version was harder for SPIN.
  • 13. Dining Philosophers Experiment (2) 35 0.135 555 0.281 0.063 1.19 69.8 0.223 555 Combined (NuSMV) average median max 46.3 3.07 550 4.99 2.12 19.4 87.5 5.15 550 NuSMV average median max 2.56 0.135 34.9 0.281 0.063 1.19 4.83 0.223 34.9 Combined (SPIN) average median max 19.5 0.49 236 34 0.741 236 4.99 0.47 29.9 SPIN average median max 0.806 0.135 6.83 0.281 0.063 1.19 1.33 0.223 6.83 Lurch average median max Overall No Loop Normal In both cases, the combined strategy (Lurch + SPIN or Lurch + NuSMV) saves time. Time (seconds) to find shortest path plotted; combined = Lurch for 5 sec., then SPIN if no property violations found by Lurch.
  • 14. Lurch Input Models: C Code + Finite-State Machines Lurch transitions may refer to arbitrary C code. For example, we could use a C variable for the turn variable in our producer-consumer model: enum {P,C} turn = P; %% pr_wait; (turn==P); -; produce; produce; -; {turn=C;}; pr_wait; cs_wait; (turn==C); -; consume; consume; -; {turn=P;}; cs_wait; Parenthesis and brackets within transitions mark references to C expressions and statements. %% separates C and finite-state machines. Each finite-state machine is a list of transitions.
  • 15. RA-RRE Model Work with John Powell (NASA JPL). Resource arbitration (RA) system on board a robotic remote exploration (RRE) vehicle User processes make requests for RRE resources through a message queue. User processes run concurrently with an arbiter process, which responds to requests in the queue. Arbiter will Grant, Deny, Pend, Rescind or Deny and Rescind a resource request. Abiter filters out nonsense messages and ignores them.
  • 16. RA-RRE Model (2) Large Stateflow® model: C code embedded inside states to represent complex internal system behaviors. JPL’s HiVy translator used to generate Promela (SPIN’s input language) with embedded C code. Translated from Stateflow® to Lurch with C code references in transitions. While it can be very difficult to correctly use Promela’s C code embedding features, Powell reports that it was not difficult to use C code in Lurch models, even after just 15 hours of informal training. Lurch results matched SPIN’s, finding deadlocks in six different versions of the model. Different versions created by running HiVy translator with or without various optimizations, and running models with minor fixes put into the code.
  • 17. RA-RRE Model (3) Easily instrumented to provide visibility into embedded C code errors. This led to discovery of error relating to fundamental system specification conflicts. Masked errors in embedded C code as syntactic / semantic problems embedding C into Promela. Diagnosis of Error Causes Easily accomplished with minimal training. Steep learning curve. Embedded C Code Found multiple variations on deadlock over properties. Model too large to verify properties. Finding Errors—Property Violation Found Deadlock Found Deadlock Finding Errors—Deadlock Lurch SPIN Powell’s conclusion: compared to SPIN, Lurch easy to use for models with embedded C code; Lurch found same errors consistently.
  • 18. Lurch Implementation Lurch’s partial, random search procedure: Partial : there is no guarantee that all behavior will be explored. Random : the choice of which behavior to explore is nondeterministic. step(Q, state) while (Q not empty) tr := pop(Q) exec_outputs(tr, state) for (tr' in same machine as tr) del(Q, tr') check(state) fault_check(state) deadlock_check(state) cycle_check(state) search(iterations, depth) for (i in iterations) for (m in machines) state[m] = 0 for (d in depth) for (tr in transitions) if (check_inputs(tr)) random_push (Q, tr) step(Q, state) check(state) The basic search procedure repeated each time tick. Each iteration explores one global state path through the behavior of the system. A path is divided into “time ticks.” At each time tick a state vector (with a value for each machine) is updated.
  • 19. Lurch Implementation (2) The step function is called at each time tick along a global state path. Input is a queue of transitions whose inputs are satisfied, along with the state vector. Transitions are popped from the queue, and their outputs are executed. The effect of transitions executed is stored in the state vector. Only one transition from each machine can be executed at each time step; others are discarded from the queue.
  • 20. With the step function as-is (as described in the previous slide), Lurch simulates synchronous execution of finite-state machines: at each time step, every machine is given a chance to move forward. If the step function is modified so that only one transition (one out of all the machines) is executed at each time step, Lurch simulates asynchronous execution of the system: all interleavings of machine behaviors are considered. Lurch Implementation (3) asynchronous synchronous state = < 1, 1, 1 > state = < 0, 0, 0 > state = < 1, 1, 1 > state = < 1, 1, 0 > state = < 1, 0, 0 > state = < 0, 0, 0 >
  • 21. Lurch Implementation (4) At each time tick along a path Lurch checks for local-state faults, deadlocks and cycles. Local state faults can be found directly from the state vector—if one of the machines is in a state corresponding to a fault, Lurch reports that the fault was reached. A deadlock occurs when Lurch reaches the end of a global state path (a state for which no new transition’s inputs are satisfied) but not all machines are in a state identified as a legal end state. Deadlocks are found by looping through the state vector to make sure all local states are legal end states (this is done only when Lurch is at the end of a global state path).
  • 22. Other Applications for Lurch’s Random Simulation Game playing experiments: n -queens, tic-tac-toe Lurch is really a fast generator of consistent temporal sequences—so what else can we use it for? If we generate a score for each temporal sequence, we can use a machine learner to suggest what makes some sequences better than others. Lurch + Machine Learning = “Lean,” a randomized heuristic search tool for finite-state models (with optional C code).
  • 23. Lean: Combining “Test” and “Task” Traditional view: specialized devices for different tasks. Diagnosis, configuration, testing... Alternative: one environment where “test” and “task” are implemented together: Write down what is known about a domain. Add an oracle to score a single run (i.e., score the temporal sequences generated by Lurch). Instead of different devices for “test” and “task” “ Lean” = Lurch + learn Run Lurch on sample space of options. Learn—apply machine learning to find “nudges,” which are suggestions for which transitions lead to runs with higher scores. Apply “nudges” in the form of transition probabilities, and run Lurch again, expecting better scores.
  • 24. Chemical Factory (Lean) Work with Tom Burkleau, Portland State University. Finite-state machine model of commercial vodka distillery plant. Multiple machines representing the space of options, the model of the production facility, and the relation between production parts. Nominal Model (composite) Faulty Model (composite)
  • 25. Optimizing Nominal Model After 7 scored runs of Lurch, plus machine learning to find “nudges”:
  • 26. 26 repeats of <LURCH,learn> Change learning classes: Class1: fixed Class2: movable Learn selectors for class2 Negate them (removes the bug) 1 more repeat of <LURCH,learn> Question: is this simulation or optimization or parameter tuning or fault localization or diagnosis or configuration? Answer: all of the above Optimizing Faulty Model Gone! Fixed, refuses to budge
  • 27. Conclusion Combination of random simulation (Lurch) and model checking (SPIN or NuSMV) can be faster and more efficient than model checking alone, without sacrificing completeness. FGS (Heimdahl, Gao at UMN), leader election protocol, dining philosophers experiments. Lurch allows (easy-to-use) references to arbitrary C code. RA-RRE model experiments (Powell at JPL). Lurch uses a simple random search procedure, plus early stopping heuristics and modifications for asynchronous models, hierarchical models, etc. Lean = Lurch + machine learning. Chemical factory optimization experiment (Burkleau at PSU).
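
The per-tick fault and deadlock checks described on slide 21 can be sketched roughly as follows. This is an illustrative Python sketch, not Lurch's actual code: the function names, the list-of-strings state vector, and the per-machine sets of fault and legal end states are all assumptions made for the example, and the cycle check is omitted.

```python
from typing import List, Set


def find_local_faults(state_vector: List[str],
                      fault_states: List[Set[str]]) -> List[int]:
    """Return the indices of machines whose current local state is a fault.

    state_vector[i] is the current local state of machine i (hypothetical
    representation); fault_states[i] is the set of states of machine i
    that correspond to a fault.
    """
    return [i for i, state in enumerate(state_vector)
            if state in fault_states[i]]


def is_deadlock(state_vector: List[str],
                legal_end_states: List[Set[str]],
                has_enabled_transition: bool) -> bool:
    """Report a deadlock at the end of a global state path.

    The check only applies when no transition's inputs are satisfied.
    In that case, loop through the state vector: if any machine is not
    in a legal end state, the end of the path is a deadlock.
    """
    if has_enabled_transition:
        # Not at the end of a path, so the deadlock check is skipped.
        return False
    return any(state not in legal_end_states[i]
               for i, state in enumerate(state_vector))
```

For example, with two machines where only `"done"` is a legal end state, a path ending in `["done", "waiting"]` with no enabled transitions would be reported as a deadlock, while one ending in `["done", "done"]` would not.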

Editor's Notes

  • #4: Simulation and testing tools are capable of finding more complex errors: deadlocks, safety property violations. Many incomplete techniques scale to large systems but can’t handle liveness properties. In explicit-state model checkers (e.g., SPIN) you have to turn off the shorter counter example features to handle larger models.
  • #6: Quick, incomplete search is Lurch: random simulation of finite-state model + early stopping heuristics. Lurch is fast, good at finding short counter examples, and allows easy integration of C code, but it’s not complete. Model Checking (SPIN or SMV in our experiments) can be slow but is complete. Together, we get a complete technique that’s often much faster. Next three experiments show how combined approach works well…
  • #7: Randomized algorithms good at, e.g., optimization, sorting, planning problems… Concurrent system models = the finite-state machine models that work with a model checker. Errors less complex: in a system with many components, a small number of key components interact to cause the error; many different interleavings of system behavior lead to the same error (where partial-order reduction works for explicit-state model checking). Lurch is our implementation of random simulation for finite-state models.
  • #8: All safety properties (i.e., proof of violation is a single global state)
  • #9: Much of the time difference here could be explained by the fact that Lurch is explicit-state, vs. NuSMV, which is symbolic. Following examples show how the Lurch + explicit-state (SPIN) combined strategy is also very effective. Most memory used by Lurch in any of these runs: 37.3 MB, compared to NuSMV: 200-300 MB.
  • #10: Seeded faults: at a point where a process is sending out a message, the wrong message type was used. Two different fault-seeded versions were created: one for which it turned out to be easy to find the violation and another for which it turned out to be much more difficult.
  • #12: Not dealing with combined strategy here, but just showing the benefit of randomization. Even when both (Lurch and SPIN) are explicit-state approaches, there can be a big difference in performance for some input models.
  • #13: For this problem, you know the theoretical shortest path ahead of time; it’s n. Previous two experiments (it could be argued) favor explicit-state model checking, which tends to be good at quickly finding long error traces. This experiment is different, since we force the explicit-state techniques (Lurch and SPIN) to run until the shortest path is found; generally a symbolic model checker would be much better at finding shortest paths (in fact it’s guaranteed to find the shortest).
  • #14: Lurch alone is actually the fastest for this experiment, but it’s important to remember that we sacrifice completeness if no model checker is used. It’s only the bottom four rows that are complete techniques. Note that results for the combined strategies are skewed by very high outliers; for example, for the largest of the normal instances of the problem (with the loop), Lurch would be cut off at 5 seconds, just before finding the deadlock at 6.83 seconds; then either SPIN would run an additional 29.9 seconds or NuSMV an additional 550 seconds. Because of this we’re working on a better stopping criterion (not just the arbitrary 5-second time limit) based on approximate global state coverage measures.
  • #15: All of the models in experiments above use C in the input models: The flight guidance system model, for macros defined in the RSML specification language not easily represented as finite-state machines. The leader election protocol, to represent various features of SPIN’s input language Promela, including bounded message channels (abstract models of transmission medium). The dining philosophers models, to make modeling easier and more concise.
  • #19: At each time tick, all of the transitions whose inputs are satisfied are put into a queue, in random order. The order they end up in the queue determines which will be executed: only one transition from each machine can be executed—it’s whichever is popped first.
  • #21: n: transitions are synchronous mode even when everything else is asynchronous.
  • #28: Future work: improve early stopping heuristics for combination strategy (don’t just stop at arbitrary 5 seconds…). Currently stopping is based on number of new vs. redundant global states found (as percentage of new states decreases, search “saturates”). New idea: use global transition (not global state) saturation. Also, try to use Lean as a guided random simulation tool, and see how that compares with the current Lurch, model checkers, and the various approximate model checking tools / applications out there.
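
The per-tick transition selection described in note #19 might look something like the sketch below. It is a simplified Python illustration rather than Lurch's implementation: the `Transition` type and `step` function are hypothetical names, and "inputs satisfied" is reduced here to "the machine is currently in the transition's source state", which is an assumption made to keep the example small.

```python
import random
from typing import List, NamedTuple


class Transition(NamedTuple):
    machine: int   # index of the machine this transition belongs to
    source: str    # local state the machine must be in (simplified "input")
    target: str    # local state the machine moves to


def step(state_vector: List[str],
         transitions: List[Transition],
         rng: random.Random) -> List[str]:
    """Advance the global state by one time tick.

    All transitions whose inputs are satisfied are queued in random
    order; only the first one popped for each machine is executed.
    """
    enabled = [t for t in transitions
               if state_vector[t.machine] == t.source]
    rng.shuffle(enabled)  # random queue order decides which transitions win

    next_vector = list(state_vector)
    fired = set()  # machines that have already executed a transition
    for t in enabled:
        if t.machine in fired:
            continue  # a later transition for the same machine is dropped
        next_vector[t.machine] = t.target
        fired.add(t.machine)
    return next_vector
```

Passing an explicit `random.Random(seed)` makes a run repeatable, which may be convenient when replaying a path that reached a fault. If no transition is enabled, `step` returns the state vector unchanged, which corresponds to the end of a global state path.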