Integrating Model Checking and Procedural Languages

Integrating Model Checking and Procedural Languages David Owen July 19, 2004

Overview Background: verification / search tools, criteria for when to use which tool, combining different strategies. Experiments: flight guidance system, leader election protocol, dining philosophers, resource arbiter. Implementation: Lurch, our random simulation tool for finite-state models. Lean: Lurch + machine learning. Lean experiment: Chemical factory optimization.

A Continuum of Testing and Verification Tools A range of tools exists, from traditional software testing to automated verification. In between are simulation tools that approximate full verification but work on more complex models, and sophisticated testing tools capable of detecting more complex errors. [Figure labels: Traditional Software Testing, Real Languages, Complex Models, Simple Errors; Automated Verification, Model Checking, Simple Models, Complex Errors; between them, More Sophisticated Testing Tools and Tools to Approximate Full Verification.]

Changing Expectations of a Software Analyst Cobleigh et al. describe three modes of analysis. Exploratory mode: quick feedback is needed to learn how the system works and to refine properties. Fault-finding mode: short, clear error traces are needed for debugging. Maintenance mode: completeness and scalability are needed to verify the overall system. Different tools have different strengths. Simulation tools are good for exploratory mode. Symbolic model checking is good for short error traces. Explicit-state model checking is good for speed and scalability.

Combining Complementary Strategies Different tools have different strengths and weaknesses. Cobleigh et al. suggest “The Right Algorithm at the Right Time” (ICSE 2001). We’ve had some success with a different approach, combining complementary strategies (regardless of the analyst’s mode). Start with a quick, incomplete tool; if no errors are found after a few seconds, use a model checker (complete verification). [Flowchart: Quick, Incomplete Search; if errors found, done; if no errors found, run Model Checker.]
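
The two-stage flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in the experiments: `quick_search` and `model_check` are hypothetical callables standing in for Lurch and for SPIN or NuSMV, and the 5-second default mirrors the budget quoted in the experiment slides.

```python
import time
from typing import Callable, Optional

# Hypothetical result type: None means "no error found", otherwise an error trace.
Trace = Optional[list]


def combined_check(
    quick_search: Callable[[float], Trace],
    model_check: Callable[[], Trace],
    budget_seconds: float = 5.0,
) -> tuple:
    """Run the quick, incomplete search first; fall back to the model checker."""
    start = time.monotonic()
    trace = quick_search(budget_seconds)
    if trace is not None:
        # Errors found: done, the complete tool is never started.
        return ("quick search", trace, time.monotonic() - start)
    # No errors found within the budget: run the complete tool.
    trace = model_check()
    return ("model checker", trace, time.monotonic() - start)


# Toy stand-ins (assumptions for illustration only).
def toy_quick_search(budget_seconds: float) -> Trace:
    return None  # pretend the random search found nothing within its budget


def toy_model_check() -> Trace:
    return ["s0", "s1", "deadlock"]  # pretend the model checker produced a trace


tool, trace, elapsed = combined_check(toy_quick_search, toy_model_check)
print(tool, trace)
```

The cost of the fallback path is the model checker's time plus the fixed budget, which is why the combined figures in the experiment tables sit about 5 seconds above the model checker's on the cases the quick search misses.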

Random Simulation of Concurrent System Models Randomized algorithms known to be simple, fast and effective in many domains. West used random simulation to detect errors in concurrent system models. This approach was surprisingly successful. Success was attributed to the fact that most errors detected are much less complex than the overall system. We have implemented a similar random simulation in a tool called Lurch. Added early stopping heuristics. C code can be included in the model.

Flight Guidance System Experiment Work with Mats Heimdahl and Jimin Gao (University of Minnesota). Ran Lurch, NuSMV on model representing mode logic from a Rockwell-Collins flight guidance system. Seeded faults based on developers’ revision history. Used NuSMV to (exhaustively) determine what properties were violated by faulty specifications. Tried to find the violations with Lurch (random simulation of the model). Put Lurch and NuSMV results together to evaluate combined strategy.

Flight Guidance System Experiment (2)
Time (seconds) to verify or find error; combined = Lurch for 5 sec., then NuSMV if no property violations found by Lurch.
                     Lurch < 5    Lurch > 5    Lurch ?    Overall
Lurch     average    1.49         553          -          -
          median     1.03         40.1         -          -
          max        4.43         5,400        -          -
NuSMV     average    4,380        12,200       14,000     8,200
          median     3,290        3,890        14,000     3,540
          max        17,500       141,000      27,600     141,000
Combined  average    1.49         12,200       14,000     5,910
          median     1.03         3,890        14,000     3.92
          max        4.43         141,000      27,600     141,000
Lurch ?: property violations not detected by Lurch.
Combined strategy improves average by over ½ hour.

Leader Election Protocol Experiment Protocol published as an example for SPIN (Holzmann 1997 TSE article). N processes communicating via message queues interact to choose one leader process. Checked for liveness property always(eventually(one “leader” chosen)). Ran Lurch + SPIN combination strategy on original and two fault-seeded versions of the model. Seeded faults: where a process is sending out a message, the wrong message type was used. Two different fault-seeded versions created: one that turned out easy, another that turned out harder.

Leader Election Protocol Experiment (2)
Time (seconds) to verify or find error; combined = Lurch for 5 sec., then SPIN if no property violations found by Lurch.
                     Correct    Fault 1    Fault 2    Overall
Lurch     average    -          0.137      1.60       -
          median     -          0.128      0.183      -
          max        -          0.173      7.19       -
SPIN      average    49.2       0.059      31.2       23.4
          median     4.67       0.055      3.21       0.125
          max        244        0.08       190        244
Combined  average    54.2       0.137      20.4       20.4
          median     9.67       0.128      0.183      0.173
          max        249        0.173      195        249
Although SPIN alone is better on the correct and first fault-seeded versions, average for combined strategy is still better overall.

Leader Election Protocol Experiment (3) This plot shows the time required for Lurch and SPIN running on a model with both of the seeded faults described previously. Instances with an odd number of processes are much more difficult for SPIN, but not for Lurch. This demonstrates a well-known benefit of some randomized algorithms: less sensitivity to (apparently) minor changes in the input.

Dining Philosophers Experiment Two different versions of the problem: Normal: n philosophers seated around a table; each repeatedly tries to acquire left and right forks, eat, and then set down the forks. No loop: same as normal version, except philosophers only try to eat once. Both versions of the problem contain two deadlocks at depth n. We ran Lurch, SPIN and NuSMV until the shortest path to a deadlock was found. The normal version was harder for NuSMV and Lurch; the no-loop version was harder for SPIN.

Dining Philosophers Experiment (2)
Time (seconds) to find shortest path; combined = Lurch for 5 sec., then SPIN or NuSMV if no property violations found by Lurch.
                            Normal    No Loop    Overall
Lurch             average   1.33      0.281      0.806
                  median    0.223     0.063      0.135
                  max       6.83      1.19       6.83
SPIN              average   4.99      34         19.5
                  median    0.47      0.741      0.49
                  max       29.9      236        236
Combined (SPIN)   average   4.83      0.281      2.56
                  median    0.223     0.063      0.135
                  max       34.9      1.19       34.9
NuSMV             average   87.5      4.99       46.3
                  median    5.15      2.12       3.07
                  max       550       19.4       550
Combined (NuSMV)  average   69.8      0.281      35
                  median    0.223     0.063      0.135
                  max       555       1.19       555
In both cases, the combined strategy (Lurch + SPIN or Lurch + NuSMV) saves time.

Lurch Input Models: C Code + Finite-State Machines
Lurch transitions may refer to arbitrary C code. For example, we could use a C variable for the turn variable in our producer-consumer model:
    enum {P,C} turn = P;
    %%
    pr_wait; (turn==P); -; produce;
    produce; -; {turn=C;}; pr_wait;
    cs_wait; (turn==C); -; consume;
    consume; -; {turn=P;}; cs_wait;
Parentheses and braces within transitions mark references to C expressions and statements. %% separates C and finite-state machines. Each finite-state machine is a list of transitions.
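
To make the role of the guard and the assignment concrete, here is a rough Python analogue of the producer-consumer model. It is a sketch, not Lurch's parser or runtime: the four-field reading of a transition (current state; input; output; next state) is an assumption read off the example, and every Python name is hypothetical.

```python
import random

# Shared variable, standing in for the C variable `turn` in the model.
env = {"turn": "P"}

# Each transition: (machine, current state, guard, action, next state).
# The field order is an assumption inferred from the example model.
TRANSITIONS = [
    ("producer", "pr_wait", lambda e: e["turn"] == "P", None, "produce"),
    ("producer", "produce", None, lambda e: e.update(turn="C"), "pr_wait"),
    ("consumer", "cs_wait", lambda e: e["turn"] == "C", None, "consume"),
    ("consumer", "consume", None, lambda e: e.update(turn="P"), "cs_wait"),
]

state = {"producer": "pr_wait", "consumer": "cs_wait"}


def enabled(state, env):
    """Transitions whose source state matches and whose guard (if any) holds."""
    return [
        t for t in TRANSITIONS
        if state[t[0]] == t[1] and (t[2] is None or t[2](env))
    ]


def run(steps, seed=0):
    """Fire one randomly chosen enabled transition per step; log work states entered."""
    rng = random.Random(seed)
    log = []
    for _ in range(steps):
        choices = enabled(state, env)
        if not choices:
            break
        machine, _src, _guard, action, nxt = rng.choice(choices)
        if action is not None:
            action(env)  # plays the role of the {...} C statement
        state[machine] = nxt
        if nxt in ("produce", "consume"):
            log.append(nxt)
    return log


log = run(8)
print(log)  # -> ['produce', 'consume', 'produce', 'consume']
```

In this model exactly one transition is enabled at every step, so the random choice never matters and the two machines strictly alternate; the `turn` variable is what enforces that.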

RA-RRE Model Work with John Powell (NASA JPL). Resource arbitration (RA) system on board a robotic remote exploration (RRE) vehicle. User processes make requests for RRE resources through a message queue. User processes run concurrently with an arbiter process, which responds to requests in the queue. Arbiter will Grant, Deny, Pend, Rescind, or Deny and Rescind a resource request. Arbiter filters out nonsense messages and ignores them.

RA-RRE Model (2) Large Stateflow® model: C code embedded inside states to represent complex internal system behaviors. JPL’s HiVy translator used to generate Promela (SPIN’s input language) with embedded C code. Translated from Stateflow® to Lurch with C code references in transitions. While it can be very difficult to correctly use Promela’s C code embedding features, Powell reports that it was not difficult to use C code in Lurch models, even after just 15 hours of informal training. Lurch results matched SPIN’s, finding deadlocks in six different versions of the model. Different versions created by running HiVy translator with or without various optimizations, and running models with minor fixes put into the code.

RA-RRE Model (3)
Comparison of Lurch and SPIN on the RA-RRE model:
Finding errors (deadlock). Lurch: found deadlock. SPIN: found deadlock.
Finding errors (property violation). Lurch: found multiple variations on deadlock over properties. SPIN: model too large to verify properties.
Embedded C code. Lurch: easily accomplished with minimal training. SPIN: steep learning curve.
Diagnosis of error causes. Lurch: easily instrumented to provide visibility into embedded C code errors; this led to discovery of an error relating to fundamental system specification conflicts. SPIN: masked errors in embedded C code as syntactic / semantic problems embedding C into Promela.
Powell’s conclusion: compared to SPIN, Lurch is easy to use for models with embedded C code; Lurch found the same errors consistently.

Lurch Implementation
Lurch’s partial, random search procedure:
Partial: there is no guarantee that all behavior will be explored.
Random: the choice of which behavior to explore is nondeterministic.
The basic search procedure (the body of the depth loop is repeated each time tick):
    search(iterations, depth)
      for (i in iterations)
        for (m in machines)
          state[m] = 0
        for (d in depth)
          for (tr in transitions)
            if (check_inputs(tr))
              random_push(Q, tr)
          step(Q, state)
          check(state)

    step(Q, state)
      while (Q not empty)
        tr := pop(Q)
        exec_outputs(tr, state)
        for (tr' in same machine as tr)
          del(Q, tr')

    check(state)
      fault_check(state)
      deadlock_check(state)
      cycle_check(state)
Each iteration explores one global state path through the behavior of the system. A path is divided into “time ticks.” At each time tick a state vector (with a value for each machine) is updated.
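
The search, step and check procedures on this slide translate fairly directly into Python. The version below is a sketch on a hypothetical two-machine toy model, not Lurch's implementation: transitions are tuples invented for the example, a shuffle stands in for `random_push`, and only the local-state fault check is included.

```python
import random

# Hypothetical toy model: two machines, local states are small integers.
# Transition = (machine index, source state, guard on the state vector, target state).
TRANSITIONS = [
    (0, 0, lambda s: True, 1),
    (0, 1, lambda s: s[1] == 1, 2),  # machine 0 waits for machine 1 to reach state 1
    (1, 0, lambda s: True, 1),
    (1, 1, lambda s: s[0] == 2, 2),  # machine 1 waits for machine 0 to reach state 2
]
FAULT_STATES = {(1, 2)}  # (machine, local state) pairs treated as faults (assumption)


def check_inputs(tr, state):
    """A transition is eligible if its machine is in the source state and the guard holds."""
    machine, src, guard, _dst = tr
    return state[machine] == src and guard(state)


def step(queue, state):
    """Pop transitions, execute them, and drop the rest from the same machine."""
    while queue:
        machine, _src, _guard, dst = queue.pop()
        state[machine] = dst
        queue[:] = [t for t in queue if t[0] != machine]


def check(state):
    """Local-state fault check only; deadlock and cycle checks are omitted here."""
    return [(m, s) for m, s in enumerate(state) if (m, s) in FAULT_STATES]


def search(iterations, depth, seed=0):
    rng = random.Random(seed)
    for i in range(iterations):
        state = [0, 0]  # reset every machine to its initial state
        for d in range(depth):
            queue = [t for t in TRANSITIONS if check_inputs(t, state)]
            rng.shuffle(queue)  # stands in for random_push
            if not queue:
                break  # end of this global state path
            step(queue, state)
            faults = check(state)
            if faults:
                return i, d, list(state), faults
    return None


print(search(iterations=10, depth=5))  # -> (0, 2, [2, 2], [(1, 2)])
```

The toy model has a single path, so the fault is reached on the first iteration at the third time tick; on a model with real choices, different iterations would follow different paths.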

Lurch Implementation (2) The step function is called at each time tick along a global state path. Input is a queue of transitions whose inputs are satisfied, along with the state vector. Transitions are popped from the queue, and their outputs are executed. The effect of transitions executed is stored in the state vector. Only one transition from each machine can be executed at each time step; others are discarded from the queue.

Lurch Implementation (3) With the step function as-is (as described in the previous slide), Lurch simulates synchronous execution of finite-state machines: at each time step, every machine is given a chance to move forward. If the step function is modified so that only one transition (one out of all the machines) is executed at each time step, Lurch simulates asynchronous execution of the system: all interleavings of machine behaviors are considered. [Figure: synchronous: state = < 0, 0, 0 > then < 1, 1, 1 >; asynchronous: state = < 0, 0, 0 >, < 1, 0, 0 >, < 1, 1, 0 >, < 1, 1, 1 >.]
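
The difference between the two modes comes down to how many enabled transitions fire per time step. The sketch below uses a hypothetical model of three machines, each with a single transition from state 0 to state 1; the function names are invented for the example.

```python
import random

MACHINES = 3  # three machines, each with a single transition from state 0 to state 1


def enabled(state):
    """Machines whose one transition (0 -> 1) can fire."""
    return [m for m in range(MACHINES) if state[m] == 0]


def step_synchronous(state, rng):
    """Every machine with an enabled transition moves in the same time step."""
    for m in enabled(state):
        state[m] = 1


def step_asynchronous(state, rng):
    """Exactly one enabled transition, chosen at random, fires per time step."""
    choices = enabled(state)
    if choices:
        state[rng.choice(choices)] = 1


def run(step, seed=0):
    """Apply `step` until nothing is enabled; return the global state path."""
    rng = random.Random(seed)
    state = [0] * MACHINES
    path = [tuple(state)]
    while enabled(state):
        step(state, rng)
        path.append(tuple(state))
    return path


sync_path = run(step_synchronous)
async_path = run(step_asynchronous)
print(sync_path)   # two global states: the start and (1, 1, 1)
print(async_path)  # four global states; which machine moves first depends on the seed
```

The synchronous path has one step, while the asynchronous path has three, and repeated runs with different seeds visit different intermediate states. That is how a random simulation samples interleavings.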

Lurch Implementation (4) At each time tick along a path Lurch checks for local-state faults, deadlocks and cycles. Local state faults can be found directly from the state vector—if one of the machines is in a state corresponding to a fault, Lurch reports that the fault was reached. A deadlock occurs when Lurch reaches the end of a global state path (a state for which no new transition’s inputs are satisfied) but not all machines are in a state identified as a legal end state. Deadlocks are found by looping through the state vector to make sure all local states are legal end states (this is done only when Lurch is at the end of a global state path).
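
The deadlock rule described here (end of path reached, but some machine is not in a legal end state) can be tried on a small dining philosophers instance. This is a simplified sketch, not the model used in the experiments: philosophers eat once and always pick up the left fork first, so this variant has a single deadlock rather than two, and all names are hypothetical.

```python
import random

N = 3     # number of philosophers (hypothetical small instance)
DONE = 3  # legal end state: the philosopher has eaten and put both forks down


def enabled(state, forks):
    """Philosophers that can move: 0 takes left fork, 1 takes right fork, 2 puts both down."""
    moves = []
    for p in range(N):
        left, right = p, (p + 1) % N
        if state[p] == 0 and forks[left] is None:
            moves.append(p)
        elif state[p] == 1 and forks[right] is None:
            moves.append(p)
        elif state[p] == 2:
            moves.append(p)
    return moves


def fire(p, state, forks):
    """Execute philosopher p's next move and advance its local state."""
    left, right = p, (p + 1) % N
    if state[p] == 0:
        forks[left] = p
    elif state[p] == 1:
        forks[right] = p
    else:
        forks[left] = forks[right] = None
    state[p] += 1


def deadlock_check(state, forks):
    """End of path (nothing enabled) with some machine outside a legal end state."""
    return not enabled(state, forks) and any(s != DONE for s in state)


def find_deadlock(iterations, seed=0):
    """Random asynchronous simulation; return the first path that ends in deadlock."""
    rng = random.Random(seed)
    for i in range(iterations):
        state, forks = [0] * N, [None] * N
        path = []
        while True:
            moves = enabled(state, forks)
            if not moves:
                break  # end of this global state path
            p = rng.choice(moves)
            fire(p, state, forks)
            path.append(p)
        if deadlock_check(state, forks):
            return i, path, state
    return None


result = find_deadlock(200)
print(result)
```

As in the slide, the check runs only once a path has ended. In this variant the deadlocking path is N moves long (each philosopher takes a left fork), which matches the depth-n deadlocks noted for the dining philosophers experiment.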

Other Applications for Lurch’s Random Simulation Game-playing experiments: n-queens, tic-tac-toe. Lurch is really a fast generator of consistent temporal sequences, so what else can we use it for? If we generate a score for each temporal sequence, we can use a machine learner to suggest what makes some sequences better than others. Lurch + Machine Learning = “Lean,” a randomized heuristic search tool for finite-state models (with optional C code).

Lean: Combining “Test” and “Task” Traditional view: specialized devices for different tasks (diagnosis, configuration, testing...). Alternative: one environment where “test” and “task” are implemented together: write down what is known about a domain, then add an oracle to score a single run (i.e., score the temporal sequences generated by Lurch). Instead of different devices for “test” and “task,” use “Lean” = Lurch + learn. Run Lurch on the sample space of options. Learn: apply machine learning to find “nudges,” which are suggestions for which transitions lead to runs with higher scores. Apply the “nudges” in the form of transition probabilities, and run Lurch again, expecting better scores.
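
The run, score, learn, nudge cycle can be illustrated with a toy loop. Everything below is an assumption made for the example: the four-decision domain, the oracle, and especially the learner, which is a crude stand-in (it reweights each transition by the mean score of the runs that used it) and not the machine learner used in Lean.

```python
import random
from collections import defaultdict

# Hypothetical toy domain: at each of 4 decision points, one of two transitions fires.
DECISIONS = ["d0", "d1", "d2", "d3"]
OPTIONS = [(d, c) for d in DECISIONS for c in ("a", "b")]


def oracle(run):
    """Score a single run; here 'a' choices are secretly better (an assumption)."""
    return sum(1 for _d, choice in run if choice == "a")


def simulate(weights, rng):
    """One run: pick a transition at each decision point according to its weight."""
    run = []
    for d in DECISIONS:
        wa, wb = weights[(d, "a")], weights[(d, "b")]
        choice = "a" if rng.random() < wa / (wa + wb) else "b"
        run.append((d, choice))
    return run


def learn_nudges(weights, runs, scores):
    """Stand-in learner: scale each weight by (mean score of runs using it) / (overall mean)."""
    overall = sum(scores) / len(scores)
    totals, counts = defaultdict(float), defaultdict(int)
    for run, score in zip(runs, scores):
        for tr in run:
            totals[tr] += score
            counts[tr] += 1
    nudged = dict(weights)
    for tr in weights:
        if counts[tr] and overall > 0:
            nudged[tr] = weights[tr] * (totals[tr] / counts[tr]) / overall
    return nudged


def lean(rounds, runs_per_round, seed=0):
    """Repeat <simulate, learn>; return the mean score of each round."""
    rng = random.Random(seed)
    weights = {tr: 1.0 for tr in OPTIONS}
    means = []
    for _ in range(rounds):
        runs = [simulate(weights, rng) for _ in range(runs_per_round)]
        scores = [oracle(r) for r in runs]
        means.append(sum(scores) / len(scores))
        weights = learn_nudges(weights, runs, scores)
    return means


print(lean(rounds=3, runs_per_round=500))
```

The first round samples the options uniformly. Later rounds favor transitions that co-occurred with higher scores, so the mean score per round is expected to rise.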

Chemical Factory (Lean) Work with Tom Burkleau, Portland State University. Finite-state machine model of commercial vodka distillery plant. Multiple machines representing the space of options, the model of the production facility, and the relation between production parts. [Figures: Nominal Model (composite); Faulty Model (composite).]

Optimizing Nominal Model After 7 scored runs of Lurch, plus machine learning to find “nudges”:

Optimizing Faulty Model 26 repeats of <LURCH,learn>. Change learning classes: Class1: fixed; Class2: movable. Learn selectors for class2. Negate them (removes the bug). 1 more repeat of <LURCH,learn>. [Plot annotations: “Fixed, refuses to budge”; “Gone!”] Question: is this simulation or optimization or parameter tuning or fault localization or diagnosis or configuration? Answer: all of the above.

Conclusion Combination of random simulation (Lurch) and model checking (SPIN or NuSMV) can be faster and more efficient than model checking alone, without sacrificing completeness. FGS (Heimdahl, Gao at UMN), leader election protocol, dining philosophers experiments. Lurch allows (easy-to-use) references to arbitrary C code. RA-RRE model experiments (Powell at JPL). Lurch uses a simple random search procedure, plus early stopping heuristics and modifications for asynchronous models, hierarchical models, etc. Lean = Lurch + machine learning. Chemical factory optimization experiment (Burkleau at PSU).
