Dependability engineering

                                       Lecture 1




Dependability engineering, CSE, 2012               Slide 1
Software dependability
                                       •   Software customers expect all
                                           software to be dependable.
                                           However, for non-critical
                                           applications, they may be willing to
                                           accept some system failures.
                                       •   Some critical systems have very high
                                           dependability requirements and
                                           special software engineering
                                           techniques may be used to achieve
                                           this.
                                           –   Medical systems
                                           –   Telecommunications and power
                                               systems
                                           –   Aerospace systems

Dependability achievement
   •       Fault avoidance
         –      The system is developed in such a way that human error is
                avoided and thus system faults are minimised.
         –      The development process is organised so that faults in the
                system are detected and repaired before delivery to the
                customer.
   •       Fault detection
         –      Verification and validation techniques are used to discover
                and remove faults in a system before it is deployed.
   •       Fault tolerance
         –      The system is designed so that faults in the delivered
                software do not result in system failure.

Regulated systems
  •       Many critical systems are regulated systems, which
          means that their use must be approved by an
          external regulator before the systems go into service.
        –      Nuclear systems
        –      Air traffic control systems
        –      Medical devices

  •       A safety and dependability case has to be approved
          by the regulator. Therefore, critical systems
          development has to create the evidence to convince
          a regulator that the system is dependable, safe and
          secure.

The increasing costs of residual fault removal
   (figure not reproduced)

Diversity and redundancy
                                       •   Redundancy
                                           –   Keep more than 1 version of a critical
                                               component available so that if one
                                               fails then a backup is available.
                                       •   Diversity
                                           –   Provide the same functionality in
                                               different ways so that they will not fail
                                               in the same way.
                                       •   However, diversity adds
                                           complexity – more chance of
                                           errors.
                                       •   Some engineers advocate
                                           simplicity and extensive V & V
                                           rather than redundancy.

Diversity and redundancy examples
                                       •   Redundancy. Where availability
                                           is critical (e.g. in e-commerce
                                           systems), companies normally
                                           keep backup servers and switch
                                           to these automatically if failure
                                           occurs.
                                       •   Diversity. To provide resilience
                                           against external attacks, different
                                           servers may be implemented
                                           using different operating systems
                                           (e.g. Windows and Linux)
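
The redundancy example above (automatic switching to a backup server) can be sketched roughly as follows. This is a minimal illustration, not a production failover mechanism: `primary`, `backup`, `call_with_failover` and `AllReplicasFailed` are hypothetical names, and the "servers" are plain functions standing in for real services.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no redundant replica could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in order and return the first successful response."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed")


def primary(request: str) -> str:
    # Hypothetical primary server that is currently unavailable.
    raise ConnectionError("primary unavailable")


def backup(request: str) -> str:
    # Hypothetical backup server providing the same functionality.
    return f"backup handled {request}"


result = call_with_failover([primary, backup], "order-42")
print(result)  # backup handled order-42
```

Note that this sketch shows redundancy only. If both replicas ran identical software on identical platforms, a common fault could still take both down, which is the case that diversity is meant to address.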

Dependable processes
 •    To ensure a minimal number of software faults, it is
      important to have a well-defined, repeatable
      software process.
 •    A well-defined, repeatable process is one that does
      not depend entirely on individual skills; rather, it can be
      enacted by different people.
 •    Regulators use information about the process to
      check if good software engineering practice has
      been used.
 •    For fault detection, it is clear that the process
      activities should include significant effort devoted to
      verification and validation.

Attributes of dependable processes
    Process characteristic             Description
    Documentable                       The process should have a defined process model
                                       that sets out the activities in the process and the
                                       documentation that is to be produced during these
                                       activities.
    Standardized                       A comprehensive set of software development
                                       standards covering software production and
                                       documentation should be available.
    Auditable                          The process should be understandable by people
                                       apart from process participants, who can check that
                                       process standards are being followed and make
                                       suggestions for process improvement.
    Diverse                            The process should include redundant and diverse
                                       verification and validation activities.
    Robust                             The process should be able to recover from failures
                                       of individual process activities.

Process diversity and redundancy
  •       Process activities, such as validation, should not
          depend on a single approach, such as testing, to
          validate the system.
  •       Rather, multiple different process activities that
          complement each other and allow for cross-checking
          help to avoid process errors, which may lead to errors
          in the software.

      Reviews                          Automated analysis   Testing



Validation activities
   •       Requirements reviews (reviews)
   •       Requirements management (reviews, automated analysis)
   •       Formal specification (automated analysis)
   •       System modeling (automated analysis)
   •       Design and code inspection (reviews)
   •       Static analysis (automated analysis)
   •       Test planning and management (testing)

System fault tolerance
   •       Fault tolerance means that the system can continue
           in operation in spite of a software fault, i.e. the fault
           does not lead to a failure.
   •       Fault tolerance is required where there are high
           availability requirements, no ‘fail safe’ state or where
           system failure costs are very high.
   •       Even if the system has been proved to conform to its
           specification, it must also be fault tolerant as there
           may be specification errors or the validation may be
           incorrect.



Dependable system architectures
  •       Dependable system architectures are used in
          situations where fault tolerance is essential. These
          architectures are generally based on redundancy
          and diversity.
  •       Examples of situations where dependable
          architectures are used:
        –      Flight control systems, where system failure could threaten
               the safety of passengers
        –      Reactor systems, where failure of a control system could lead
               to a chemical or nuclear emergency
        –      Telecommunication systems, where there is a need for 24/7
               availability.

Protection systems
  •       A protection system is a specialized system that is
          associated with some other control system and can take
          emergency action if a failure occurs.
        –      System to stop a train if it passes a red light
        –      System to shut down a reactor if temperature/pressure are
               too high

  •       Protection systems independently monitor the
          controlled system and the environment.
  •       If a problem is detected, the protection system issues
          commands to take emergency action to shut down the
          system and avoid a catastrophe.
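
The monitoring loop described above might look roughly like the sketch below. It is only an illustration of the idea: the `Limits` values, the `SHUTDOWN`/`CONTINUE` commands and the function names are assumptions made for this example and do not describe any real reactor or train protection system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Hypothetical safe operating limits; the values are illustrative only."""
    max_temperature: float = 350.0
    max_pressure: float = 160.0


def protection_check(temperature: float, pressure: float, limits: Limits = Limits()) -> str:
    """Return the command the protection system would issue for one set of readings."""
    if temperature > limits.max_temperature or pressure > limits.max_pressure:
        return "SHUTDOWN"
    return "CONTINUE"


def monitor(readings, limits: Limits = Limits()):
    """Check readings in order; return the index of the first one that
    triggers emergency action, or None if no reading does."""
    for step, (temperature, pressure) in enumerate(readings):
        if protection_check(temperature, pressure, limits) == "SHUTDOWN":
            return step
    return None


# Made-up (temperature, pressure) readings; the third exceeds the temperature limit.
readings = [(300.0, 150.0), (340.0, 158.0), (360.0, 159.0)]
shutdown_step = monitor(readings)
print(shutdown_step)  # 2
```

The check is deliberately tiny and does not share any logic with the control software, which reflects the point that a protection system should be independent of, and simpler than, the system it protects.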
Sizewell B reactor
                                       •   Software controlled
                                           protection system in
                                           Sizewell B reactor
                                       •   Go-live in 1993
                                       •   The software protection
                                           system has been claimed
                                           to have a probability of
                                           failure on demand (PFD)
                                           of 1/10000
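
To give a feel for what a probability of failure on demand (PFD) of 1/10000 means, the sketch below works through the arithmetic. It assumes that demands fail independently with the same probability, which is a simplifying assumption for illustration and not a claim about how the Sizewell B figure was derived; the function names are hypothetical.

```python
def expected_failures(pfd: float, demands: int) -> float:
    """Expected number of failed demands out of `demands` requests."""
    return pfd * demands


def prob_at_least_one_failure(pfd: float, demands: int) -> float:
    """Probability that at least one of `demands` independent demands fails."""
    return 1.0 - (1.0 - pfd) ** demands


PFD = 1 / 10000

# On average, one failure is expected in 10,000 demands ...
print(expected_failures(PFD, 10_000))
# ... but the chance of seeing at least one failure in those demands is
# roughly 0.63, not 1, under the independence assumption.
print(prob_at_least_one_failure(PFD, 10_000))
```

Because a protection system is only called on when something has already gone wrong, demands are rare, so a PFD of this size corresponds to a very small chance of an unhandled emergency over the life of the plant.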




Protection system architecture
   (figure not reproduced)

Protection system functionality
  •       Protection systems are redundant because they
          include monitoring and control capabilities that
          replicate those in the control software.
  •       Protection systems should be diverse and use
          different technology from the control software.
  •       They are simpler than the control system so more
          effort can be expended in validation and
          dependability assurance.
  •       The aim is to ensure that there is a low probability of
          failure on demand for the protection system.

Self-monitoring architectures
  •       Multi-channel architectures where the system
          monitors its own operations and takes action if
          inconsistencies are detected.
  •       The same computation is carried out on each channel
          and the results are compared. If the results are
          identical and are produced at the same time, then it is
          assumed that the system is operating correctly.
  •       If the results are different, then a failure is assumed
          and a failure exception is raised.
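
The comparison step described above can be sketched as follows. This is a minimal two-channel illustration under stated assumptions: the channels are two hypothetical, deliberately different implementations of the same computation (a sum of squares), and the timing check mentioned in the slide is omitted for brevity.

```python
class ChannelMismatch(Exception):
    """Failure exception raised when the channels produce different results."""


def channel_loop(n: int) -> int:
    # Channel 1: computes 1^2 + 2^2 + ... + n^2 by explicit summation.
    return sum(i * i for i in range(1, n + 1))


def channel_formula(n: int) -> int:
    # Channel 2: computes the same value with the closed-form formula.
    return n * (n + 1) * (2 * n + 1) // 6


def channel_faulty(n: int) -> int:
    # Hypothetical faulty channel with an off-by-one error, for demonstration.
    return sum(i * i for i in range(1, n))


def self_checking(channel_a, channel_b, value: int) -> int:
    """Run both channels and return the result only if they agree."""
    result_a = channel_a(value)
    result_b = channel_b(value)
    if result_a != result_b:
        raise ChannelMismatch(f"channels disagree: {result_a} != {result_b}")
    return result_a


print(self_checking(channel_loop, channel_formula, 10))  # 385

try:
    self_checking(channel_loop, channel_faulty, 10)
except ChannelMismatch as exc:
    print("failure detected:", exc)
```

The architecture detects that something is wrong but cannot tell which channel is at fault, so on a mismatch the only safe response is to signal a failure and hand control elsewhere.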



Self-monitoring architecture
   (figure not reproduced)

Self-monitoring systems
  •       Hardware in each channel has to be diverse so that
          common mode hardware failure will not lead to each
          channel producing the same results.
  •       Software in each channel must also be
          diverse, otherwise the same software error would
          affect each channel.
  •       If high availability is required, you may use several
          self-checking systems in parallel.
        –      This is the approach used in the Airbus family of aircraft for
               their flight control systems.
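
One way to picture several self-checking systems working together is sketched below: each unit checks itself, and control passes to the next unit when one reports a failure. This is an illustrative assumption about how such units could be combined, not a description of the actual Airbus design; for simplicity the units are tried in order rather than run concurrently, and all names are hypothetical.

```python
class UnitFailure(Exception):
    """Raised by a self-checking unit whose channels disagree."""


def make_self_checking_unit(channel_a, channel_b):
    """Build a unit that returns a result only when both channels agree."""
    def unit(value):
        result_a, result_b = channel_a(value), channel_b(value)
        if result_a != result_b:
            raise UnitFailure("channels disagree")
        return result_a
    return unit


def run_with_standby(units, value):
    """Return (index, result) from the first unit that passes its own check."""
    for index, unit in enumerate(units):
        try:
            return index, unit(value)
        except UnitFailure:
            continue  # switch to the next self-checking unit
    raise UnitFailure("all self-checking units failed")


def double_mul(x):
    return x * 2


def double_add(x):
    return x + x


def double_faulty(x):
    # Hypothetical faulty channel, for demonstration only.
    return x * 2 + 1


units = [
    make_self_checking_unit(double_mul, double_faulty),  # this unit will fail its check
    make_self_checking_unit(double_mul, double_add),     # healthy standby unit
]
print(run_with_standby(units, 21))  # (1, 42)
```

A single self-checking unit gives failure detection; adding standby units is what turns detection into continued operation.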


Airbus 340
                                             •   Airbus were the first
                                                 commercial aircraft
                                                 manufacturer to use ‘fly
                                                 by wire’ flight control
                                                 systems
                                             •   Fly by wire systems are
                                                 lighter (saving
                                                 fuel), can be more fuel
                                                 efficient and can detect
                                                 and prevent pilot
                                                 actions that are
                                                 potentially dangerous.

Airbus flight control system architecture
   (figure not reproduced)

Airbus architecture discussion
  •       The Airbus FCS has 5 separate computers, any one
          of which can run the control software.
  •       Extensive use has been made of diversity
        –      Primary and secondary systems use different processors.
        –      Primary and secondary systems use chipsets from different
               manufacturers.
        –      Software in secondary systems is less complex than in
               primary system – provides only critical functionality.
        –      Software in each channel is developed in different
               programming languages by different teams.
        –      Different programming languages used in primary and
               secondary systems.

Key points
  •       Dependability in a program can be achieved by avoiding the
          introduction of faults, by detecting and removing faults before
          system deployment, and by including fault tolerance facilities.
  •       The use of redundancy and diversity in hardware, software
          processes and software systems is essential for the
          development of dependable systems.
  •       The use of a well-defined, repeatable process is essential if
          faults in a system are to be minimized.
  •       Dependable system architectures are system architectures that
          are designed for fault tolerance. Architectural styles that support
          fault tolerance include protection systems, self-monitoring
          architectures and N-version programming.


Dependability Engineering 1 (CS 5032 2012)
