Joint Aviation Authorities JAA Reliability Program Training

Disclaimer
This presentation has been prepared with utmost care. However, no claims can
be filed, based on the content of this presentation. The official texts of the
Regulation, the Implementing Rules, the Acceptable Means of Compliance and
Guidance Material always have preference over this presentation.
Neither the JAA Training Organisation nor the instructor can be held liable for errors,
omissions or misinterpretations in this presentation. In case of doubt, consult
your competent authority.

List of Abbreviations
A/C Aircraft
ACAM Aircraft Continuing Airworthiness Monitoring
ACAS Airborne Collision Avoidance System
ACJ Advisory Circular Joint
AD Airworthiness Directives
AD Accidental Damage
ADF Automatic Direction Finder
ADR Accidental Damage Rating
AGNA Advisory Group of National Authorities
ALI Airworthiness Limitation Item
ALS Airworthiness Limitations Section
AMC Acceptable Means of Compliance
AMM Aircraft Maintenance Manual
ANC Air Navigation Commission
AOC Air Operator Certificate
ARC Airworthiness Review Certificate
ASM Ageing System Maintenance
ATA Air Transport Association
ATL Aircraft Technical Log
ATSRAC Ageing Transport Systems Rulemaking Advisory Committee
AWO All Weather Operations

BFE Buyer Furnished Equipment
BR Basic Regulation
BITE Built-In Test Equipment
CA Competent Authority
CAA Civil Aviation Authority
CAMO Continuing Airworthiness Management Organisation
CAT Commercial Air Transport
CDCCL Critical Design Configuration Control Limitations
CDL Configuration Deviation List
CMCC Certification Maintenance Co-ordination Committee
CMIS Crew Member Interphone System
CMM Component Maintenance Manual
CMPA Complex Motor Powered Aircraft
CMR Certification Maintenance Requirements
CofA Certificate of Airworthiness
CPCP Corrosion Prevention and Control Programme
CRI Certification Review Item
CRS Certificate of Release to Service
CS Contracting State
CS Certifying Staff
CVR Cockpit Voice Recorder

DD Deferred Defects
DET DETailed inspection
DME Distance Measuring Equipment
DO Design Organisation
DOA Design Organisation Approval
DT ALI Damage Tolerant Airworthiness Limitations Items
DWG Design Working Group
EAD Emergency Airworthiness Directive
EASA European Aviation Safety Agency
EC European Commission
ECAC European Civil Aviation Conference
ECI Emergency Conformity Information
ED Environmental Damage
ED Executive Director
EDR Environmental Damage Rating
ELT Emergency Locator Transmitter
EMM Enhanced Manufacturing and Maintainability
ETOPS Extended Range Twin-engine Operations
ETSO European Technical Standard Order
EU European Union
EWIS Electrical Wiring Interconnection System

EZAP Enhanced Zonal Analysis Procedure
FAA Federal Aviation Administration
FAD Final Airworthiness Directive
FAL Fuel Airworthiness Limitations
FCIS Flight Crew Interphone System
FD Fatigue Damage
FDR Flight Data Recorder
FEC Failure Effect Category
FMEA Failure Mode and Effects Analysis
FMECA Failure Mode, Effects and Criticality Analysis
FML Flight Manual Limitations
GPWS Ground Proximity Warning System
GVI General Visual Inspection
HIRF High Intensity Radiated Fields
ICA Instructions for Continued Airworthiness
ICAW Instructions for Continued Airworthiness
IEM Interpretative and Explanatory Material
IFR Instrument Flight Rules
ISC Industry Steering Committee
LROPS Long Range Operations

MCAI Mandatory Continuing Airworthiness Information
MCD Magnetic Chip Detector
MCTOM Maximum Certified Take Off Mass
MEL Minimum Equipment List
MMEL Master Minimum Equipment List
MNPS Minimum Navigation Performance Specifications
MPD Maintenance Planning Data
MPP Maintenance Program Proposal
MRB Maintenance Review Board
MRBRP MRB Report Proposal
MRO Maintenance Repair Overhaul
MSG Maintenance Steering Group
MSI Maintenance Significant Item
MTBF Mean Time Between Failure
MTBUR Mean Time Between Unscheduled Removal
MWG Maintenance Working Group
NAA National Aviation Authority
NPA Notice of Proposed Amendments
OEM Original Equipment Manufacturer
OMS On-board Maintenance System

PAD Proposed Airworthiness Directive
PAS Public Address system
PCA Primary Certification Authority
PO Production Organisation
POA Production Organisation Approval
PPH Policy and Procedure Handbook
PSE Principal Structural Element
PSI Principal Structural Item
PtF Permit to Fly
RCM Reliability Centered Maintenance
RNP Required Navigation Performance
RVSM Reduced Vertical Separation Minima
SARP Standards and Recommended Practices and Procedures
SB Service Bulletin
SDET Special Detailed Inspection
SGHA Standard Ground Handling Agreement
SIB Safety Information Bulletin
SL Service Letter
SoD State of Design
SoR State of Registry
SRM Structural Repair Manual

SSCC Safety Standards Consultative Committee
SSI Structural Significant Item
SSID Supplementary Structural Inspection Document
SSR Secondary Surveillance Radar
STC Supplemental Type Certificate
TAWS Terrain Awareness Warning System
TCDS Type Certificate Data Sheet
TSO Technical Standard Order
VFR Visual Flight Rules
VOR VHF omnidirectional radio range
WFD Widespread Fatigue Damage

Objectives
Course objectives:
• Familiarise candidates with the regulatory requirements (Part M/CAMO);
• Familiarise candidates with the MSG-3 (reliability centered maintenance)
processes;
• Understand basic statistical analysis with respect to reliability and probability;
• Provide a better understanding of the processes with respect to reliability
management in continuing airworthiness.

Course Content (1/2)
• Introduction
• Regulatory requirements
• Concept of Reliability centered maintenance/MSG 3
– Functions
– Functional Failures
– FMEA
– Failure consequences
– Maintenance tasks
• Development of Maintenance Programmes
• Statistical methods and reliability measures

Course Content (2/2)
• Reliability
– Types of reliability
– Statistical methods
– Administration and Management of the Reliability Programme
– Elements of the Reliability Programme
• Data collection
• Problem detection (an alerting system)
• Setting and adjusting alert levels
• Reading alert status
• Data display
• Data analysis
• Corrective actions
– Small Fleet Reliability
– Approval of the Reliability Programme
– Escalations/Optimisation

Definitions
Maintenance: ensuring that physical assets continue to do what their users want them to do.
Reliability Centered Maintenance is a process used to determine what should be done to
ensure that any physical asset continues to do what its users want it to do in its present
operating environment.
Reliability is the probability that a component or system will perform a required function
for a given period of time when used under stated operating conditions (the probability of
no failure over a given time).
Maintainability is the probability that a failed component or system will be restored or
repaired to a specified condition within a period of time when maintenance is performed
in accordance with prescribed procedures.
Availability is the probability that a component or system is performing its required
function at a given point in time when used under stated operating conditions.
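
The three probabilities defined above can be made concrete with a small numerical sketch. It assumes a constant failure rate (exponential model), a constant repair rate, and the steady-state availability formula MTBF / (MTBF + MTTR). These modelling choices, the function names and the sample figures are illustrative assumptions and do not come from this presentation.

```python
import math


def reliability(t_hours, mtbf_hours):
    """Probability of no failure during t_hours, assuming a constant failure rate of 1/MTBF."""
    return math.exp(-t_hours / mtbf_hours)


def maintainability(t_hours, mttr_hours):
    """Probability that a repair is finished within t_hours, assuming a constant repair rate of 1/MTTR."""
    return 1.0 - math.exp(-t_hours / mttr_hours)


def steady_state_availability(mtbf_hours, mttr_hours):
    """Long-run fraction of time the item is able to perform its function."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Illustrative figures only (assumed, not taken from the course material).
    mtbf, mttr = 5000.0, 4.0
    print(f"Reliability over a 10 h flight: {reliability(10.0, mtbf):.4f}")
    print(f"Repair done within 8 h:         {maintainability(8.0, mttr):.4f}")
    print(f"Steady-state availability:      {steady_state_availability(mtbf, mttr):.4f}")
```

A real component may not follow an exponential model (wear-out items, for example), so the choice of distribution would have to be justified from in-service data.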

Introduction
Operator's objectives (ideal world):
• Infinite performance
• Zero life cycle costs
• 100% availability
In the operational phase, the aim is to achieve maximum effectiveness of a product and/or
equipment.
Inherent characteristics (design parameters) (RMS):
• Reliability
• Maintainability
• Supportability

Introduction
The B777 is a system with 300,000 unique parts and a total of 6 million parts (including nuts, bolts, etc.).
Effective operation, maintenance and support of such complex systems requires
integrated tools, procedures and techniques.
Failure to meet high reliability, maintainability and supportability standards will result in poor
effectiveness and high costs (cancellations and delays).
Failure to dispatch on time, or a cancellation, will affect competitiveness and
profitability.

Introduction
Life cycle of a system
Phases: Specifications and requirements -> Design -> Production -> Operation/Maintenance/Support -> Retirement
Labels shown beneath the phases on the slide: Market, Part 21, Part M, Part 145, OPS

Introduction
Maintenance task classification
Maintenance tasks can be classified in 3 categories:
1. Corrective maintenance tasks;
2. Preventive maintenance tasks;
3. Conditional (predictive) tasks.

Corrective maintenance (unscheduled or unplanned maintenance):
Failure condition -> Fault location -> Disassembly -> Repair or replacement -> Assembly -> Test/check

Preventive maintenance task:
Disassembly -> Replacement -> Assembly -> Test/check

Conditional (predictive) maintenance task:
Inspection/measure -> Data collection -> Condition assessment -> Condition interpretation -> Decision making

Part-M
Diagram labels: Operation; Maintenance; Cont. Airw. management; Airworthiness Review; CofA (indefinite); CofA (indefinite) + ARC
Subpart A General
Subpart B Accountability
Subpart C Continuing Airworthiness
Subpart D Maintenance standards
Subpart E Components
Subpart F Maintenance organisation
Subpart G Continuing airworthiness management organisation
Subpart H Certificate of Release to Service
Subpart I Airworthiness review certificate
Part M – Relevant Subparts
Part M requirements

The owner is responsible for the continuing airworthiness of an aircraft and
shall ensure that no flight takes place unless:
(1) The aircraft is maintained in an airworthy condition
(4) The maintenance of the aircraft is performed i.a.w. the approved
maintenance programme
M.A.201(a)
Responsibilities
EASA requirements

In case of CAT the operator is responsible for the continuing airworthiness of
the aircraft it operates.
M.A.201(h)
Responsibilities
EASA requirements
6. C.A.T.: In order to retain ultimate responsibility the operator should limit sub-contracted tasks
to:
a. AD analysis and planning
b. SB analysis
c. Planning of maintenance
d. Reliability monitoring, engine health monitoring
e. Maintenance Programme development and amendments
f. Any other activities…

Subpart C
Continuing Airworthiness
Part-M
Section A

CAMO
• Evidence of airworthy configuration
• Software library
• Proof that all maintenance is performed
• Ground time for maintenance
• Modifications (fleet programmes)
• Availability of the aircraft
• Maintenance contracts with providers
• Maintenance package strategies
• Ensure that the configuration is airworthy
• Modification management
• Repair status control
• AD management
• Part in => parts out
• Ensure all scheduled maintenance is performed on time (AMP compliance)
• Ensure maintenance is performed to the standard of the CAMO
• Manage in-service failures
• Optimisation of maintenance
• Reliability
• Effectiveness of the AMP
CAMO areas shown on the slide: Maintenance management; Configuration Management; Aircraft records; Coordination OPS and Maintenance

A.301 CA Tasks
• Pre-flight inspections
• Defect/damage rectification (MEL, CDL)
– System for assessing effectiveness
(significant, repetitive, deferred or carried forward defects, unscheduled
removals)
– Consider cumulative number of carried forward defects.
– System to ensure defects rectified within MEL limits.
• Accomplishment of all maintenance i.a.w. approved maintenance
programme
• Effectiveness of maintenance programme (CMPA / Lic. Air
Carrier)
• Airworthiness and operational directives, EASA requirements, Authority
measures
• Modifications and repairs
• Embodiment policy for non mandatory modifications/inspections
(CMPA or Lic. Air Carriers)
• Maintenance check flights

EASA requirements
Subpart C: Continuing Airworthiness tasks
AMC M.A.301-2 Continuing airworthiness tasks
In the case of aircraft used by air carriers licensed in
accordance with Regulation (EC) No 1008/2008 and
of complex motor-powered aircraft, a system
should be in operation to support the continuing airworthiness
of an A/C and to provide a continuous analysis of the
effectiveness of the approved CAMO’s defect
control system in use. The system covers:
• Monitoring of significant incidents & defects
• Monitoring of repetitive incidents & defects
• Monitoring of deferred & carried forward defects
• Analysis of unscheduled component removals and the A/C systems’ performance

Subpart C: Continuing Airworthiness
airworthiness tasks
In the case of aircraft used by air
carriers licensed in accordance with
Regulation (EC) No 1008/2008 and of
complex motor-powered aircraft, the
operator should have a system to
ensure that all defects affecting the safe
operation of the A/C are rectified
within the limits prescribed by the
approved MEL or CDL.
EASA requirements

M.A.301
CA tasks
4. for all complex motor-powered aircraft or aircraft used by
licensed air carriers in accordance with Regulation (EC) No
1008/2008, Analysis of the effectiveness of the approved
maintenance Programme
EASA requirements

airworthiness tasks
The CAMO managing the continuing airworthiness of the
aircraft should have a system to analyse the effectiveness
of the maintenance programme, with regard to spares,
established defects, malfunctions and damage, and to
amend the maintenance programme accordingly.
EASA requirements

M.A.302 Maintenance programme
Inputs to the Maintenance Programme shown on the slide:
• MRB/MPD/CMP
• Vendor information
• Reliability programme
• STCs/modifications
• Repairs
• ADs
• CAA
• Operator specific
• Maintenance of each aircraft shall be organised in accordance with an
aircraft maintenance programme.
• The aircraft maintenance programme and any subsequent amendments
shall be approved by the competent authority.
• When the continuing airworthiness of the aircraft is managed by a CAMO,
the aircraft maintenance programme and its amendments may be
approved through an indirect approval procedure.
– The indirect approval procedure shall be established by the CAMO as part of the Continuing
Airworthiness Management Exposition and shall be approved by the competent authority.
– The CAMO shall not use the indirect approval procedure when it is not under the oversight of
the Member State of Registry, unless the oversight responsibility has been transferred.
M.A.302 Maintenance Programme

The aircraft maintenance programme must establish compliance with:
1. instructions issued by the competent authority;
2. instructions for continuing airworthiness:
– issued by the holders of the type-certificate, restricted type-certificate,
supplemental type-certificate, major repair design approval, ETSO
authorisation
– additional or alternative instructions proposed by the owner or the
continuing airworthiness management organisation once approved in
accordance with point M.A.302, except for intervals of safety related
tasks referred in paragraph (e), which may be escalated, subject to
sufficient reviews carried out in accordance with paragraph (g) and only
when subject to direct approval in accordance with point M.A.302(b).

• Periodically (annually) reviewed and amended
– Operating experience
– New/modified instructions by TC holder (21.A.61)
• Reliability programme (large aircraft)
– If based on MSG logic or includes condition monitoring or no overhaul times
for all significant system components
– Ensure tasks effective and periods adequate (monitoring function)
– Details in AMC appendix I

AMC M.A.302 (d)-4 & 6
A/C Maintenance Programme
compliance
Where an A/C is maintained i.a.w. an A/C MP based upon the MRB report
process, any associated Programme for the continuing surveillance of the
reliability, or health monitoring of the A/C should be considered as part of the
A/C MP.
Some approved MP, not developed from the MRB process, utilise reliability
Programmes. Such reliability Programmes should be considered as part of the
approved MP
EASA requirements

C Continuing Airworthiness CA
AMC M.A.302(d)-7
compliance
7. Alternate and/or additional instructions to those defined in M.A.302(d)(i)
and (ii), proposed by the owner or the operator, may include but are not
limited to the following:
• Escalation of the interval for certain tasks based on reliability data or
other supporting information. Appendix I recommends that the MP
contains the corresponding escalation procedures. The escalation of these
tasks is directly approved by the CA, except in the case of ALIs
(Airworthiness Limitations), which are approved by the Agency.
• More restrictive intervals than those proposed by the TC holder as a
result of the reliability data or because of a more stringent operational
environment.
• Additional tasks at the discretion of the operator.
EASA requirements

AMC M.A.302(f)
Reliability Programme
1. Reliability Programmes should be developed for A/C MP based upon MSG
logic or those that include condition monitored components or that do not
contain overhaul time periods for all significant system components.
2. Reliability Programmes need not be developed for A/C not considered as large
A/C or that contain overhaul time periods for all significant A/C system
components.
3. The purpose of a reliability Programme is to ensure that the A/C MP tasks are
effective and their periodicity is adequate.
EASA requirements

AMC M.A.302(f)
Reliability Programme
4. The reliability Programme may result in the escalation or deletion of a
maintenance task, as well as the de-escalation or addition of a
maintenance task
5. A reliability Programme provides an appropriate means of monitoring
the effectiveness of the MP.
6. Appendix 1 to AMC M.A.302 and M.B.301 (d) gives further guidance.
EASA requirements

Subpart G CAMO
M.A.708 (b) 1
Continuing Airworthiness
management
For every A/C managed, the approved CAMO shall:
• Develop & control a Maintenance Programme for the A/C managed
including any applicable Reliability Programme
EASA requirements

Appendix 1
to AMC M.A.302 & to AMC M.B.301(b)
Maintenance Programme-MP-Content
Part M – Appendix 1
EASA requirements

EASA Requirements
• EASA Part M requirement for Reliability Monitoring: M.A.302(d),
AMC M.A.302(d), Appendix I (item 6) to AMC M.A.302
• EASA requirements for ETOPS (see AMC 20-6) require
operators to have reliability Programmes for ETOPS
critical systems
• Reliability Programmes are necessary for effective
continuing airworthiness management of other
operating modes:
– RVSM
– MNPS (Minimum Navigation Performance Specification)
– Low Visibility - Auto land

Appendix 1 to AMC M.A.302 & to AMC M.B.301(b): MP Content
6. Reliability Programmes
6.1. Applicability
6.1.1 A Reliability Programme should be developed in the following
cases:
(a) the A/C MP is based upon MSG-3 logic
(b) the A/C MP includes condition monitored components
(c) the A/C MP does not contain overhaul time periods for all significant
system components
(d) when specified by the Manufacturer’s MPD or MRB.
EASA requirements

Appendix 1 to AMC M.A.302 & to AMC M.B.301(b): MP Content
6. Reliability Programmes
6.1. Applicability
6.1.2. A Reliability Programme need not be developed in the following
cases:
(a) the MP is based upon the MSG-1 or 2 logic but only contains hard time or
on condition items
(b) the A/C is not a large A/C according to Part-M
(c) the A/C MP provides overhaul time periods for all significant system
components.
Note: for the purpose of this paragraph, a significant system is a system the failure of which
could endanger the A/C safety.
6.1.3. Notwithstanding 6.1.1 and 6.1.2 above, a CAMO may, however, develop
its own reliability monitoring Programme when it may be deemed beneficial
from a MP point of view.
EASA requirements
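
The applicability criteria in 6.1.1 and 6.1.2 can be expressed as a simple checklist. The sketch below is illustrative only: the class, field and function names are hypothetical, and because the slides do not say how to resolve a case that matches both lists, the function only reports which criteria from each list apply. The decision itself, including the option in 6.1.3, stays with the CAMO and its competent authority.

```python
from dataclasses import dataclass


@dataclass
class MaintenanceProgrammeProfile:
    """Hypothetical summary of an aircraft maintenance programme (MP)."""
    msg3_based: bool
    has_condition_monitored_components: bool
    overhaul_times_for_all_significant_components: bool
    required_by_mpd_or_mrb: bool
    msg1_or_2_hard_time_or_on_condition_only: bool
    large_aircraft: bool


def reliability_programme_criteria(mp):
    """Return the 6.1.1 and 6.1.2 criteria matched by the given MP profile."""
    should = []
    if mp.msg3_based:
        should.append("6.1.1(a)")
    if mp.has_condition_monitored_components:
        should.append("6.1.1(b)")
    if not mp.overhaul_times_for_all_significant_components:
        should.append("6.1.1(c)")
    if mp.required_by_mpd_or_mrb:
        should.append("6.1.1(d)")

    need_not = []
    if mp.msg1_or_2_hard_time_or_on_condition_only:
        need_not.append("6.1.2(a)")
    if not mp.large_aircraft:
        need_not.append("6.1.2(b)")
    if mp.overhaul_times_for_all_significant_components:
        need_not.append("6.1.2(c)")

    # No precedence is applied between the two lists (assumption: left to the reader).
    return {"should_develop": should, "need_not_develop": need_not}


if __name__ == "__main__":
    example = MaintenanceProgrammeProfile(
        msg3_based=True,
        has_condition_monitored_components=True,
        overhaul_times_for_all_significant_components=False,
        required_by_mpd_or_mrb=False,
        msg1_or_2_hard_time_or_on_condition_only=False,
        large_aircraft=True,
    )
    print(reliability_programme_criteria(example))
```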

• Sufficient qualified staff for expected work
– Analysis of tasks, responsibilities
– Number of man hours and qualifications
– Trained on fuel tank safety
• Qualification of all personnel recorded
• The CAME shall define and keep updated titles, names of the Accountable
manager and nominated
A.706 Personnel
For complex motor-powered aircraft and for aircraft used by licensed air
carriers in accordance with Regulation (EC) No 1008/2008, the
organisation shall establish and control the competence of personnel
involved in the continuing airworthiness management, airworthiness
review and/or quality audits in accordance with a procedure and to a
standard agreed by the competent authority.

Reliability Centered Maintenance
MSG – 2 and 3

History of Reliability Centered Maintenance
Since the 1930s the evolution of maintenance can be traced through 3
generations (timeline on the slide: 1940 to 2000):
First Generation:
• Fix when broken
Second Generation:
• Higher plant availability
• Longer equipment life
• Lower cost
Third Generation:
• Higher plant availability and reliability
• Greater safety
• Better product quality
• No damage to environment
• Longer equipment life
• Greater cost effectiveness

In July, 1968, representatives of various airlines developed Handbook MSG-1,
"Maintenance Evaluation and Program Development," which included decision
logic and inter-airline/manufacturer procedures for developing a maintenance
Programme for the new Boeing 747 aircraft.
History of RCM

History of RCM
Subsequently, it was decided that experience
gained on this project should be applied to
update the decision logic and to delete
certain 747 detailed procedural information
so that a universal document could be made
applicable to later new aircraft types. The
result was MSG-2.
MSG-2 decision logic was used to develop
scheduled maintenance Programmes for the
Lockheed 1011 and the Douglas DC-10
airplanes and some military aircraft. A similar
document was prepared in Europe and was
the basis for the initial Programmes for
the A-300 and Concorde.

• In 1979 experience and events indicated that an update of MSG
procedures was both timely and opportune to develop maintenance
Programmes for new aircraft, systems or powerplants
• An ATA Task Force reviewed MSG-2 and identified various areas that
were likely candidates for improvement, among them the distinction between
economics and safety and the adequacy of treatment of hidden
functional failures in the decision logic process.
• MSG-3 adjusted the decision logic flow paths to provide a more
rational procedure for task definition and a more straightforward
progression through the decision logic. MSG-3 logic took a "from the
top down" or consequence of failure approach. At the outset, the
functional failure was assessed for consequence of failure and was
assigned one of two basic categories:
A. SAFETY
B. ECONOMIC
History of RCM

Reliability Centred Maintenance (RCM) recognises that there is not
necessarily a direct correlation between the amount of maintenance and
operating reliability
A standardised systematic methodology is applied to aircraft systems and
components in order to identify the most applicable and effective
maintenance.
Remarks

• MSG 2
• MSG 3
ATA Maintenance Steering Group (MSG)
Analysis Methodology
CS 25.1529 (including Appendix H) requires ICAs to be developed
in accordance with MSG-3

Process oriented or bottom up approach whereby each unit (system,
component, or appliance) on the aircraft is analysed and assigned to one of
the primary maintenance processes:
– Hard time;
– On condition;
– Condition monitoring.
MSG-2
MSG 2 methodology starts with the assumption that a ‘Maintenance Programme’
(A, B, C, D check etc) already exists.
MSG 2 analysis requires engineering judgement to be applied to aircraft and
system operation.
MSG 2 considers the function:
• What does it do?
• When does it do it?
• How does it do it?

MSG 2 requires that for each failure (mode) the following
considerations are taken into account:
• Can deterioration be detected?
• Can failure be tolerated?
• Can failure be prevented?
• What are the apparent effects of failure?
• What are the hidden effects of failure?
MSG-2

MSG-2 decision diagram: each question is answered YES or NO and leads to one of three primary maintenance processes.
(1) Does the unit’s failure affect flight safety?
(2) Is the failure evident to the flight crew?
(3) Is there an adverse relationship between the age of the unit and its reliability?
(4) Is reduced resistance to failure detectable by a maintenance check?
(5) Is there a maintenance or shop check to assure continued function?
Outcomes: Hard Time, On Condition, Condition Monitoring
MSG-2

MSG 2 Maintenance Processes – Definitions
“Hard Time (HT)” - maximum time for performing maintenance.
Tasks - usually overhaul - or Life Limit
Clear correlation between wear-out and time in service - typically engine parts
and undercarriages.
Additional maintenance tasks (e.g. Servicing, Lubrication) may be associated to improve
economic performance.
Service experience may qualify for transfer to OC/CM.
“On-Condition / On-Condition Maintenance”
Primary maintenance process having repetitive inspections, tests or checks to
determine condition and continued serviceability – corrective action is taken
when required by the determined condition.

MSG 2 Maintenance Processes – Definitions
“Condition Monitoring” - Primary maintenance process under which data
on the whole population of specified items in service is analysed to
indicate whether allocation of technical resources is required. Not a
preventative maintenance process - Failures (or faults) are allowed to
occur.
Occurrence rates are statistically analysed.
Events are monitored.
Not applicable to ‘safety related’ failure.
May have associated task in systems.

MSG 2 analysis requires selection of a Maintenance Task
(Systems / Structures):
• Inspect (INSP)
• Operational Check (OP/C)
• Functional Check (F/C)
• Test
• Service
• Overhaul (OH)
Task selection

The RCM/MSG-3 process essentially entails asking seven
questions about the asset or system:
1. What are the functions and performance standards of the asset in
its present operating environment?
2. In what way does it fail to fulfil its function?
3. What causes each functional failure?
4. What happens when each failure occurs?
5. In what way does each failure matter?
6. What can be done to predict or prevent each failure?
7. What should be done if a suitable proactive task cannot be found?
MSG-3

RCM Process
Define the functions and performance standards -> Functional failures -> Failure modes -> Failure effects -> Failure consequences -> Proactive tasks

Example:
The pump is pumping fuel from tank X into tank Y at a rate of 80 l/min. Offtake
from tank Y: 80 l/min. The pump can deliver up to 100 l/min.
[Diagram: tank X -> pump -> tank Y]
Primary function of the pump: pump fuel from tank X to tank Y at not
less than 80 l/min

Functional failure
A functional failure is defined as the inability of any asset to fulfil a
function to a standard of performance which is acceptable to the user
In this step of the RCM process you should record all the functional failures
associated with each function
Two functional failures:
1. Fails to pump any fuel at all;
2. Pumps fuel at less than 80 l/min
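
As a minimal sketch, the two functional failures of the pump can be told apart by comparing a measured flow with the 80 l/min performance standard. The function name and the idea of a single measured flow value are assumptions made for illustration.

```python
REQUIRED_FLOW_L_PER_MIN = 80.0  # performance standard from the pump example


def pump_functional_failure(measured_flow_l_per_min):
    """Return the functional failure matching the measured flow, or None if the function is fulfilled."""
    if measured_flow_l_per_min <= 0.0:
        return "Fails to pump any fuel at all"
    if measured_flow_l_per_min < REQUIRED_FLOW_L_PER_MIN:
        return "Pumps fuel at less than 80 l/min"
    return None  # at or above the required standard: no functional failure


if __name__ == "__main__":
    for flow in (0.0, 65.0, 80.0, 100.0):
        print(flow, "->", pump_functional_failure(flow))
```

Note that the failure is defined against the required standard (80 l/min), not against the pump's capability (100 l/min).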

1. When the capability falls below desired performance
There are 5 principal causes of reduced capability:
1. Deterioration
2. Lubrication failure
3. Dirt
4. Disassembly
5. Capability reducing human error
Example: manually operated valves left shut causing a process to be
unable to start, parts incorrectly fitted by maintenance or sensors set
in such a way that the machine trips out when nothing is wrong

Failure Effects
Failure effects describe what happens when a failure mode occurs.
Note: A description of failure effects should include all the information needed to support the
evaluation of the consequences of the failure.
When describing the effects of a failure, the following should
be recorded:
• What evidence (if any) there is that the failure has occurred
• In what ways it poses a threat to safety or the environment
• In what ways it affects production or operations
• What physical damage is caused by the failure
• What must be done to repair the failure

Example:
Function: Primary function of the pump: pump fuel from tank X to tank Y at not less than 80 l/min
Functional failure A: Fails to pump any fuel at all
Failure modes:
1. Bearing seizes
2. Impeller comes adrift
3. Impeller jammed by foreign object
4. Motor burns out
Functional failure B: Transfers less than 80 l per min
Failure modes:
1. Impeller worn
2. Partially blocked suction line
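
The function / functional failure / failure mode hierarchy of the pump example maps naturally onto a small data structure. The sketch below is one possible representation; the class names and the "1-A-2" style identifier are hypothetical conventions chosen for illustration, not something prescribed by the course material.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionalFailure:
    code: str                      # e.g. "A"
    description: str
    failure_modes: list = field(default_factory=list)


@dataclass
class AssetFunction:
    number: str                    # e.g. "1"
    description: str
    functional_failures: list = field(default_factory=list)


pump_function = AssetFunction(
    number="1",
    description="Pump fuel from tank X to tank Y at not less than 80 l/min",
    functional_failures=[
        FunctionalFailure("A", "Fails to pump any fuel at all", [
            "Bearing seizes",
            "Impeller comes adrift",
            "Impeller jammed by foreign object",
            "Motor burns out",
        ]),
        FunctionalFailure("B", "Transfers less than 80 l per min", [
            "Impeller worn",
            "Partially blocked suction line",
        ]),
    ],
)


def failure_mode_ids(function):
    """List (identifier, failure mode) pairs, e.g. ('1-A-2', 'Impeller comes adrift')."""
    pairs = []
    for failure in function.functional_failures:
        for index, mode in enumerate(failure.failure_modes, start=1):
            pairs.append((f"{function.number}-{failure.code}-{index}", mode))
    return pairs


if __name__ == "__main__":
    for identifier, mode in failure_mode_ids(pump_function):
        print(identifier, mode)
```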

MSG 3 – MRB
Process shown on the slide:
1. Identify each MSI (provide clear MSI definition)
2. FMEA data: function, functional failure, failure effect, failure cause, additional data
3. Apply logic (for each MSI’s functional failure and failure cause)
4. Determine if task is necessary (applicable and effective)
5. Scheduled maintenance Programme list (formed by resultant tasks and intervals)

Maintenance Significant Items MSI (MRB)
Selection begins at the highest manageable level.
Maintenance significant items (MSIs):
• Systems and assemblies
• Identified by manufacturer as items whose failure:
– Could affect safety (on ground or in flight),
and / or
– Could be undetectable or not likely to be detected during
operations,
and / or
– Could have significant operational economic impact.

The MSIs are then subjected to the MSG Analysis Methodology.
The methodology requires the identification of MSI fault conditions:
• Functions
• Functional Failures
• Failure Effects
• Failure Causes
These conditions are those identified during the Failure Modes and
Effects Analysis (FMEA) conducted during the design certification of
the product.
Maintenance Significant Items MSI (MRB)

Logic Diagram Flow starts at top
Answers Yes or No dictate analysis flow direction
First level determines consequences of failure
– Each functional failure analysed for effect on the aircraft
operation
– Identifies one effect category for each functional failure
MSG – 3 Analysis - Logic Diagram

MSG-3 Methodology
Level 2 - Applicable and Effective Maintenance Tasks
Failure Effect Category (FEC):
• FEC 5: Evident, Safety
• FEC 6: Evident, Operational Capability effects
• FEC 7: Evident, No operating Capability effects (Economic)
• FEC 8: Hidden, Safety
• FEC 9: Hidden, Non-Safety

Consequences of Failure Evaluation (Level 1)
1. Is the occurrence of a functional failure evident to the operating crew during the
performance of normal duties?
YES: go to question 2. NO: go to question 3.
2. Does the functional failure or secondary damage resulting from the functional failure
have a direct adverse effect on operating safety?
YES: safety effects. NO: go to question 4.
3. Does the combination of a hidden functional failure and one additional failure of a
system related or back-up function have an adverse effect on operating safety?
YES: hidden safety effects. NO: hidden non-safety effects.
4. Does the functional failure have a direct adverse effect on operating capability?
YES: operational effects. NO: non-operational effects.
Diagram labels: impact immediate, impact delayed

FEC 5 - Evident Safety
FEC 6 - Evident Operational Capability
FEC 7 - Evident Economic – non capability effects
FEC 8 - Hidden Safety - dormant
FEC 9 - Hidden – no Safety effects (economic)
The Level 1 analysis results in 5 failure effect categories
MSG 3 Analysis – Level 2
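
The Level 1 evaluation can be sketched as a small decision function that maps the four questions to the five failure effect categories. The routing between questions follows the usual reading of the Level 1 diagram (question 1 separates evident from hidden failures), which is an assumption here; the function and parameter names are hypothetical.

```python
def failure_effect_category(evident_to_crew,
                            direct_safety_effect=False,
                            hidden_safety_effect=False,
                            operating_capability_effect=False):
    """Return the FEC number (5 to 9) for one functional failure.

    evident_to_crew:             answer to question 1
    direct_safety_effect:        answer to question 2 (only used if evident)
    hidden_safety_effect:        answer to question 3 (only used if hidden)
    operating_capability_effect: answer to question 4 (only used if evident, no safety effect)
    """
    if evident_to_crew:
        if direct_safety_effect:
            return 5   # Evident Safety
        if operating_capability_effect:
            return 6   # Evident Operational Capability
        return 7       # Evident Economic
    if hidden_safety_effect:
        return 8       # Hidden Safety
    return 9           # Hidden, no safety effects (economic)


if __name__ == "__main__":
    # Hypothetical example: a failure the crew cannot see, which becomes a
    # safety issue only in combination with a second failure.
    print(failure_effect_category(evident_to_crew=False, hidden_safety_effect=True))
```

Each functional failure receives exactly one category, which then selects the Level 2 task-selection questions to apply.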

MSG-3
Maintenance Working Groups (MWG):
• System & Powerplant MWG (Hydr)
• System & Powerplant MWG (Engines)
• System & Powerplant MWG (Elect)
• System & Powerplant MWG (Flight Ctr)
• System & Powerplant MWG (others)
• Structure MWG
• Zonal/EZAP MWG
• L/HIRF MWG
Categories of tasks

FEC 5 - Evident Safety Effects
SAFETY EFFECTS: TASK(S) REQUIRED TO ASSURE SAFE OPERATION
5A Lubrication / Servicing: Is a Lubrication or Servicing Task applicable & effective?
5B Inspection / Functional Check: Is an Inspection or Functional Check to detect
degradation of function applicable & effective?
5C Restoration: Is a Restoration Task to reduce failure rate applicable & effective?
5D Discard: Is a Discard Task to avoid failures or to reduce the failure rate
applicable and effective?
5E Is there a Task or Combination of Tasks applicable & effective?
YES: TASK / COMBINATION APPLICABLE / EFFECTIVE - MUST BE DONE
NO: REDESIGN IS MANDATORY
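
A rough sketch of the FEC 5 task selection is given below. It assumes that all four task questions (5A to 5D) are asked regardless of earlier answers and that question 5E is satisfied when at least one task was found applicable and effective. Both points are simplifying assumptions, and the names used are hypothetical.

```python
FEC5_QUESTIONS = [
    ("5A", "Lubrication / Servicing"),
    ("5B", "Inspection / Functional Check"),
    ("5C", "Restoration"),
    ("5D", "Discard"),
]


def select_fec5_tasks(applicable_and_effective):
    """Select tasks for an evident safety failure.

    applicable_and_effective maps a question id ("5A".."5D") to True when the
    analyst judged that task type applicable and effective.
    """
    selected = [task for question_id, task in FEC5_QUESTIONS
                if applicable_and_effective.get(question_id, False)]
    if selected:
        # Simplified stand-in for question 5E (assumption, see lead-in).
        return {"outcome": "TASK / COMBINATION MUST BE DONE", "tasks": selected}
    return {"outcome": "REDESIGN IS MANDATORY", "tasks": []}


if __name__ == "__main__":
    print(select_fec5_tasks({"5B": True, "5D": True}))
    print(select_fec5_tasks({}))
```

In practice the judgement of whether a task or combination is effective for a safety failure is an engineering decision that a boolean flag cannot capture; the sketch only shows the shape of the logic.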

FEC 6 - Evident Operational Effects
OPERATIONAL EFFECTS: TASK DESIRABLE IF IT REDUCES RISK TO AN ACCEPTABLE LEVEL
Questions 6A to 6D, each answered YES or NO (task types legible on the slide: Check, Restoration, Discard)
If no task is selected: REDESIGN MAY BE DESIRABLE

FEC 7 - Evident Economic Effects
ECONOMIC EFFECTS: TASK DESIRABLE IF COST IS LESS THAN REPAIR COSTS
Questions 7A to 7D, each answered YES or NO (task types legible on the slide: Check, Restoration, Discard)
If no task is selected: REDESIGN MAY BE DESIRABLE

FEC 8 - Hidden Function Safety Effects
SAFETY EFFECTS: TASK(S) REQUIRED TO ASSURE THE AVAILABILITY NECESSARY TO AVOID
SAFETY EFFECTS OF MULTIPLE FAILURES
Questions 8A to 8F, each answered YES or NO (legible on the slide: Operational / Visual Check:
Is a Check to verify operation applicable & effective?; Check; Restoration; Discard)
YES: TASK / COMBINATION COST EFFECTIVE MUST BE DONE
NO: REDESIGN IS MANDATORY

FEC 9 - Hidden Function Non-Safety Effects
NON-SAFETY EFFECTS: TASK(S) DESIRABLE TO ASSURE THE AVAILABILITY NECESSARY TO AVOID
ECONOMIC EFFECTS OF MULTIPLE FAILURES
[Decision diagram: questions 9A to 9F, each answered YES or NO, cover checks (including
an operational / visual check: "Is a check to verify operation applicable & effective?"),
restoration and discard tasks. On NO: redesign is desirable.]

Maintenance tasks for airframe systems
1. Lubrication
• An application of lubricants or a check and replenishment of the necessary fluids.
2. Servicing
• An act of attending to basic needs of components and systems for the purpose of maintaining the inherent
design capabilities
3. Inspection
• An examination of an item against a specified standard
4. Functional Check
• A quantitative check to determine if one or more functions of an item perform within specified limits
5. Operational check
• A task that determines if an item is fulfilling its intended purpose. This is a failure finding task and does not
require quantitative tolerances
6. Visual Check
• An observation to determine if an item is fulfilling its intended purpose. This is a failure finding task and does
not require quantitative tolerances
7. Restoration
• That work necessary to return the item to a specific standard. Restoration varies from cleaning or
replacement of a single part up to a complete overhaul
8. Discard
• The removal from service of an item at a specified life limit

Task applicability and effectiveness criteria
LUB/SER
• Applicability: The replenishment of the consumable must reduce the rate of functional deterioration.
• Effectiveness (safety): The task must reduce the risk of failure.
• Effectiveness (operational): The task must reduce the risk to an acceptable level.
• Effectiveness (non-safety): The task must be cost effective.
OPC/VC
• Applicability: Identification of the failure must be possible.
• Effectiveness (safety): The task must ensure adequate availability of the hidden function to
reduce the risk of a multiple failure.
• Effectiveness (operational): N/A
• Effectiveness (non-safety): The task must ensure adequate availability of the hidden function in
order to avoid economic effects of multiple failures and must be cost effective.
FC
• Applicability: Reduced resistance to failure must be detectable and there must be a reasonably
consistent interval between a deterioration condition and functional failure.
• Effectiveness (safety): The task must reduce the risk of failure to assure safe operation.
• Effectiveness (operational): The task must reduce the risk to an acceptable level.
• Effectiveness (non-safety): The task must be cost effective, i.e., the cost of the task must be
less than the cost of the failures prevented.
RS
• Applicability: The item must show functional degradation characteristics at an identifiable age
and a large proportion of units must survive to that age.
• Effectiveness (safety): The task must reduce the risk of failure to assure safe operation.
• Effectiveness (operational): The task must reduce the risk to an acceptable level.
• Effectiveness (non-safety): The task must be cost effective.
DS
• Applicability: The item must show functional degradation characteristics at an identifiable age
and a large proportion of units must survive to that age.
• Effectiveness (safety): A safe-life limit must reduce the risk of a failure to assure safe operation.
• Effectiveness (operational): The task must reduce the risk to an acceptable level.
• Effectiveness (non-safety): The task must be cost effective.

Considerations for defining a task interval:
1. Lubrication/Servicing (failure prevention):
- The interval shall be based on the consumable’s usage rate, the amount of
consumable in the storage container (if applicable) and the deterioration
characteristics.
- Typical operating environments and climatic conditions are to be considered when
assessing the deterioration characteristics.
2. Operational Checks & Visual Checks (failure-
finding):
- Consider the length of potential exposure time to a hidden failure and the potential
consequences if the hidden function is unavailable.
- Task intervals shall be based on the need to reduce the probability of the associated
multiple failure to a level considered tolerable by the working group.
- The failure-finding task and associated interval selection process shall take into
account any probability that the task itself might leave the hidden function in a failed
state.
MSG 3 Analysis – Task Interval Selection

Considerations for defining a task interval:
3. Inspections & Functional checks (potential failure finding):
- There shall exist a clearly defined potential failure condition.
- The task interval shall be less than the shortest likely interval between the point at
which a potential failure becomes detectable and the point at which it degrades into a
functional failure. (If the specific failure data is available, this interval may be referred
to as the P to F interval.) It shall be practical to do the task at this interval.
- The shortest time between the discovery of a potential failure and the occurrence of
the functional failure shall be long enough for an appropriate action to be taken to
avoid, eliminate or minimize the consequences of the failure mode.
4. Restoration and Discard (failure avoidance):
- Intervals shall be based on the "identifiable age" when significant degradation begins
and where the conditional probability of failure increases significantly.
- Vendor recommendations based on in-service experience of similar parts shall be
taken into consideration.
- A sufficiently large proportion of the occurrences of this failure shall occur after this
age to reduce the probability of premature failure to a level that is tolerable.

Structural item tasks
Airplanes are subjected to 3 sources of structural deterioration:
1. Environmental;
2. Accidental damage;
3. Fatigue damage.
This results in three types of structural inspections:
1. General visual inspection (GVI)
2. Detailed inspection (DET)
3. Special detailed inspection (SDET, i.e. NDI)
MSG 3 Analysis – Level 2 Task Selection

Zonal maintenance tasks
The zonal maintenance Programme ensures that all systems,
components and installations contained within a specified zone
on the aircraft receive adequate surveillance to determine the
security of installation and general condition.
The Programme packages a number of general visual inspection
tasks, generated against items in the system’s maintenance
Programme, into one or more zonal surveillance tasks
MSG 3 Analysis – Level 2 Task Selection

Task interval selection is based upon data collected from:
• Global MTBUR data
• Component ‘strip’ reports
• Test data
• Extrapolated data (test / in-service)
• Operator despatch reliability
• AD / SB data
• OEM
The information needed to determine optimum intervals is ordinarily not
available until after the equipment enters service. If there is no prior
knowledge from other aircraft systems/powerplants, or if the information is
insufficient, the task interval/frequency can only be established initially by
experienced working group and steering committee personnel using good
judgment and operating experience in concert with accurate data (reliability,
redundancy, dispatch, etc.).

P to F Curve
[Figure: resistance to failure (r) plotted against time (t), with checks A1 and A2,
point P (potential failure detectable) and point F (failure: loss of availability of function).]

POTENTIAL FAILURE FINDING TASK
[Figure: deterioration plotted against the usage parameter, marking the start of deterioration,
the point where the potential failure becomes detectable, and the functional failure.]
The interval for a potential failure finding task should not be greater than the interval
between detection and functional failure.
2. Maintenance Steering Group-3 procedure

FAILURE FINDING TASK
[Figure: deterioration plotted against the usage parameter, marking the start of deterioration
and the functional failure.]
The interval of a failure finding task must prevent, for example, multiple hidden failures
from causing an unsafe condition.
Determining the maximum interval requires specific data that is known only to the TC holder.

[Figure: deterioration plotted against the usage parameter, showing where each task type applies
between the start of deterioration, the point where the potential failure becomes detectable,
and the functional failure.]
• Failure prevention task (servicing, lubrication)
• Potential failure finding task (inspection, functional check): the purpose is to find
(by GVI, DET, SDET or FC) degradation of the component or system before the occurrence of a
functional failure.
• Failure finding task (visual/operational check): the purpose of this task (OPC or VC) is to
find a failure before multiple failures can lead to a safety-related functional failure.

Reliability Measures
Various measures can be used to quantify and describe reliability
characteristics.
Basic reliability measures:
– Reliability function
– Failure function
Mission reliability measures (failures that cause mission failure):
– Mission reliability
– Maintenance free operating period (MFOP)
– Failure free operating period (FFOP)
– Hazard function

Operating reliability measures describe the performance of the system when
operated in the planned environment, including the combined effect of design,
quality, environment, maintenance, etc.
– Mean time between maintenance (MTBM)
– Mean time between overhaul
– Maintenance free operating period
– Mean time between critical failure
– Mean time between unscheduled removals
– Mean time to failure (MTTF)
– Mean (operating) time between failure (MTBF)

• Applications of the MTTF
1. MTTF is the average life of a non-repairable system
2. For a repairable system it is the average time before the first failure
• Applications of the MTBF
1. MTBF = MTTF if after repair it is as “good as new”
2. MTBF = 1/λ for exponential distribution
• Percentile life (B-life) is the life at which a certain proportion p
of the population can be expected to have failed:
$F(t) = \int_0^t f(t)\,dt = p$

• Probability (P) concepts
Any event has a probability of occurrence (range 0 - 1)
• Probability P can be defined in 2 ways
– If an event can occur in N equally likely ways, and if the event with
attribute A can happen in n of these ways, then the P of A occurring is
P(A) = n / N
– If, in an experiment, an event with attribute A occurs n times out of N
experiments, then as N becomes large, the P of event A approaches n/N:
P(A) = lim n/N (N -> ∞)
Reliability Mathematics

• Rules of Probability
– The probability of obtaining an outcome A is denoted by P(A)
– The joint P that A & B occur is denoted by P(AB)
– The P that A or B occur is denoted P(A+B)
– The conditional P of obtaining outcome A, given that B has occurred, is denoted by P(A | B)
– The P of the complement, A not occurring, is $P(\bar{A}) = 1 - P(A)$
– If P(A) is unrelated to whether or not B occurs, we say the events are
independent
– The joint P of the occurrence of 2 independent events A and B is the
product of the individual P: P(AB) = P(A) . P(B)
– For k indep. outcomes P(ABCD…) = P(A).P(B).P(C).P(D)…

• Rules of Probability
– The probability of any one of 2 events A or B occurring is
P(A+B) = P(A) + P(B) – P(AB)
– The P of A or B occurring, if A and B are independent is
P(A+B) = P(A) + P(B) – P(A) . P(B)
– Bayes’ formula: P(A | B) = P(A) . P(B | A) / P(B)

Example:
If an aircraft electrical power generation system has n channels, each with a
failure probability of 10^-3, then, assuming all channels are totally independent, the
probability of single channel and total system failure is given by:

No. of channels, n   Failure of single channel          Failure of total system
1                    10^-3                              10^-3
2                    10^-3 x 10^-3                      10^-6
3                    10^-3 x 10^-3 x 10^-3              10^-9
4                    10^-3 x 10^-3 x 10^-3 x 10^-3      10^-12
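
The total-system column is the product of the independent channel failure probabilities. A minimal Python sketch, assuming fully independent channels as in the example (the function name is illustrative only):

```python
def total_system_failure_probability(p_channel: float, n_channels: int) -> float:
    """Probability that all n independent channels fail at the same time."""
    return p_channel ** n_channels


for n in range(1, 5):
    # Each extra independent channel multiplies the probability by 10^-3.
    print(n, total_system_failure_probability(1e-3, n))
```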

• Example:
– The R of a missile is 0.85
– If a salvo of 2 missiles is fired, what is the P of at least one hit? (assume
independence of missile hits)
– Let A be the event ‘First missile hits’ and B the event ‘Second missile hits’
– Then P(A) = P(B) = 0.85
– Then $P(\bar{A}) = P(\bar{B}) = 0.15$
– There are 4 possible, mutually exclusive outcomes: $AB$, $A\bar{B}$, $\bar{A}B$, $\bar{A}\bar{B}$
– $P(\bar{A}\bar{B}) = P(\bar{A})\,P(\bar{B}) = 0.0225$
– Therefore the P of at least one hit is P = 1 - 0.0225 = 0.9775
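
The same complement rule extends to any number of independent shots. A short Python sketch of the calculation (the function name is an illustrative assumption, not part of the course material):

```python
def p_at_least_one_hit(p_hit: float, n_shots: int) -> float:
    """1 minus the probability that every independent shot misses."""
    p_all_miss = (1.0 - p_hit) ** n_shots
    return 1.0 - p_all_miss


# Two missiles with reliability 0.85 each, as in the example.
print(round(p_at_least_one_hit(0.85, 2), 4))
```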

Failure function
The failure function is the probability that an item will fail before or at the moment of
operating time t.
(t is generic: FH/FC/CT etc.)
$F(t) = \int_0^t f(t)\,dt$
f(t) is the PDF (probability density function)
Popular pdf’s :
– Exponential
– Normal
– Lognormal
– Poisson
– Weibull

• CONTINUOUS VARIATION
f(x) is the probability density of occurrences, related to the variable x.
[Figure: probability density f(x) plotted against x.]

– The value of x at which the distribution peaks is called the mode
– The area under the curve = 1 (when it covers the entire range of x)
Therefore $\int_{-\infty}^{\infty} f(x)\,dx = 1$
The probability of a value falling between 2 values x1 and x2 is the
area bounded by this interval:
$P(x_1 < x < x_2) = \int_{x_1}^{x_2} f(x)\,dx$

Measures of central tendency
– For a sample containing n items the sample mean is $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
– The mean of a distribution is usually denoted by μ:
$\mu = \int_{-\infty}^{\infty} x\,f(x)\,dx$
The median and the mode

– Spread of a distribution
The spread or dispersion (the extent to which the values which make up the
distribution vary) is measured by its variance.
For a sample of size n the variance is $\mathrm{Var}(x) = E(x-\bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$
For a finite population N the population variance is $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2$
For a continuous distribution $\sigma^2 = \int_{-\infty}^{\infty}(x-\mu)^2 f(x)\,dx$
σ is called the standard deviation

The cumulative distribution function (c.d.f.) F(x) gives the
probability that a measured value will fall between -∞ and x:
$F(x) = \int_{-\infty}^{x} f(x)\,dx$
[Figure: typical cumulative distribution function (c.d.f.) F(x) plotted against x, rising from 0 to 1.]

• Reliability function
The probability that an item will survive for a stated interval (time, cycles,
distance) is the probability that there is no failure in the interval 0 to x. This is
the reliability function R(x):
$R(x) = 1 - F(x) = \int_{x}^{\infty} f(x)\,dx$
• MTTF is the average life of a non-repairable system.
• For a repairable system, MTTF represents the average time before the first failure.

Hazard function (instantaneous failure rate)
The hazard function is an indicator of the effect of ageing on the reliability of a
system. It quantifies the risk of failure as the age increases.
Exponential: $h(t) = \lambda$
Weibull: $h(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}$
Characteristics:
1. Hazard function can be increasing, decreasing and constant;
2. Hazard function is not a probability => can be >1
=> For h(t) < 1, it is not recommended to do preventive maintenance.

Continuous distribution functions
Gaussian/Normal
Exponential
Weibull

Gaussian Distribution
Random variations are likely to be grouped around some MEAN value
in such a way that small variations are more likely than large ones.
Limiting case being the “bell-shaped” p.d.f.
The NORMAL or GAUSSIAN distribution (the most widely used model)
A population which conforms to the Normal distribution has variations
which are symmetrically disposed about the mean (skewness is zero)

• Continuous distribution functions
– The NORMAL or GAUSSIAN distribution
– It can be proved that the p.d.f. is given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$
μ is the location parameter, equal to the mean
σ is the standard deviation
Gaussian distribution

Example:
The life of an incandescent lamp is normally distributed, with mean = 1200 h
and SD = 200 h. What is the probability that a lamp will last at least 800 h? At
least 1600 h?
z(a) = (800 - 1200) / 200 = -2
z(b) = (1600 - 1200) / 200 = 2
The probability of a value not exceeding z = 2 is 0.977
For 800 h: p = 0.977
For 1600 h: p = 1 - 0.977 = 0.023
Gaussian distribution
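
The lamp example can be checked without statistical tables. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative:

```python
from statistics import NormalDist

# Lamp life: normally distributed, mean 1200 h, standard deviation 200 h.
lamp_life = NormalDist(mu=1200, sigma=200)

# P(life >= t) = 1 - F(t), where F is the cumulative distribution function.
p_at_least_800 = 1 - lamp_life.cdf(800)
p_at_least_1600 = 1 - lamp_life.cdf(1600)

print(round(p_at_least_800, 3), round(p_at_least_1600, 3))
```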

Gaussian Distribution
We can use these criteria to develop maintenance intervals and
certification limitation item intervals: how many failures are we
prepared to tolerate?
10% of all cases lie below a level 1.3 x SD from the mean
1% of all cases lie below a level 2.3 x SD from the mean
0.1% of all cases lie below a level 3.1 x SD from the mean
0.01% of all cases lie below a level 3.7 x SD from the mean
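
The multiples of SD listed above are the quantiles of the standard normal distribution. A short sketch that recovers them with `NormalDist.inv_cdf` (the helper name is an assumption for illustration):

```python
from statistics import NormalDist


def sd_multiples_below_mean(fraction: float) -> float:
    """How many SDs below the mean the given lower-tail fraction lies."""
    return -NormalDist().inv_cdf(fraction)


for fraction in (0.10, 0.01, 0.001, 0.0001):
    print(f"{fraction:.2%} of cases lie below "
          f"{sd_multiples_below_mean(fraction):.1f} x SD from the mean")
```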

– The exponential distribution describes the situation wherein the hazard rate
is constant.
– The p.d.f. is
$f(t) = \lambda e^{-\lambda t}$ for t ≥ 0
$f(t) = 0$ for t < 0
$R(t) = e^{-\lambda t}$ is the reliability function (or survival probability)
Example: the R of an item with an MTTF of 500 h over a 24 h period is
$R(24) = e^{-24/500} = 0.953$
Exponential distribution
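
A minimal sketch of the exponential reliability function, assuming a constant hazard rate λ = 1/MTTF as described above (the function name is illustrative):

```python
import math


def exponential_reliability(t: float, mttf: float) -> float:
    """R(t) = exp(-lambda * t) with a constant hazard rate lambda = 1 / MTTF."""
    return math.exp(-t / mttf)


# Item with an MTTF of 500 h operated for 24 h.
print(round(exponential_reliability(24, 500), 3))

# At t = MTTF the failed fraction is 1 - exp(-1), about 63.2 %.
print(round(1 - exponential_reliability(500, 500), 3))
```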

For items which are repaired,
– λ is called the failure rate and
– 1/λ is called the MTTF
If t = MTTF, λt = 1 and 63.2 % of items will have failed at that time
MTTF = (number of items x operating time) / total number of failures
Example: 10 items, each with an operating time of 5000 hours, and 4 failures in total
MTTF = (10 x 5000) / 4 = 12 500 hours

CONSTANT FAILURE RATE: For electronic components in their maturity period
we use a constant λ (no wear, no corrosion, no fatigue):
λ(t) = λ
Knowing λ allows any evaluation. It can be obtained from a
reliability database (predictive reliability), reliability tests (experimental
reliability) or in-service feedback (operational reliability).

A number of statistical tools are available but probably the most
commonly used tool is WEIBULL ANALYSIS
This distribution is general (2 or 3 parameters); it is used for
mechanical components as a function of their operating time.
Weibull distribution

1 The Shape Parameter β can be used to determine failure characteristics
• β<1: we get a decreasing failure/hazard rate function
Infant mortality
– Post infant mortality, component reliability improves with age; overhaul is
not appropriate

2 The Shape Parameter β can be used to determine failure characteristics
• β=1
The exponential reliability function (constant hazard rate) results,
with η = characteristic life (age at which 63.2 % will have failed)
3 The Shape Parameter β can be used to determine failure characteristics
• 1<β<4: we get an increasing hazard rate reliability function
Early wear out
– Low cycle fatigue effects
– Bearing failures
– Corrosion, erosion
– Overhaul, restoration may be appropriate as preventative maintenance
process

4 The Shape Parameter β can be used
to determine failure characteristics
• β>4: old age wear-out
– Some bearing failure modes
– Material properties
– Brittle materials (ceramics)
– Some corrosion, erosion types
– Overhaul, restoration may be
appropriate as preventative
maintenance process
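
The effect of the shape parameter can be seen by evaluating the Weibull hazard function at two ages. A minimal sketch; the characteristic life of 1000 h and the two ages are hypothetical values chosen only for illustration:

```python
def weibull_hazard(t: float, beta: float, eta: float) -> float:
    """Weibull hazard h(t) = (beta / eta) * (t / eta) ** (beta - 1), for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1)


ETA = 1000.0  # hypothetical characteristic life, in hours

for beta in (0.5, 1.0, 2.0, 5.0):
    early = weibull_hazard(100.0, beta, ETA)
    late = weibull_hazard(900.0, beta, ETA)
    if late < early:
        trend = "decreasing"
    elif late == early:
        trend = "constant"
    else:
        trend = "increasing"
    print(f"beta={beta}: h(100)={early:.2e}, h(900)={late:.2e} -> {trend}")
```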

WEIBULL analysis may be used by the TC Holder as a diagnostic tool in
order to:
• Develop corrective action for in-service failures
• Develop initial maintenance requirements at a component level
using limited test data
Small data sets
• Small quantities of data may produce markedly unsymmetrical
distributions
• Statistical analysis becomes less reliable as the quantity of data
reduces - some processes (e.g. Weibull) are more reliable than
others when dealing with small quantities of data

WEIBULL analysis may be used by operators as a tool for:
• Spare parts control (how many units are expected to fail for a given
production) so that enough spare parts are in stock
• Failure forecasting
• Measuring the effectiveness of modifications (SB policy), corrective actions and
organisational changes
• Managing the cost of unplanned failures of units that are subject to wear-out failure
modes
• Determining optimal replacement intervals for units with wear-out
failure modes

WEIBULL Characteristics:
β = 1 => MTTF = η
β > 1 => MTTF < η
β < 1 => MTTF > η
β = 0.5 => MTTF = 2η
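
These relations follow from the standard result MTTF = η · Γ(1 + 1/β) for the two-parameter Weibull distribution. A short sketch that checks them numerically and also computes a percentile (B) life; the function names and the value η = 1000 h are illustrative assumptions:

```python
import math


def weibull_mttf(beta: float, eta: float) -> float:
    """Mean life of a two-parameter Weibull: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)


def weibull_b_life(p: float, beta: float, eta: float) -> float:
    """Age by which a proportion p has failed: eta * (-ln(1 - p)) ** (1/beta)."""
    return eta * (-math.log(1.0 - p)) ** (1.0 / beta)


ETA = 1000.0  # hypothetical characteristic life, in hours
P_CHARACTERISTIC = 1.0 - math.exp(-1.0)  # about 63.2 %

for beta in (0.5, 1.0, 2.0):
    print(beta, round(weibull_mttf(beta, ETA), 1))

# The 63.2 % life equals eta whatever the shape parameter.
print(round(weibull_b_life(P_CHARACTERISTIC, 2.0, ETA), 1))
```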

Discrete variations
Binomial distribution
Poisson distribution

The binomial distribution BD (or Bernoulli D.)
Describes a situation in which there are only 2 outcomes (pass or fail) and
the probability remains the same for all trials.
The p.d.f. for the BD is
$f(x) = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$
This is the probability of obtaining x good items and n-x bad items, in a
sample of n items, when the probability of selecting a good item is p
and of selecting a bad item is q
μ = n.p
σ = √ npq
Discrete variations

The binomial distribution BD
The BD can only have values at points where x is an integer. The c.d.f. of
the binomial distribution (the P of obtaining r or fewer successes in n
trials) is given by
$F(r) = \sum_{x=0}^{r} \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$
Exercise:
An A/C landing gear has 4 tires; experience shows that tire bursts occur on
average on 1 landing in 1200. Assume that tire bursts occur independently
of one another, and that a safe landing can be made if not more than 2 tires
burst.
What is the probability of an unsafe landing?
Discrete variations
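
One way to approach the exercise is sketched below. It assumes that "1 landing in 1200" is the burst probability of each individual tire per landing; that reading is an assumption, and a different interpretation of the figure would change the result. An unsafe landing is then 3 or 4 bursts out of 4 tires.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x events in n independent trials."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


P_BURST = 1 / 1200  # assumed per-tire burst probability per landing
N_TIRES = 4

# Unsafe landing: more than 2 tires burst, i.e. 3 or 4 bursts.
p_unsafe = sum(binomial_pmf(x, N_TIRES, P_BURST) for x in (3, 4))
print(f"P(unsafe landing) = {p_unsafe:.2e}")
```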

The POISSON distribution
If we consider again the p.d.f. of the BD
$f(x) = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$
with n allowed to become very large and p to become very small, whilst the
product n.p remains finite, then
$f(x) = \frac{(np)^x}{x!}\, e^{-np}$
Discrete variations

The POISSON distribution
$f(x) = \frac{\mu^x}{x!}\, e^{-\mu}$, x = 0, 1, 2, …
where μ = np. It gives the probability of x failures in a given time or of x
defects in a batch of items.
Discrete variations
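
The limiting behaviour can be illustrated numerically by comparing the binomial and Poisson probabilities for a large n and a small p. The values n = 1000 and p = 0.002 below are hypothetical and chosen only for the comparison:

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x events in n independent trials."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def poisson_pmf(x: int, mu: float) -> float:
    """Probability of exactly x events when the expected number is mu."""
    return mu ** x / factorial(x) * exp(-mu)


N, P = 1000, 0.002  # hypothetical: large n, small p, so mu = n * p = 2
MU = N * P

for x in range(5):
    print(x, round(binomial_pmf(x, N, P), 5), round(poisson_pmf(x, MU), 5))
```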

Failure pattern A: Bathtub curve
Infant mortality caused by:
• Poor design
• Poor quality of
manufacturing
• Incorrect installation
• Incorrect operation
• Incorrect maintenance
Failure patterns

Failure pattern F
– Most common failure pattern (68%)
– Probability of failure declines with age
• Infant mortality caused by
– Poor design
– Poor quality manufacturing
– Incorrect installation
– Incorrect operation
– Incorrect maintenance
Failure patterns

Maintenance programme
Development

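[Diagram: inputs to the operator's maintenance programme (AMP)]
• Damage tolerance, MSG-3 and system safety assessments
• MRB, MPD
• ALS Part 2, ALS Part 3 (CMRs), ALS Part 4, ALS Part 5
• AD, SB, SL
• National requirements
• Vendor information
These inputs are customised into the AMP (operator's maintenance programme).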

Customized Maintenance Programme
The MPD is a document that contains tasks and intervals for a series of an aircraft
type. It therefore covers several configurations.
An operator has to filter out the configuration applicable to its fleet.
The configuration can differ:
• Engines
• Aircraft type
• Modification status
• Cabin Layout
• Weight variant
• APU
• SB status
B777-300ER
B777-200

Applicability of task
Applicability => see note:
Aircraft type
Modification: winglets installed

Applicability and description
Modification status:
PRE: Modification not
embodied
POST: Modification embodied
SB linked to the Modification
SB linked to the Modification

Applicability and description
Airframe Engine combination:
GE/PW/RR

Airworthiness Limitation Section
• Part 1 : Safe Life Airworthiness Limitations Items (SL ALI)
• Part 2 : Damage Tolerant Airworthiness Limitations Items (DT ALI)
• Part 3 : Certification Maintenance Requirements (CMR)
• Part 4 : Ageing System Maintenance (ASM)
• Part 5 : Fuel Airworthiness Limitations (FAL)

The term reliability can be used in various aspects:
– Reliability of airline activity
– Reliability of components or systems
– Reliability of processes or persons
There are however four types of reliability related to maintenance:
1. Statistical reliability
2. Historical reliability
3. Event-oriented reliability
4. Dispatch reliability
Types of reliability
Reliability is the probability that a component or system will perform a required
function for a given period of time when used under stated operating conditions.
(probability of a non failure over time)

Statistical reliability is based on the collection and analysis of failure, removal and
repair rates of systems or components. Event rates are
calculated on the basis of normalised events (per 1000 FH or
per 100 FC).
Note: statistics should have enough data points. Many books on
statistics do not consider calculations on a data set with fewer than 30 data points
very significant.
1. Statistical Reliability
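
Normalising event counts makes rates comparable between fleets and reporting periods of different size. A minimal sketch; the function names and the example figures are hypothetical and not taken from the course material:

```python
def events_per_1000_fh(events: int, flight_hours: float) -> float:
    """Event rate normalised to 1000 flight hours (FH)."""
    return 1000.0 * events / flight_hours


def events_per_100_fc(events: int, flight_cycles: float) -> float:
    """Event rate normalised to 100 flight cycles (FC)."""
    return 100.0 * events / flight_cycles


# Hypothetical month: 12 events in 8000 FH and 3 events in 600 FC.
print(events_per_1000_fh(12, 8000))
print(events_per_100_fc(3, 600))
```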

Remember: formulas will always produce a
numerical value!

Example:
Airline uses weather radar only 2 months of
the year. When we calculate the mean value
of failure rates and the alert level in the
conventional manner we will find that we are
always in alert.
Data      12 Months   2 Months
1         0           0
2         0           0
3         0           0
4         0           0
5         0           0
6         0           0
7         0           0
8         0           0
9         26          26
10        32          32
11        0           0
12        0           0
Sum       58          58
n         12          2
Avg       4.8         29
Std Dev   11.4        3
A.L.      27.6        35
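
The figures in the table can be reproduced as sketched below. Two points are inferred from the numbers and not stated in the source: the alert level (A.L.) appears to be the mean plus 2 standard deviations, and the 12-month column matches the sample standard deviation while the 2-month column matches the population standard deviation.

```python
from statistics import mean, pstdev, stdev


def alert_level(rates, k=2.0, sample=True):
    """Mean plus k standard deviations (k = 2 is inferred from the table)."""
    spread = stdev(rates) if sample else pstdev(rates)
    return mean(rates) + k * spread


monthly_rates = [0, 0, 0, 0, 0, 0, 0, 0, 26, 32, 0, 0]
used_months = monthly_rates[8:10]  # months 9 and 10, when the radar was in use

# 12-month column: the ten unused months are counted as zero failures.
alert_all = alert_level(monthly_rates, sample=True)

# 2-month column: only the months with real data.
alert_used = alert_level(used_months, sample=False)

print(round(mean(monthly_rates), 1), round(stdev(monthly_rates), 1), round(alert_all, 1))
print(mean(used_months), pstdev(used_months), alert_used)
```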

Why is this approach not valid?
This method counts 10 zero data points for months in which the equipment was not used. They
represent “no data”, but this approach treats them as “no failure”.
[Chart: failure rate per month (months 1 to 12) with the mean value and the alert level.]

What is wrong with this?
Using statistics with only 2 data points? The only meaningful value is the
mean value!
[Chart: failure rate per month (months 1 to 12) with the mean value and the alert level.]

Historical reliability compares current event rates with those of past experience.
The previous example would be better analysed by this type of reliability.
This type can be used when new equipment or a modification is introduced and
there is no data available on event rates.
After sufficient data has been collected to determine a norm, the equipment
can then be added to the statistical reliability Programme.
2. Historical reliability

Event-oriented reliability is in general concerned with :
– Hard landings;
– In-flight shutdowns;
– Lightning strikes
– Etc.
Each occurrence must be investigated to determine the cause and to
prevent/reduce the possibility of recurrence
3. Event-oriented Reliability

In ETOPS operations, certain events that relate to the successful conduct of ETOPS
flights are designated by the NAAs as actions to be tracked by this type of
reliability, in addition to statistical and historical reliability.
3. Event-oriented Reliability

Dispatch reliability is a measure of the overall effectiveness of the
airline operation with respect to on-time departure.
Some airlines, however, track only dispatch reliability. This can result in tracking
and investigating only problems that result in delays.
Example 1:
The pilot in command experiences a problem with the rudder controls 2 h before landing and writes
it up in the ATL. Maintenance checks the ATL => troubleshooting => maintenance. The maintenance
process takes more time than the scheduled turnaround time and results in a delay.
This delay is charged to maintenance.
Is this the proper response?
Did maintenance cause the delay?
Did the rudder equipment cause the delay?
Or is it poor airline procedures?

Overemphasizing dispatch delays means that some airlines will investigate every
delay (which they should), but if an equipment problem is involved, the
investigation may or may not take into account other similar failures that did
not cause delays.
Example 2:
An operator has 12 write-ups of rudder problems during the month and only one caused
a delay.
The operator should investigate 2 problems:
1. The delay, which could be caused by a problem other than the rudder equipment
2. The 12 rudder write-ups, which may be related to a maintenance problem.

Possibility:
• The delay is an event-oriented reliability problem
• The 12 rudder problems (if identified as high failure rate) should be addressed by
the statistical or historical reliability Programme.

Administration and Management of
the Reliability Programme

DATA SOURCES
Line Maintenance
• Defects
• Component removals
Airworthiness Authorities
• ADs
• Mandatory mods
Flight crew
• Pilot reports
• Delays / cancellations
• In-flight shut-downs
Workshop Reports
• Corrective actions
• Unjustified removals
Aircraft Manufacturer
• SBs
Equipment manufacturer
• SBs
• Modifications
MANAGEMENT
Airworthiness Authority
Postholder Maintenance and Engineering Director
Reliability Department/Section
• Analysis of all incoming data
• Maintenance of data records
• Presentation of information to the Reliability Control Committee
Reliability Management Board members
• Quality Assurance Manager
• Engineering Manager
• Reliability Engineers
• Fleet managers
• Maintenance manager
• Flight ops representative
Corrective action
• Request for escalations
• Modifications
• Maintenance procedures
• Workshop procedures
• Product improvement
• Provisioning
• Training, Publications
• Etc.

Every program is required to have a controlling body (Reliability Management Board)
which is responsible for the implementation, decision making and day-to-day running
of the programme.
It is essential that the Reliability Management Board should ensure that the
programme establishes not only close cooperation between all relevant departments
and personnel within the Operator’s own Organisation, but also liaison with other
appropriate Organisations. Lines of communication are to be defined and fully
understood by all concerned.
Programme Control
The Reliability Management Board is responsible for, and will have full authority to
take, the necessary actions to implement the objectives and processes defined in the
programme.

• The Board should meet frequently to review the progress of the
programme and to discuss and, where necessary, resolve current
problems. The Board should also ascertain that appropriate action is being
taken, not only in respect of normal running of the programme, but also in
respect of corrective actions
• Formal review meetings could be held with the CAA at agreed intervals to
assess the effectiveness of the programme. An additional function of the
formal review meeting is to consider the policy of, and any proposed
changes to, the programme.
Programme Control

• Data collection
• Problem detection (an alerting system)
• Setting and adjusting alert levels
• Reading alert status
• Data display
• Data analysis
• Corrective actions
Elements of the Reliability Programme

Data will vary in type according to the needs of each Reliability programme.
1. Data in respect of systems and sub-systems will utilise inputs from
reports by pilots, reports on engine unscheduled shut-downs and also,
perhaps, reports on mechanical delays and cancellations.
2. Data in respect of components will generally rely upon inputs from
reports on component unscheduled removals and on workshop reports.
Data collection
The principle behind the data collection process is that the information has
to be adequate to ensure that any adverse defect rate, trend, or apparent
reduction in failure resistance, is quickly identified for specialised attention.

The data types normally collected are:
1. Flight time and cycles for each aircraft
2. Pilot reports or logbook write-ups
3. Cancellations and delays over 15 minutes
4. Unscheduled component removals
5. Unscheduled engine removals
6. In-flight engine shutdowns
7. Cabin logbook write-ups
8. Component failures (shop maintenance)
9. Maintenance check package findings
10. Critical failures
Data collection

Data collection
Area: OPERATIONAL STATISTICS
Typical source:
• Airplane Flight Log
• Flight Operations Systems
Data elements:
• Airplane hours
• Airplane cycles (landings)
Area: PIREPs (Pilot Deficiency Reports)
Typical source:
• Airplane Maintenance Logs
Data elements:
• Airplane Registration Number
• Flight Number
• Date and Station
• Problem description & corrective action
• ATA code
• Part numbers/serial numbers of removed and installed components
• Mechanic’s identification
Area: DELAYS & CANCELLATIONS
Typical source:
• Delay Reports (from Line Stations or Maintenance Control Center)
Data elements:
• Airplane Registration Number
• Flight Number
• Length of Delay
• ATA code
Area: UNSCHEDULED ENGINE REMOVALS
Typical source:
• Airplane Maintenance Logs
• MCC Reports
• Engine Shop Reports
Data elements:
• Engine Serial Number
• Reason for Removal, including Basic or Non-Basic Failure
• Shop Findings

Area: ENGINE INFLIGHT SHUT-DOWNS (IFSD)
Typical source:
• MCC Reports
• Engine Shop Reports
Data elements:
• Airplane Registration
• Engine Serial Number
• Reason for Shutdown, including Basic or Non-Basic Failure
• Corrective action
• Shop Findings
Area: ROTABLE COMPONENT REMOVALS
Typical source:
• MCC Reports
• Component Tracking & Routing Tags
Data elements:
• Airplane Registration
• Part Number / Serial Number
• Installation / Removal Date
• Reason for Removal
• ATA code
• Times: Since New, Overhaul, or Repair
• Shop Findings
Area: SERVICE DIFFICULTY / OCCURRENCE REPORTS
Typical source:
• Service Difficulty / Occurrence Reports to Regulatory Authority
• Quality Reports
• MCC Reports
Data elements:
• Part Number / Serial Number of Parts involved
Data collection

Area: SIGNIFICANT NON-ROUTINE FINDINGS
Typical source:
• Non-Routine Write-Ups (Task-Cards)
Data elements:
• Airplane Registration Number
• Date
• Type of Check
• Task Card Number (MRI)
• ATA Code
• Description Finding & Corrective Action
• Part Number / Serial Number of Parts involved
• Mechanic’s identification
Data collection
Problems with data:
• Censored or suspended data
• Mixture of failure modes
• Non zero time origin
• Unknown ages of successful units
• Extremely small samples
• No failure data
• Early data missing
• Inspection data (interval data for example: hidden failures)

• Pireps are reports of occurrences and malfunctions entered in the Aircraft
Technical Log by the flight crew for each flight. Pireps are a significant source of
information, since they result from operational monitoring by the crew and are
thus a direct indication of aircraft reliability as experienced by the flight crew.
• Technical Log entries should be routed to the Reliability Section at the end of each
day, or at some other agreed interval. Pireps should be monitored on a continuous
basis. At the end of the prescribed reporting period they are normalised to a set
base to give a reliability statistic for comparison with the established Alert Levels,
e.g. Pirep rate per 1,000 flight hours or number of Pireps per 100 departures.
• Engine performance monitoring can also be covered by the Pirep process in a
programme. Flight crew monitoring of engine operating conditions is, in many
programmes, a source of data in the same way as reports on system malfunctions.
Pilot reports (Pireps)

Pilot reports (Pireps)
Pireps with
reference
data

These are flight crew reports of engine shut-downs and usually include details of the
indications and symptoms prior to shut-down. When analysed, these reports provide
an overall measure of propulsion system reliability, particularly when coupled with
the investigations and records of engine unscheduled removals.
Engine Shut-downs

• These are normally daily reports, made by the Operator’s line
maintenance staff, of delays and cancellations resulting from mechanical
defects. Normally each report gives the cause of the delay and clearly
identifies the system or component in which the defect occurred. The
details of any corrective action taken and the period of the delay are also
included (standard IATA Delay Codes).
• The reports are monitored by the Reliability Section, classified in
ATA 100 Chapter sequence, recorded and passed to the appropriate
engineering staff for analysis. At prescribed periods, recorded delays and
cancellations for each system are plotted.
Aircraft Mechanical Delays and Cancellations

The algorithms used to represent the data depend on the operator's needs.

At the end of the reporting period the unscheduled removals and/or confirmed failure
rates for each component are calculated to a base of 1,000 hours flying, or, where
relevant, to some other base related to component running hours, cycles, landings,
etc.
Every component unscheduled removal is reported and should normally
include the following information:
i) Identification of component.
ii) Precise reason for removal.
iii) Aircraft registration and component location.
iv) Date and airframe hours/running hours/landings, etc. at removal.
v) Component hours since new/repair/overhaul/calibration.
Component Unscheduled Removals and Confirmed Failures
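
As an illustration only, the five items listed above could be held in one record per removal. The class name, field names and the `confirmed_failure` flag are assumptions made for this sketch and are not taken from any particular operator's system.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class UnscheduledRemoval:
    """One unscheduled component removal (hypothetical field names)."""
    part_number: str                  # i) identification of component
    serial_number: str
    reason_for_removal: str           # ii) precise reason for removal
    aircraft_registration: str        # iii) aircraft registration ...
    position: str                     # ... and component location
    removal_date: date                # iv) date at removal
    airframe_hours: float             # iv) airframe hours at removal
    hours_since_new: Optional[float] = None       # v) component hours
    hours_since_overhaul: Optional[float] = None
    confirmed_failure: Optional[bool] = None      # filled in after shop findings


def confirmed_failures(records: List[UnscheduledRemoval]) -> List[UnscheduledRemoval]:
    """Keep only the removals that the shop confirmed as failures."""
    return [r for r in records if r.confirmed_failure]
```

Leaving `confirmed_failure` empty until the shop report arrives keeps removals and confirmed failures countable separately, which matches the two rates named in the text.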

Component Unscheduled Removals and Confirmed Failures
Problems with data:
• Censored or suspended data
• Mixture of failure modes
• Non zero time origin
• Unknown ages of successful units
• Extremely small samples
• No failure data
• Early data missing
• Inspection data (interval data for example: hidden failures)

Problem detection
• General
• Performance Parameters
• Alert Definitions
• General
• Alert Status
• Upper Control Limits
• Upper Control Limit determination
• Revision of the Control Limits
• Example
• Dataflow

General
The performance of airplane systems, components and power plants is monitored
with an Alerting System which assists in the assessment of reliability. The system
uses Upper Control Limits (UCL) to identify unacceptable performance.
To make data comparable, the Reliability programme should use normalised units
of measurement.
The choice of units of measurement will depend on the type of operation, the
preference of the Operator and the units required by the equipment manufacturer.
Too much importance should not be placed upon the choice of units of measurement, provided
that they are constant throughout the time the Reliability programme runs and are appropriate
to the type and frequency of the event.

Examples of units of measurement
SYSTEMS – Pilot reports per 100 landings
– Delays and cancellations per 1,000 departures
COMPONENTS – Removals per 1,000 unit hours
– Failures per 1,000 unit hours
POWER PLANT – In-flight shutdowns per 1,000 engine hours
– Unscheduled removals per 1,000 engine hours
AIRPLANE – Repeat pilot reports
– Service Difficulty / Occurrence Reports
– Significant non-routine findings from heavy checks
STRUCTURES – Service Difficulty Report (structural only)
– Significant non-routine findings from letter checks

Performance parameters
Utilisation
\[
\text{Daily utilisation} = \frac{\text{Total flight hours in month}}{\text{Number of days in month}}
\]
\[
\text{Average sector length} = \frac{\text{Total flight hours per month}}{\text{Total landings per month}}
\]

Pilot reports (PIREPS)
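\[
\text{Pilot report rate} = \frac{\text{Number of pilot reports}}{\text{Aircraft flight hours}} \times 1000
\]
\[
\text{Alert Level (12-month basis)} = \text{Mean}(\text{3 months rate over 12 months}) + 3\sigma
\]
\[
\text{Alert Level (last-period basis)} = \text{Mean}(\text{3 months rate over last period}) + 3\sigma
\]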
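
A minimal sketch of the pilot report rate and the "mean of the 3-month rates plus 3σ" alert level. Two choices here are assumptions rather than something the slides state: each 3-month running rate pools the events and hours of the three months, and σ is the population standard deviation (divisor N). The function names are illustrative.

```python
from statistics import fmean, pstdev


def rate_per_1000(events, flight_hours):
    """Events per 1,000 aircraft flight hours."""
    return events / flight_hours * 1000


def rolling_3_month_rates(monthly_events, monthly_hours):
    """3-month running rates, pooling events and hours over each window (assumption)."""
    return [
        rate_per_1000(sum(monthly_events[i - 2:i + 1]), sum(monthly_hours[i - 2:i + 1]))
        for i in range(2, len(monthly_events))
    ]


def alert_level(rates, k=3.0):
    """Mean of the rates plus k population standard deviations."""
    return fmean(rates) + k * pstdev(rates)
```

With 14 months of counts and hours, `rolling_3_month_rates` yields 12 running rates, which can then be passed to `alert_level`.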

Component Reliability
\[
\text{Alert Level} = \text{Mean removal rate}(\text{3 months rate over last 8 quarters}) + 3\sigma
\]
\[
\text{Removal rate} = \frac{\text{Number of unscheduled removals}}{\text{Aircraft flight hours} \times \text{Number of parts installed on aircraft}} \times 1000
\]
\[
\text{MTBUR} = \frac{\text{Aircraft flight hours} \times \text{Number of parts installed on aircraft}}{\text{Number of unscheduled removals}}
\]
\[
\text{IFSD rate} = \frac{\text{Number of IFSD}}{\text{Number of engines} \times \text{Average flight hours}} \times 1000
\]
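
The component and engine ratios can be sketched as follows: removals per 1,000 unit hours, the mean time between unscheduled removals (MTBUR), and in-flight shutdowns per 1,000 engine hours. Function and parameter names are illustrative assumptions; unit hours are taken as aircraft flight hours multiplied by the quantity installed per aircraft.

```python
def removal_rate(unscheduled_removals, aircraft_flight_hours, qty_per_aircraft):
    """Unscheduled removals per 1,000 unit hours."""
    unit_hours = aircraft_flight_hours * qty_per_aircraft
    return unscheduled_removals / unit_hours * 1000


def mtbur(unscheduled_removals, aircraft_flight_hours, qty_per_aircraft):
    """Mean time between unscheduled removals, in unit hours."""
    unit_hours = aircraft_flight_hours * qty_per_aircraft
    return unit_hours / unscheduled_removals


def ifsd_rate(shutdowns, engine_count, average_flight_hours):
    """In-flight shutdowns per 1,000 engine hours."""
    return shutdowns / (engine_count * average_flight_hours) * 1000
```

The removal rate and MTBUR are reciprocal views of the same data: their product is always 1,000.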

Dispatch Reliability
\[
\text{Dispatch Reliability} = \frac{\text{Number of departures} - \text{Number of technical delays}}{\text{Number of departures} - \text{Number of cancellations}} \times 100
\]
\[
\text{Cancellation Rate} = \frac{\text{Number of cancellations}}{\text{Number of departures} + \text{Number of cancellations}} \times 100
\]
\[
\text{Dispatch Reliability Loss} = \frac{\text{Number of technical events}}{[\ldots]} \times 100
\]
\[
\text{Delay Rate} = \frac{\text{Number of technical delays}}{[\ldots]} \times 100
\]
\[
\text{Alert Level (12-month basis)} = \text{Mean}(\text{3 months rate over 12 months}) + 3\sigma
\]
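
A short sketch of the two dispatch figures whose terms are complete on this slide: dispatch reliability as (departures minus technical delays) over (departures minus cancellations), and the cancellation rate as cancellations over (departures plus cancellations), both as percentages. Function and parameter names are illustrative assumptions.

```python
def dispatch_reliability(departures, technical_delays, cancellations):
    """Dispatch reliability in percent, using the terms given on the slide."""
    return (departures - technical_delays) / (departures - cancellations) * 100


def cancellation_rate(departures, cancellations):
    """Cancellations as a percentage of departures plus cancellations."""
    return cancellations / (departures + cancellations) * 100
```

For example, 1,000 departures with 20 technical delays and no cancellations gives a dispatch reliability of 98%.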

• Alert Levels should be based on the number of events which have
occurred during a representative period of safe operation of the fleet.
They should be updated periodically to reflect operating experience.
• When establishing Alert Levels, the normal period of operation taken is
between 2 and 3 years, depending on fleet size and utilisation. The Alert
Levels will usually be calculated for events recorded in one-monthly or
three-monthly periods of operation.
Alert Levels

Where there is insufficient operating experience, or a programme for a new
aircraft type is being established, the following approaches may be used:
• All significant malfunctions should be considered and investigated.
Although Alert Levels may not be in use, programme data will still be
accumulated for future use.
• For a new Operator, the experience of other Operators may be used until
the new Operator has accumulated a sufficient period of its own
experience. Alternatively, experience gained from operation of a similar
aircraft model may be used.
• Alert Levels may be set from computed values based on the in-service
reliability of systems and components assumed in the design of the
aircraft (this is, however, not common because of lack of design data).
Alert Levels

• A reliability alert level is purely an 'indicator' which when
exceeded indicates that there has been an apparent
deterioration in the normal behavior pattern.
• When an Alert Level is exceeded an appropriate action has to
be taken. It is important to realise that Alert Levels are not
minimum acceptable airworthiness levels.
• In the case of a system designed to a multiple Redundancy
philosophy it has been a common misunderstanding that, as
Redundancy exists, an increase in failure rate can always be
tolerated without corrective action being taken.
Remarks

Reliability Alert levels
Normal performance is determined by comparing current to past
performance.
• Upper Control Limits (UCL’s) are assigned to each parameter to
describe desirable or undesirable trends.
• The UCL is a rate of occurrence which, if exceeded, can trigger an
investigation and, if necessary, corrective actions.
• An alert exists whenever the monthly or the three month average
rate of occurrence exceeds the UCL. Several stages of alert status
exist according to the combination of rates that exist and whether
or not the trend is improving or deteriorating.

Alert Definition – Alert Status
• CLEAR – Normal operating status. Clear status exists when both the
monthly and the three-month average rates remain below the UCL.
• YELLOW – Yellow status exists when two consecutive monthly rates exceed
the UCL while the three-month average remains below the UCL. The status
warns of a possible alert condition in the following month.
• RED – Red status exists when the three-month average exceeds the UCL.
• REMAINS IN ALERT – Remains in Alert status exists when two or more
consecutive three-month rates exceed the UCL and the monthly rate is
equal to or greater than the previous month’s rate.
• WATCH – Watch status exists after an initial alert where the
following monthly rate has shown an improvement.
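
The status definitions above can be sketched as a small classifier for the latest month. This is one possible reading, not a prescribed algorithm: WATCH is applied only while the three-month average is still above the UCL, and a single monthly exceedance with the average below the UCL, which the definitions do not name, gets the hypothetical label `SINGLE EXCEEDANCE`.

```python
from statistics import fmean


def alert_status(monthly_rates, ucl):
    """Classify the most recent month; needs at least four monthly rates."""
    if len(monthly_rates) < 4:
        raise ValueError("need at least four monthly rates")
    current, previous = monthly_rates[-1], monthly_rates[-2]
    avg3_current = fmean(monthly_rates[-3:])
    avg3_previous = fmean(monthly_rates[-4:-1])

    if avg3_current > ucl:
        if avg3_previous > ucl:
            # Second or later consecutive three-month exceedance.
            return "REMAINS IN ALERT" if current >= previous else "WATCH"
        return "RED"
    if current > ucl and previous > ucl:
        return "YELLOW"
    if current > ucl:
        return "SINGLE EXCEEDANCE"  # hypothetical label, not defined in the text
    return "CLEAR"
```

For instance, `alert_status([10, 10, 16, 16], 15)` gives YELLOW: both recent months exceed 15 while the three-month average is 14.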

Upper Control Limits. UCL’s are determined by standard deviation
calculations. Standard deviation is the accepted industry
standard.
• Twelve months of data is normally used to determine UCL’s
• The UCL should not be set so high that a major increase in the
failure rate does not produce an alert.
• The UCL should not be set so low that a normal distribution
of failures results in excessive alerts.
Upper Control Limits

• The UCL is usually set at 2, 2.5 or 3 standard deviations (SD) above the mean level.
• The actual setting of the UCL/Alert Level will depend upon the
distribution or “scatter” observed in the report or failure rates.
• A UCL of 2 times the standard deviation is generally effective for most
failure patterns. It can, however, produce excessive alerts when applied
to a system with widely dispersed report rates, while it is not sufficiently
sensitive to produce alerts when applied to data with narrow dispersion.
Using 2 standard deviations, the probability of an alert resulting from
the scatter of a normal distribution is approximately 4.5% (counting both
tails; about 2.3% for exceedances above the mean only).
• A UCL of 3 times the standard deviation is most suited to narrowly
scattered data (small standard deviation). Using 3 standard deviations,
the probability of an alert is about 0.3% (both tails; about 0.13% above
the mean only).
• In general the UCL must be set at a level which produces a reasonable
number of alerts.
Upper Control Limits
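
The quoted false-alert probabilities can be checked against the normal distribution. The sketch below assumes normally distributed rates and reports both the two-tailed figure (which matches the 4.5% and 0.3% quoted) and the upper-tail figure, which is the one that applies when only exceedances above the UCL raise an alert.

```python
import math


def two_sided_tail(k):
    """P(|Z| > k) for a standard normal variable."""
    return math.erfc(k / math.sqrt(2))


def upper_tail(k):
    """P(Z > k) for a standard normal variable."""
    return 0.5 * math.erfc(k / math.sqrt(2))


for k in (2.0, 2.5, 3.0):
    print(f"k={k}: two-sided {two_sided_tail(k):.4%}, upper only {upper_tail(k):.4%}")
```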

Example 1 – Pilot Reports (Pireps) by Aircraft System per 1,000 Flight Hours
Method: Alert Level per 1,000 flight hours = Mean of the 3 monthly Running Average
'Pirep' Rates per 1,000 flight hours (for past 12 months) plus 3 Standard Deviations.
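N (months) = 12
System: Aircraft Fuel System (ATA 100, Chapter 28)
\[
\text{Mean } \bar{x} = \frac{\sum x}{N} = \frac{236}{12} = 19.67
\]
\[
\text{SD} = \sqrt{\frac{\sum (x - \bar{x})^{2}}{N}} = \sqrt{\frac{104}{12}} = 2.94
\]
\[
\text{Alert Level} = \text{Mean} + 3\,\text{SD} = 19.67 + 3 \times 2.94 \approx 28.5, \text{ rounded up to } 29
\]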
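
The arithmetic of Example 1 can be re-traced from the two sums quoted on the slide (236 and 104). The twelve individual rates are not reproduced here, so only the summary figures are used; the variable names are illustrative.

```python
import math

n_months = 12
sum_x = 236.0        # sum of the twelve 3-monthly running rates (from the slide)
sum_sq_dev = 104.0   # sum of squared deviations from the mean (from the slide)

mean_x = sum_x / n_months               # about 19.67
sd = math.sqrt(sum_sq_dev / n_months)   # population SD (divisor N), about 2.94
alert_level = mean_x + 3 * sd           # about 28.5; the slide rounds this up to 29

print(round(mean_x, 2), round(sd, 2), round(alert_level, 1))
```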

Recalculation of Alert Levels
• The Reliability Section should recalculate UCL’s every 12 months.
• Increases will normally be limited to 10% above the previous value.
• All UCL revisions greater than 10% are to be agreed upon by the
Reliability Control Committee.
• If a system remains above the established UCL for a period of three
months and investigation demonstrates that the UCL is incorrect, the
UCL may be re-established.
Changes in Alert Levels are not required to be approved by the CAA, but
the procedures, periods and conditions for re-calculation should be defined
in the Reliability programme.
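
The 10% rule can be sketched as a small helper. It assumes that only increases are capped and that any revision of more than 10% in either direction is referred to the Reliability Control Committee; the slide does not say how decreases are treated, and the function name is illustrative.

```python
def revise_ucl(previous_ucl, recalculated_ucl, max_step=0.10):
    """Return (proposed UCL, needs_committee_agreement)."""
    change = (recalculated_ucl - previous_ucl) / previous_ucl
    needs_committee = abs(change) > max_step
    # Without committee agreement, an increase is held at 10% above the old value.
    proposed = min(recalculated_ucl, previous_ucl * (1 + max_step))
    return proposed, needs_committee
```

A recalculated value of 25 against a previous UCL of 20 would be held at 22 and flagged for the committee.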

Reliability Data may be displayed in the form of Charts,
Graphs, Data Tables.
Graphs are commonly used to provide a pictorial display of
System Performance by ATA Chapter.
Alert levels are displayed on the graph to allow
instantaneous comparison between system performance
and alert level.
Displays are a ‘Management Tool’ intended to provide a
snapshot view of Reliability Information.
Reliability Data Displays (1)

• Reliability Engineers should be constantly reviewing:
– Safety Effects & criteria
– Trends (Hidden & Apparent)
– Rates of Change
– Alert Levels
– Utilisation vs. Display Criteria
– Impending Alert Level Exceedances
– Availability
Reliability Data Displays (2)

The procedures for data analysis should make it possible to measure the
performance of the items controlled by the programme.
They should also facilitate recognition, diagnosis and recording of significant
problems.
Such a process may involve:
• Comparisons of operational reliability with established standards
• Analysis and interpretation of trends
• The evaluation of repetitive defects
• Confidence testing of expected and achieved results
• Studies of life-bands and survival characteristics
• Reliability predictions
• Other methods of assessment
Data Analysis

The range and depth of engineering analysis and interpretation should be
related to the particular programme and to the facilities available.
The following should be taken into account:
• Flight defects and reductions in operational reliability
• Defects occurring at line and main base
• Deterioration observed during routine maintenance
• Workshop and overhaul facility findings
• Modification evaluations
• Sampling programmes
• The adequacy of maintenance equipment and technical publications
• The effectiveness of maintenance procedures
• Staff training
• Service Bulletins (SB), technical instructions, etc.
Data Analysis

Priorities:
Focus your reliability programme on money and/or safety
issues, beginning with a Pareto priority list. The work
list is made from recent problems and the threat of
failures.
• Focus on the 10-20% of the Pareto items containing
60-80% of the money issues.
• Work on Pareto things first, not your love affairs.
• In the end, it’s all about the money! Where’s your
focus: R & M or R & S or S & M?
Data Analysis
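
A Pareto priority list can be sketched by ranking items by cost and keeping the smallest leading group that covers a chosen share of the total. The 80% default and the idea of keying costs by ATA chapter are assumptions for illustration; any cost or safety weighting could be ranked the same way.

```python
def pareto_top(costs, share=0.8):
    """Return the highest-cost items that together cover `share` of the total cost."""
    total = sum(costs.values())
    ranked = sorted(costs.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for item, cost in ranked:
        if running >= share * total:
            break
        selected.append(item)
        running += cost
    return selected


# Hypothetical annual cost of unscheduled events per ATA chapter.
example_costs = {"ATA 21": 50, "ATA 29": 30, "ATA 32": 10, "ATA 34": 6, "ATA 38": 4}
print(pareto_top(example_costs))
```

In this made-up data set two of the five chapters carry 80% of the cost, so they form the work list.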

Corrective actions should correct any reduction in reliability
revealed by the programme and could take the form of:
• Changes to maintenance, operational procedures or techniques
• Maintenance changes involving inspection frequency and content,
function checks, overhaul requirements and time limits, which will
require amendment of the scheduled maintenance periods or tasks in
the approved maintenance programme. This may include escalation or
de-escalation of tasks, addition or modification or deletion of tasks
• Amendments of manuals (e.g. Maintenance Manual, Crew Manual)
• Initiation of modifications
• Special inspections or fleet campaigns
• Spares provisioning
• Staff training
• Manpower and equipment planning
Corrective Actions

Data flow
DATA COLLECTION
• Pilot Reports
• Delays & Cancellations
• Component Removals
• Inspection Findings
• Shop Findings
• Power Plant Data
• Structural Data
• Flight Statistics
• OMT / QAR Data
DATA DISPLAY,
REPORTS AND
ANALYSIS
• Statistical Reports
• Trend Monitoring
• Fleet Campaign
• Validation Study
• Historical Data
CORRECTIVE ACTION AND
ANALYSIS
• Modify Maintenance Process
• Modify Maintenance Tasks
• Service Bulletin Modifications
• Correct Maintenance Procedures
• Modify/Correct Shop Procedures
• Escalate/Reduce Intervals
• Adjust Inventory Levels

Analysis and Assessment of reports can lead to review of:
Integrated Maintenance Control Procedures
SB Compliance
Programme Effectiveness
Maintenance Procedures
Operating Procedures
Utilisation
Lubrication
Training programmes
Trends
Repetitive Defects
Sampling programmes
Supplier Evaluation

Benefits of Integrated Reliability Programme
A well resourced and integrated programme will benefit operators.
Some of the benefits are as follows:
• Effective SB adoption policy
• Warranty issues
• Uneconomic Task Deletion
• Controlled Programme Escalation
• Spares Provisioning Control
• Maintenance / Operations Planning
• Reliability improvements
• Improved Economic Planning
• More Effective Procedures

Benefits of Integrated Reliability Programme
A well resourced and integrated programme will benefit operators.
Some of the benefits are as follows:
• Spare parts control (how many units are expected
to fail for a given production), so that you have
enough spare parts in stock
• Measure the effectiveness of modifications (SB policy),
corrective actions and organisational changes
• Manage the cost of unplanned failures which are subject
to a wear-out failure mode
• Determination of optimal replacement intervals for
units with wear-out failure modes

Small fleet Reliability Programme

• Reliability programme management – policy, organisation and
procedures
• TC holder Global data returns
• Despatch Reliability & reliability targets
• Corrective action policy and procedures – event monitoring
• MEL utilisation rates
• Component strip report policy
• Assessment of safety, capability, economic effects
• Reliance on engineering judgement
Characteristics of a small fleet reliability programme

What happens if the operator has only one or two airplanes and has no
pooling options?
Part M “requires” a reliability programme!
Can we run a reliability programme with only one or two airplanes?
Sufficient data?
Alert levels?
Fleet Pooling Considerations: small fleet

• One event is not remarkable, unless the failure itself has a safety or
operating capability effect
• Each event has to be analysed:
– safety effects
– operating capability effects
• Engineering Judgement!
Fleet Pooling Considerations: small fleet

• How many failures can we accept by ATA Chapter?
• Despatch Reliability may be an appropriate measure
– How much effect are the component/system failures having on the
operation?
– How many flights are being conducted with deferred defects (DD) in
accordance with the MEL?
• Set Despatch Reliability targets!
• Set MEL usage targets!
• These targets become Alert Levels
Fleet Pooling Considerations
• What are the Global fleet despatch
reliability returns?
• Consult the TC Holder

Assessment of Effectiveness
(Small Fleet)
Engineering judgement, based on:
• Tech Log Defects / PIREP
• Carry Forward Defects
• Check Findings
• SB
• AD
• Occurrence Report
• Strip Reports
• Knowledge of type
When a component, system or structure continually leads to unscheduled
removals or repairs, this is an indication that scheduled
maintenance is not adequate.

Scheduled Maintenance may not be specified.
Repetitive unscheduled removals should initiate a review of maintenance
procedures and/or consultation with the TCH/OEM.
A marked reduction of reliability to unacceptable levels may lead to an
additional review of operating procedures.
This should be taken into account when
• auditing Maintenance Programme effectiveness,
• auditing Organisations managing Programmes for Operators
Assessment of Effectiveness
(Small Fleet)

An effective statistical reliability programme requires relatively large quantities
of data in order to suppress instantaneous effects and distortions.
Fleets may be pooled in order to increase the size of the data set.
TC Holder advice/support may be required.
Reliability Programme - Data and Information Sources
Considerations for pooling data
• Aircraft Type (TC DS) - Commonality
• Aircraft Age
• Modification Standard
• Operating Environment
• Utilisation
• Respective Fleet Size
• Operating Rules
• Operating Procedures
• Maintenance Procedures
• Maintenance Standards
• Lubrication Programme

• Operators are required to have an effective maintenance
programme:
– Safe operation (Part M, M.A.302)
– Economic operation
• Scheduled maintenance programme optimisation (Revision /
Amendment) should be anticipated throughout the operational
life of an aircraft.

Optimisation can result in the following changes to the
Operator’s Maintenance programme:
• Change to compliance interval category (FH, FC,
Calendar, Check, etc.)
• Escalation or reduction of compliance interval
• Revision of tasks or processes (GVI, FC, OPC,
CM, OC, HT, etc.)
• Revision of accomplishment instructions
• Deletion of Task or Process
• Revision of workscope
• Addition of tasks

• Changes to a scheduled maintenance programme should be
made in an incremental and controlled manner where
possible, by ‘Trial Extension’ or sampling programmes.
• Increment to be agreed with CAA.
• ‘Rate of Change’ may be high for a new type recently
introduced to service.
• ‘Rate of Change’ more conservative for old types or
inexperienced operator. TC holder comment for escalations is
advised to be sought by the operator.
• Differences in configurations should be taken into account

• ‘NIL Defects’ is not, in itself, justification for escalation or
deletion.
• Justification should include audit of:
– Maintenance Standards (line and base)
– Data Collection
– Data Processing
– Deferred / Carry Forward Defects
– Aircraft Utilisation

Sample size calculation (using IP44):
Light maintenance (for example ≤ 4 years)
(note: the consecutive tasks should be checked to assess the reliability of the
task)
For light maintenance a statistical formula can be used to
determine the sampling size:
\[
m = \frac{Z_{\alpha/2}^{2}\, p(1 - p)}{c^{2}}
\]
Maintenance schedule escalation

\[
m = \frac{Z_{\alpha/2}^{2}\, p(1 - p)}{c^{2}}
\]
\[
n = \frac{m}{1 + (m - 1)/S}
\]
$1 - \alpha$ = confidence level (95%)
$Z_{\alpha/2}$ = standard normal distribution
$p$ = acceptable level of significant findings
$c$ = maximum deviation of significant finding level
$n$ = sampling size
$S$ = task population (nr of aircraft × nr of executions)

\[
m = \frac{Z_{\alpha/2}^{2}\, p(1 - p)}{c^{2}}
\]
\[
n = \frac{m}{1 + (m - 1)/S}
\]
$1 - \alpha$ = confidence level (95%)
$Z_{\alpha/2}$ = standard normal distribution
$p$ = acceptable level of significant findings (10%)
$c$ = maximum deviation of significant finding level (7%)
$n$ = sampling size
$S$ = task population (nr of aircraft × nr of executions)
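
A worked sketch of the sample-size calculation with the values quoted on the slide (95% confidence, p = 10%, c = 7%). The task population S = 100 (for example 20 aircraft × 5 executions) is a hypothetical figure chosen only to show the finite-population correction; the function name is illustrative.

```python
import math
from statistics import NormalDist


def sample_size(p, c, population, confidence=0.95):
    """Return (m, n): unlimited-population size m and corrected size n."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}, about 1.96
    m = z ** 2 * p * (1 - p) / c ** 2
    n = m / (1 + (m - 1) / population)
    return m, n


m, n = sample_size(p=0.10, c=0.07, population=100)  # population is hypothetical
tasks_to_sample = math.ceil(n)                      # round up to whole task executions
print(round(m, 2), round(n, 2), tasks_to_sample)
```

With these inputs m is about 70.6, and the correction for a population of 100 brings the required sample down to 42 task executions.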

Aircraft selection (light maintenance):
The following criteria can be used:
• Initial review of 50% of fleet sample (minimum of 10% of the
fleet or 6 representative aircraft (whichever is higher))
• Remaining % of the tasks should be randomly selected from
the remaining fleet
• The consolidation phase of the target interval should be
performed with the remaining fleet (where the new interval
applies)

Heavy maintenance with task intervals ≥ 4 years
The following criteria can be used:
• Minimum of 10% of the fleet or 6 representative aircraft
(whichever is higher)
The evaluation of the data:
• Initial review of 50% of fleet sample
• The consolidation phase of the target interval should be
performed with the remaining fleet (where the new interval
applies)
Note: a statistical formula might not be applicable due to the limited
data collected. Therefore a consolidation phase can be used to
assess the schedule adjustment.

Data Review
Analysis Schedule - Evolution/Optimization timeline
• MRB task interval adjustments should be considered after sufficient
service experience is accumulated since entry into service. Subsequent task
interval adjustments should be considered after additional service
experience has been accumulated since the last interval adjustment. In
both cases, data sufficiency is measured by the level of confidence as
stipulated in these guidelines.
Statistical Analysis
• OEM/TCH shall develop and implement a statistical analysis system to
provide justification that a 95% level of confidence has been achieved for
the Evolution /Optimization exercise on a task by task basis. Exceptions
can be presented and may be approved at the discretion of the approving
Airworthiness Authorities.

Data Review
Engineering analysis
• Engineering analysis will verify that findings are relevant to the scheduled
task under evaluation. Non-routine write-ups will be evaluated to
determine the significance or severity of findings. Pilot reports and
component reliability reports will also be examined to account for line
maintenance activities that may be relevant to the task under evaluation.
The severity of the findings shall be considered and evaluated.
Modification Status, AD, SB, SL, etc.
• All information related to the task (service bulletins, Airworthiness
Directives, service letters, and other in-service reports/resolutions, as
applicable) should be reviewed. Fleet configuration, should also be
assessed.

Data Review
Servicing Tasks
• Scheduled servicing (e.g. lubrication /oil replenishment) task data will not
normally result in reported related findings. For these tasks, Engineering
assessment and analysis is the primary method to be used to support an
evolution / optimization. The engineering assessment must take into
account the negative long-term effects (e.g. corrosion) resulting from
inappropriate servicing intervals.
Restoration/Discard Tasks
• For many restoration/discard tasks, fault findings will not typically be
recorded in the performance of the task. In these cases, an engineering
assessment of shop/teardown data should be performed. This
engineering analysis should assess the rate of wear, corrosion, and
degradation of lubricants or other included components.

Approval of the Reliability
Programme

Assess the operator/applicant for the following programme requirements:
• Organizational structure
• Data Collection system
• Methods of data analysis and application to maintenance control
• Procedures for establishing and revising performance standards
• Definition of significant terms
• Programme displays and status of corrective action programmes
• Procedures for programme revision
• Procedures for maintenance control changes
Approval of the Reliability Programme

Ensure that the reliability programme includes an organizational chart that
shows the following:
The relationships among organizational elements responsible for administering
the programme
Organizational Structure
1. Determine if the reliability programme document addresses the following:
(i) The method of exchanging information among organizational elements.
This may be displayed in a diagram.
(ii) Activities and responsibilities of each organizational element and/or reliability
control committee for enforcing policy and ensuring corrective action
2. Ensure that authority is delegated to each organizational element to enforce
policy.

Ensure that the reliability document fully describes the data collection system
for the aircraft, component, and/or systems to be controlled.
The following must be addressed:
– Flow of information
– Identification of sources of information
– Steps of data development from source to analysis
– Organizational responsibilities for each step of data development
– Data quality (grouping, correct failure report)
– Data integrity
Data Collection System

Ensure that the data analysis system includes the following:
1. One or more of the types of action appropriate to the trend or level of reliability
experienced, including:
– Actuarial or engineering studies employed to determine a need for
maintenance programme changes
– Maintenance programme changes involving inspection frequency and content,
functional checks, overhaul procedures, and time limits
– Aircraft, aircraft system, or component modification or repair
– Changes in operating procedures and techniques
Methods of Data Analysis and Application to
Maintenance Controls (1/2)

2. The effects on maintenance controls such as overhaul time, inspection
and check periods, and overhaul and/or inspection procedures
3. Procedures for evaluating critical failures as they occur
4. Documentation used to support and initiate changes to the maintenance
programme, including modifications, special inspections, or fleet
campaigns.
5. A corrective action programme that shows the results of corrective
actions in a reasonable period of time. Depending on the effect on
safety, a "reasonable" period of time can vary from immediate to an
overhaul cycle period.
6. A description of statistical techniques used to determine operating
reliability levels
Methods of Data Analysis and Application to Maintenance
Controls (2/2)

1. Ensure that each programme includes one of the following for
each aircraft system and/or component controlled by the
programme:
• Initial performance standards defining the area of acceptable reliability
• Methods, data, and a schedule to establish the performance standard
2. The standard should not be so high that abnormal variations
would not cause an alert or so low that it is constantly exceeded
in spite of the best known corrective action measures.
3. Ensure that the procedures specify the organizational elements
responsible for monitoring and revising the performance
standard, as well as when and how to revise the standard.
Procedures for Establishing and Revising Performance
Standards

• Ensure that the programme describes reports, charts, and graphs used to
document operating experience. Responsibilities for these reports must be
established and the reporting elements must be clearly identified and described.
• Ensure that the programme displays containing the essential information for each
aircraft, aircraft system, and component controlled by the programme are
addressed. Each system and component must be identified by the appropriate
ATA Specification 100 system code number.
Programme Displays and Status of Corrective Action
Programmes and Reporting
• Ensure that the programme includes displays showing:
– Performance trends
– The current month's performance
– A minimum of 12 months' experience
– Reliability performance standards (alert values)
• The programme must include the status of corrective action programmes. This includes all
corrective action programmes implemented since the last reporting period.

• Review the change system procedures. Ensure that there are
special procedures for escalating systems or components whose
current performance exceeds control limits
• Ensure that the programme does not allow for the maintenance
interval adjustment of any Certification Maintenance Requirement
(CMR) items. CMRs are part of the certification basis. No CMR item
may be escalated through the operator maintenance/reliability
programme. Approval and escalation of CMRs are the responsibility
of EASA.
• Ensure that the programme includes provisions for notifying the
Authority when changes are made.
Interval Adjustments and Process and/or Task Changes System

Ensure that the reliability programme document addresses the following:
1. Procedures for maintenance control changes to the reliability
programme
2. The organizational elements responsible for preparing substantiation
reports to justify maintenance control changes.
3. Processes used to specify maintenance control changes (e.g., sampling,
functional checks, bench checks, and unscheduled removal)
4. Procedures covering all maintenance programme activities controlled by
the programme
5. Procedures for amending operations specifications, as required
6. Procedures to ensure maintenance interval adjustments are not
interfering with ongoing corrective actions
7. Critical failures and procedures for taking corrective action
8. Procedures for notifying the CAA when increased time limit
adjustments or other programme adjustments occur
Procedures for Maintenance Control Changes

Case life data analysis
Crow-AMSAA-Duane modeling
An operator has 20 pumps in the fleet. Calculate
the expected failures. Usage per month:
100, 100, 100, 300, 300, 350, 350, 250, 200, 150,
100, 100. (usage: 2400 h/YR)
2 pump failures (1: 2000h and 2: 3000h)
Expected failures at 5000h?
No replacement at failure
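
One possible way to approach this case with a Crow-AMSAA (power-law) model, N(t) = λ·t^β, is sketched below. It is not a model answer: it assumes the two failures occur at 2,000 h and 3,000 h of cumulative operating time, fits the curve exactly through those two points rather than by regression or maximum likelihood, and ignores the monthly usage profile and the fleet size of 20.

```python
import math


def fit_power_law(t1, n1, t2, n2):
    """Fit N(t) = lam * t**beta through two (time, cumulative failures) points."""
    beta = math.log(n2 / n1) / math.log(t2 / t1)
    lam = n1 / t1 ** beta
    return lam, beta


def expected_failures(t, lam, beta):
    """Cumulative expected failures at time t under the fitted power law."""
    return lam * t ** beta


lam, beta = fit_power_law(2000, 1, 3000, 2)   # 1st failure at 2000 h, 2nd at 3000 h
n_5000 = expected_failures(5000, lam, beta)
print(round(beta, 2), round(n_5000, 2))
```

Under these assumptions β is about 1.7 (a rising failure rate) and the model projects close to five cumulative failures at 5,000 h. With only two data points the fit carries very little statistical weight.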

Case
An operator has 20 pumps in the fleet. Calculate the
expected failures. Usage per month:
100, 100, 100, 300, 300, 350, 350, 250, 200, 150, 100,
100. (2400 h/YR)
With replacement at failure.

Recommended reading
Practical reliability engineering, O’Connor.
The new Weibull Handbook, R.B Abernethy.
Reliability, Maintenance and Logistic Support, U Dinesh Kumar.
