Fault Management and Proactive Maintenance
Ardhita Banu Adji
Irnanta wahyu andari
Wibisono Juhdi
Overview
• Nowadays it is taken as a matter of course that you can use your mobile phone at any time, almost everywhere. This makes fault management one of the most important aspects of network management. Most problems are found during testing, but not all. The reason is the limited amount of time spent on testing, and the fact that these systems will be used in several different network environments, with equipment from different vendors connected to the nodes.
Chapter 1 : Introduction
• Greater demand for mobile communications networks, and people's increasing dependency on them, are the main drivers for creating better error detection mechanisms for mobile networks. Modern mobile communications networks may produce hundreds of alarms in one day. Fault situations can arise, for example, from hardware and software failures or from operational errors.
• One way to cope with this volume of alarms is to divide them into classes so that, for example, A1 is the most critical alarm class, A2 is the second most critical, and so on.
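The class scheme can be sketched in a few lines of Python. This is only an illustration: the `Alarm` record, its field names, and the helper functions are assumptions made for the example, and the only thing taken from the slide is that class labels look like A1, A2, ... with A1 being the most critical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    """Hypothetical alarm record; the field names are illustrative only."""
    alarm_class: str  # "A1", "A2", ... where A1 is the most critical
    node: str
    text: str


def class_rank(alarm_class: str) -> int:
    """Turn a label such as "A2" into its number (lower = more critical)."""
    return int(alarm_class[1:])


def prioritize(alarms):
    """Return the alarms ordered from most to least critical class.

    sorted() is stable, so alarms of the same class keep their arrival order.
    """
    return sorted(alarms, key=lambda a: class_rank(a.alarm_class))


if __name__ == "__main__":
    incoming = [
        Alarm("A2", "bsc1", "link degraded"),
        Alarm("A1", "msc1", "unit failure"),
        Alarm("A3", "bts7", "cabinet door open"),
    ]
    for alarm in prioritize(incoming):
        print(alarm.alarm_class, alarm.node, alarm.text)
```

With such an ordering, an operator facing hundreds of alarms can work through the A1 alarms first.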
Chapter 1.1 : Concept Definitions
Fault => A fault is defined as a cause of malfunctioning. Faults are responsible for making it difficult or impossible for a system to function normally, and they manifest themselves through errors, that is, deviations from the normal operation of the system.
Alarm => Alarms are notifications concerning detected faults or abnormal conditions, which may or may not represent an error. An alarm report is a kind of event report used to transport alarm information. Alarms are defined by the vendors and are observable by the network operator.
Chapter 2 : Fault management
• One of the most important areas of telecommunications network management is the management of faults that occur during the normal operation of these networks. By definition, fault management is about the detection, isolation and correction of faults. The main requirements for performing fault management are:
1. Existence of information on the network's real-time functioning
2. Existence of information on the abnormalities that occur during operation
Chapter 2.1 : Fault diagnosis
• Fault diagnosis is a stage in the fault management process which consists of finding out the original cause of the received alarms. This may involve formulating a set of fault hypotheses before the root cause of a problem is found.
Chapter 2.2 : Alarm correlation
Alarm correlation is a conceptual interpretation of multiple alarms, leading to the attribution of new meaning to the original alarms. The correlation operations are:
1. Compression -> reducing multiple occurrences of an alarm to a single alarm
2. Counting -> generating a new alarm when the number of occurrences of a given event exceeds a threshold
3. Suppression -> temporary inhibition of the alarms related to a given event
4. Scaling -> an alarm is canceled and another is created in its place, with a higher severity
5. Filtering -> suppressing a given alarm
6. Generalization -> replacing an alarm with the alarm corresponding to its super-class
7. Specialization -> the reverse of generalization
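Four of these operations can be sketched in Python as follows. The `Alarm` record, the numeric severity scale (1 = most critical), and the counting threshold are all assumptions made for the sake of the example; the slides do not prescribe any particular data model or algorithm.

```python
from collections import Counter
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Alarm:
    """Hypothetical alarm record; severity 1 is assumed to be the most critical."""
    node: str
    kind: str
    severity: int


def compress(alarms):
    """Compression: keep only the first occurrence of each identical alarm."""
    seen = set()
    result = []
    for alarm in alarms:
        if alarm not in seen:
            seen.add(alarm)
            result.append(alarm)
    return result


def count(alarms, kind, threshold):
    """Counting: emit a new alarm for every node on which `kind`
    occurred at least `threshold` times."""
    per_node = Counter(a.node for a in alarms if a.kind == kind)
    return [
        Alarm(node, f"{kind}_repeated", 1)
        for node, occurrences in per_node.items()
        if occurrences >= threshold
    ]


def filter_out(alarms, kind):
    """Filtering: suppress every alarm of the given kind."""
    return [a for a in alarms if a.kind != kind]


def escalate(alarm):
    """Scaling: replace the alarm with a copy that is one severity step
    more critical (never going beyond severity 1)."""
    return replace(alarm, severity=max(1, alarm.severity - 1))


if __name__ == "__main__":
    stream = [Alarm("bts1", "link_down", 2)] * 3 + [Alarm("bts2", "fan", 3)]
    print(compress(stream))
    print(count(stream, "link_down", threshold=3))
    print(filter_out(stream, "fan"))
    print(escalate(stream[-1]))
```

Suppression, generalization and specialization are left out because they need extra context (a time window or an alarm class hierarchy) that the slides do not describe.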
Chapter 3 : Proactive maintenance
Purpose: to reduce the errors and faults in the network, and to prepare for situations where there is no fast or easy way to fix a fault.
Chapter 3.1 : Health check
• Health check: a predefined set of commands executed in the system to find out whether the system is functioning correctly.
• Health check script: consists of the commands to be executed during the health check.
• Alarms: if there are active alarms, the health check script writes an entry to the result file.
• Configuration errors: these are the most common problems that systems have.
• Software level status: checks whether a newer software level is available.
• Software comparison: the software on the nodes is compared to verify that it is exactly the same.
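A health check script along these lines could be organized as a list of named checks whose findings are written to a result file. The sketch below is hypothetical: the check functions, their inputs, and the result line format are assumptions, and a real script would obtain its data by running vendor-specific commands on the nodes.

```python
import io


def check_active_alarms(active_alarms):
    """Alarms check: every currently active alarm is reported as a finding."""
    return list(active_alarms)


def check_software_comparison(levels):
    """Software comparison: report nodes whose software level differs from
    that of the reference node (the first node name in sorted order)."""
    nodes = sorted(levels)
    if not nodes:
        return []
    ref = nodes[0]
    return [
        f"{node} runs {levels[node]}, but {ref} runs {levels[ref]}"
        for node in nodes[1:]
        if levels[node] != levels[ref]
    ]


def run_health_check(checks, result_file):
    """Run each (name, check) pair and write one line per finding.

    Returns the number of findings; zero means no check reported a problem.
    """
    findings = 0
    for name, check in checks:
        for problem in check():
            result_file.write(f"{name}: {problem}\n")
            findings += 1
    return findings


if __name__ == "__main__":
    results = io.StringIO()  # stands in for the result file
    checks = [
        ("alarms", lambda: check_active_alarms(["A1 unit failure on msc1"])),
        ("software comparison",
         lambda: check_software_comparison({"node_a": "R5.1", "node_b": "R5.0"})),
    ]
    total = run_health_check(checks, results)
    print(f"{total} finding(s)")
    print(results.getvalue(), end="")
```

Keeping each check as a separate function makes it easy to add further checks, such as a configuration check, without changing the runner.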
Chapter 3.2 : Backup plans
• When a major fault requires immediate recovery action, backups are essential. The more important the device, the more often backups should be taken. After a backup is taken, the integrity of the data should be verified to be sure that the backup is actually valid.
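One common way to verify integrity is to record a checksum when the backup is taken and compare it again before the backup is relied upon. The sketch below uses SHA-256 from Python's standard library; the function names and the choice of hash are assumptions for illustration, not something the slides specify.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large backup files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_valid(backup_path, recorded_digest):
    """Check that the backup file still matches the digest recorded
    when the backup was taken."""
    return sha256_of(backup_path) == recorded_digest
```

A matching checksum shows that the file has not changed since the digest was recorded. It does not prove that the system can be restored from it, so a periodic test restore is still worthwhile.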
Conclusion
As the demand for mobile communication grows, problems occur more often and can sometimes disrupt the use of mobile communication. Fault management describes how to handle faults, errors and alarm detection, as well as software and hardware that misbehave on a device.
Questions and Answers
1. Explain why fault management is one of the most important aspects of network management.
• Answer: Because it is taken as a matter of course that you can use your mobile phone at any time, almost everywhere. Modern mobile communications networks may produce hundreds of alarms in one day, and with that many alarms it is very important to manage all of the faults, in real time or otherwise. Most problems are found during testing, but not all. The reason is the limited amount of time spent on testing, and the fact that these systems will be used in several different network environments, with equipment from different vendors connected to the nodes.
2. What causes an alarm?
• Answer: An alarm is raised when a fault or abnormal condition is detected in a device. In the ideal situation, every fault (hardware, software or configuration) and every abnormal situation in the network would cause an alarm, and the alarm text would indicate unambiguously where the problem is.
3. What is fault management?
Answer: Fault management is about the detection, isolation and correction of faults. It relies on analyzing information about the network's real-time functioning and noticing abnormalities that occur during the operation of a device.
4. Explain fault diagnosis and alarm correlation.
Answer:
• Fault Diagnosis -> the stage in the fault management process which consists of finding out the original cause of the received alarms. Before getting to the original cause, it may be necessary to formulate a set of fault hypotheses, which are then used to reproduce and validate the problem where possible. Finding the root cause of a problem is essential for effective fault management.
• Alarm Correlation -> a conceptual interpretation of multiple alarms, leading to the attribution of new meaning to the original alarms, by:
1. Compression -> reducing multiple occurrences of an alarm to a single alarm
2. Counting -> generating a new alarm when the number of occurrences of a given event exceeds a threshold
3. Suppression -> temporary inhibition of the alarms related to a given event
4. Scaling -> an alarm is canceled and another is created in its place, with a higher severity
5. Filtering -> suppressing a given alarm
6. Generalization -> replacing an alarm with the alarm corresponding to its super-class
7. Specialization -> the reverse of generalization
5. What is the relation between fault management and proactive maintenance?
Answer:
In our opinion, fault management comes before proactive maintenance. Fault management handles the detection, isolation and correction of faults. Proactive maintenance then helps to reduce errors and to prepare for situations where there is no fast or easy way to fix a fault. The two tasks are therefore related: without fault management we would not know what the problems are, so we could neither reduce them nor prepare for the worst case.
6. What do we do in proactive maintenance? Explain each activity.
Answer:
Health check: elements that help reduce faults
• Health check: a predefined set of commands executed in the system to find out whether the system is functioning correctly.
• Health check script: consists of the commands to be executed during the health check.
• Alarms: if there are active alarms, the health check script writes an entry to the result file.
• Configuration errors: these are the most common problems that systems have.
• Software level status: checks whether a newer software level is available.
• Software comparison: the software on the nodes is compared to verify that it is exactly the same.
Backup plan
When a major fault requires immediate recovery action, backups are essential. The more important the device, the more often backups should be taken. After a backup is taken, the integrity of the data should be verified to be sure that the backup is actually valid.
Example backup plan:
• Sometimes there is a major software fault, or a fault that is very hard to localize or fix. In these situations the only way to recover quickly is to restore the system from existing backups. Backups should be taken often and stored in a place that is remotely accessible.

