Adama Science and Technology University
School of Electrical Engineering and Computing
Department of Computer Science and Engineering
Network Architecture Assignment 2
Title: System Software Support for Router Fault Tolerance
Prepared by: Ashenafi Workie
ID No: PGR/18068/11
Submitted to: Dr. N. Satheesh Kumar, Network Science SIG Leader
Submission date: January 27, 2019
Contents
Abstract
1. Introduction
Basic router functionality and architecture
2 Literature Review
3 Research Question
4 Objective
5 Methodology
Classification of network typical faults
Generalized Algorithm of Fault Tolerance (GAFT)
6 Conclusion and Future Works
Reference
System Software Support for Router Fault Tolerance
Ashenafi W. Dessalgn
Adama Science and Technology University
Department of Computer Science and Engineering, Email: ashenafiworkie@gmail.com
Abstract
Communication networks have shifted from dedicated physical network devices toward high-level functional software components that implement the functions of those devices. A fault-tolerant network is able to detect any irregular situation that might result in temporary malfunctions or permanent faults. Depending on the topologies and protocols used, when an actual fault occurs, multiple alarms may be generated by multiple network elements. Modern network management helps to determine the root cause of a fault, regardless of its type, and to perform fault detection. Routers are the most important traffic-handling devices in a communications network: they are the core network equipment that forwards data packets between computer networks.
1. Introduction
Communication networks have shifted from dedicated physical network devices toward high-level functional software components that provide the same network functions. Such a software package can run as a utility on any hardware platform and is expected to support communication through these networks in an efficient and transparent manner. Network efficiency depends on how well these components tolerate faults, which may be temporary malfunctions or permanent faults. A system that provides full functionality for its application and can recover transparently from predefined faults is called a fault-tolerant system (Schagaev and Zalewski, 2001).
A fault-tolerant network is able to detect any unexpected situation that might result in anything from a temporary malfunction (lasting seconds) to a permanent fault (lasting several days). There are two major approaches to increasing reliability with respect to faults in a system: 1) fault prevention and 2) fault tolerance. Eliminating all faults from a system is in most cases impossible, or it may cause other problems such as delays or difficulty in maintenance. Reliability therefore rests on fault-tolerant algorithms that manage faults according to the following scenario [5]: depending on the topologies and protocols used, when an actual fault occurs, multiple alarms may be generated by multiple network elements. This is the fault detection phase. Modern network management helps to determine the root cause of a fault, regardless of its type, and performs fault detection, fault isolation and fault localization simultaneously.
Routers are the most important traffic-handling devices in a communications network [2]: they are the core network equipment that forwards data packets between computer networks. Besides the fundamental packet routing capability, modern routers incorporate a variety of extended functionalities such as traffic management, packet filtering, and virtual private networks (VPNs). For this reason, router systems are becoming extremely complex, creating high barriers to network innovation. Muhammad Azam et al. argue that the sustainability of router hardware and software will determine the efficiency level of a particular network. Depending on the character and severity of the fault, the communication system may be down for anything from a few microseconds to several days. The occurrence of a hardware fault may cause errors in data transmission. The normal way to recover from an erroneous reception of data is to request retransmission of the affected packet(s). However, the fault might have happened inside a router during path computation, path determination and/or packet encapsulation. In this case, a packet could be corrupted whilst being read from the input buffer, written into the processor cache, processed, or written into the output buffers. Thus the overall reliable operation of a router becomes critically important for network efficiency and performance.
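To make the detect-then-retransmit idea concrete, the following minimal sketch computes a 16-bit one's-complement checksum in the style of the Internet checksum (RFC 1071) and uses it to decide whether a packet should be requested again. The function names `internet_checksum` and `needs_retransmission`, and the sample payload, are assumptions chosen for illustration; the paper does not prescribe a particular error-detecting code.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def needs_retransmission(payload: bytes, received_checksum: int) -> bool:
    """Return True when the payload no longer matches its checksum."""
    return internet_checksum(payload) != received_checksum


# Hypothetical packet and a copy with a single flipped bit.
packet = b"route update 10.0.0.0/8"
checksum = internet_checksum(packet)
corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]

print(needs_retransmission(packet, checksum))     # intact copy
print(needs_retransmission(corrupted, checksum))  # corrupted copy
```

A checksum of this kind only detects corruption on the link or in a buffer; it cannot tell whether the fault was a transient malfunction or a permanent one, which is why the later sections rely on repeated reads and recovery points.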
Basic router functionality and architecture
Generally, a router performs two major tasks: 1) the control path (routing) and 2) the data path (switching). On the control path, routers maintain and manipulate routing tables; they listen for updates and change the routing tables to reflect the new network topology. The network topology in the core of the Internet and in organizational networks is largely dynamic and changes very often. On the data path, routers divide packets and perform control actions on them; they perform Layer 3 switching and sometimes maintain statistical data on the data flow. Typically, packets are received on an inbound network interface, handled by the processing module (CPU), and possibly placed in the buffering module.
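The two tasks described above can be illustrated with a small routing table that accepts updates (control path) and resolves a destination by longest-prefix match (data path). This is a minimal sketch using Python's standard `ipaddress` module; the `RoutingTable` class, its method names, and the interface labels are hypothetical and are not taken from any router implementation.

```python
import ipaddress


class RoutingTable:
    """Toy routing table: prefix -> next hop, resolved by longest prefix."""

    def __init__(self):
        self._routes = {}  # maps ip_network objects to a next-hop label

    def update(self, prefix: str, next_hop: str) -> None:
        """Control path: install or replace a route after a topology update."""
        self._routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix: str) -> None:
        """Control path: remove a route that is no longer reachable."""
        self._routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, destination: str):
        """Data path: return the next hop of the most specific matching route."""
        addr = ipaddress.ip_address(destination)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None  # no route to destination
        best = max(matches, key=lambda net: net.prefixlen)
        return self._routes[best]


table = RoutingTable()
table.update("10.0.0.0/8", "if0")
table.update("10.1.0.0/16", "if1")
print(table.lookup("10.1.2.3"))   # the /16 route is more specific than the /8
table.withdraw("10.1.0.0/16")
print(table.lookup("10.1.2.3"))   # falls back to the /8 route
```

A linear scan is used here only for clarity; production routers typically use tries or hardware lookup tables for the same operation.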
Figure 1 Conventional router architecture [16]
2 Literature Review
In recent years, several approaches have been investigated to provide good fault-tolerance support in routers.
[1] Improvements to router reliability are presented using the generalized algorithm of fault tolerance (GAFT), which draws on time, structural and informational redundancy. A limitation of that research is that the separate toleration of malfunctions and of permanent faults is not well discussed in terms of their impact on system reliability.
[4] Hossam M. A. Fahmy et al. propose a routing algorithm to handle complex faults in multicomputer networks with dimension-order routers. Simple changes to the router structure and routing logic are proposed, but performance in terms of bisection utilization and message latency remains a challenge.
[7] The authors improve single-link failure tolerance through reconfiguration, defining a new deterministic routing algorithm for all routers on a cycle-free path around the faulty link.
The following table gives a detailed comparison of the literature.
Table 1 Detailed comparative analysis of the literature review

| Author(s) (year) | Techniques / parameters | Advantages | Disadvantages (research gap) |
|---|---|---|---|
| [1] Azam, N. Ioannides, M. H. Rümmeli, and I. Schagaev (2009) | Reducing router faults; network efficiency and performance | Improvements to router reliability; router functionality and options to tolerate faults | Difficult to cover every hardware role in software; security of a software router is difficult |
| [2] K. Xu, W. Chen, C. Lin, M. Xu, D. Ma, and Y. Qu (2014) | A reconfigurable routing software platform supporting functional modules and a component development environment | A practical approach is introduced to build an open, flexible, modularized and reconfigurable router | System complexity; most commercial router vendors follow a closed development pattern |
| [3] A. Runge and Armin (2015) | Energy consumption; faster and smaller router design | NoCs can be used to tolerate failures; significant energy consumption; allows a faster and smaller router design | Bufferless routers can drop packets on collision; a buffered router is needed every time |
| [4] J. Albrecht (2013) | Applied to current implementations in which a router is partitioned into multiple modules | Handles complex faults in multicomputer networks; high adaptability to faults | Bisection utilization and message latency are challenging |
| [5] H. S. Castro and O. A. De Lima (2013) | Maintaining communication between the network's non-faulty routers | NoCs have fault tolerance mechanisms; control mechanism for backup paths | Backup and control are challenging in some respects |
| [6] C. Feng, Z. Lu, A. Jantsch, M. Zhang, and Z. Xing | Integrated circuits lead to increased susceptibility to transient and permanent faults | A fault-tolerant solution for a bufferless network-on-chip; mechanism to detect both transient and permanent faults | An input register is needed for each input port; there are no other buffers in the bufferless router |
| [7] S. Y. Jiang, Y. Liu, J. B. Luo, H. Cheng, and G. Luo | Improved single-link failure tolerance; improvement approach for link failures and router fault tolerance | Does not require virtual channels; power consumption is reduced | Focuses on a single link or hop |
| [8] S. Y. Jiang, Y. Liu, J. B. Luo, H. Cheng, and G. Luo | Tolerating multiple faults and keeping the network reliable without losing network performance | Tolerates multiple faults and keeps the network efficient without losing performance | Loss of a number of packets |
3 Research Question
This paper investigates the following questions:
1) What types of faults occur in a communication network?
2) What are the major approaches to increasing reliability with respect to faults in a system?
3) What are the basic router architecture and functionality, and what are its mechanisms for fault detection and recovery?
4) How is the generalized algorithm of fault tolerance (GAFT) used for fault detection and recovery, how does the fault-tolerant (FT) routing table work, and what does the flow chart look like?
4 Objective
The objective of the study is to improve the performance of the router by applying the generalized algorithm of fault tolerance (GAFT), which is based on time, structural and informational redundancy, together with a scheme for improving router reliability using system software recovery points.
5 Methodology
Typical network faults are classified as follows:
• Line outage: a failure of the circuit;
• White noise: caused by thermal energy;
• Impulse noise: burst errors caused, for example, by lightning and poor connections;
• Cross-talk: a circuit picks up a signal from an adjacent circuit;
• Attenuation: loss of signal strength over distance;
• Jitter: caused by variations in the frequency and amplitude of the signal;
• Harmonic distortion: incorrect amplification of the input signal.
Such faults need to be tolerated using a general algorithm for fault detection and recovery.
Generalized Algorithm of Fault Tolerance (GAFT)
GAFT consists of fault detection, fault type identification, faulty component location, hardware reconfiguration to achieve a repairable state, and re-establishment of a correct state. It relies on the following types of redundancy:
• HW(I): hardware redundancy that keeps extra information for GAFT purposes, such as a redundant line or a 1-bit register used to check data for errors;
• HW(T): hardware time redundancy, such as a hardware delay (latch) to avoid malfunctions caused by racing signals;
• SW(S): software structural redundancy, such as periodically performed hardware testing procedures;
• SW(I): informational redundancy of the program, deliberately applied to recover the system.
Recovery points can be analysed mathematically.
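The informational redundancy listed under HW(I), a single extra bit kept alongside the data, can be illustrated with an even-parity check. The sketch below is only an analogy in software: the function names `parity_bit`, `store` and `check` are hypothetical, and a real router would hold the bit in a hardware register.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 when the word has an odd number of set bits."""
    return bin(word).count("1") % 2


def store(word: int) -> tuple:
    """Keep the data word together with its 1-bit redundant information."""
    return word, parity_bit(word)


def check(word: int, stored_parity: int) -> bool:
    """Return True when the word is still consistent with its parity bit."""
    return parity_bit(word) == stored_parity


word, parity = store(0b1011)
print(check(word, parity))            # unchanged word passes the check
print(check(word ^ 0b0100, parity))   # a single flipped bit is detected
print(check(word ^ 0b0110, parity))   # two flipped bits go unnoticed
```

The last line shows the limit of a 1-bit scheme: it detects any odd number of bit errors but misses an even number, so it is a detection aid and not a recovery mechanism on its own.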
Figure 2 GAFT router architecture implementer algorithm
The distributed processing architecture of the router (central route processing and local processing subsystems) enables mutual checking and recovery procedures to be performed, and it excludes the core of the router in terms of reliability.
Distributed processing architecture [1] Azam, N. Ioannides, M. H. Rümmeli, and I. Schagaev (2009)
[Flow chart. Node labels: Fault; Is permanent fault?; Give location of faulty component; Reconfigure hardware component; Reject faulty components; Does it affect software component?; Locate faulty program; Define right recovery point (RP); Recover the system from RP; Continue the operation.]
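The decision flow named in the chart (a permanent-fault check, hardware reconfiguration, and software recovery from a recovery point) might be expressed roughly as follows. This is a minimal sketch: the `Fault` class, the `gaft_steps` function, and the exact ordering of the two branches are assumptions made for illustration, since the original figure is not fully recoverable.

```python
from dataclasses import dataclass


@dataclass
class Fault:
    permanent: bool         # False means a transient malfunction
    affects_software: bool  # True when program state was damaged


def gaft_steps(fault: Fault) -> list:
    """Return the handling steps for a fault, in the order they are applied."""
    steps = ["detect fault"]
    if fault.permanent:
        # Hardware branch: find the broken part and route around it.
        steps += [
            "locate faulty component",
            "reconfigure hardware",
            "reject faulty component",
        ]
    if fault.affects_software:
        # Software branch: roll back to a consistent recovery point (RP).
        steps += [
            "locate faulty program",
            "define recovery point",
            "recover from recovery point",
        ]
    steps.append("continue operation")
    return steps


for step in gaft_steps(Fault(permanent=True, affects_software=True)):
    print(step)
```

Under these assumptions a transient malfunction that leaves software untouched passes straight from detection to continued operation, which matches the intent that recovery work is done only when it is needed.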
Again, if any inconsistency is detected, the packet is re-read, with n or fewer iterations (fewer if a re-read succeeds). Finally, for the router outbound segment, a process of checking and repetition is implemented together with automatic formation of recovery points (mentioned as redundant information generation).
Note that these two processes have a semantic difference: the checking and formation of recovery points is synchronous and is performed constantly along the routing process. In turn, recovery actions and repeated reads from caches are asynchronous and are activated only when a violation of packet integrity is detected.
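The distinction between the two processes can be sketched in code: a recovery point is saved on every packet (the synchronous step), while re-reading is triggered only when the integrity check fails and is bounded by n attempts (the asynchronous step). The `RecoveryPoints` class, the `read_with_retry` function, the use of CRC-32 as the integrity check, and the fallback to the saved copy after n failures are all assumptions made for illustration.

```python
import zlib


class RecoveryPoints:
    """Stores a known-good copy and CRC-32 for each packet (synchronous step)."""

    def __init__(self):
        self._saved = {}  # packet_id -> (payload, crc)

    def save(self, packet_id, payload: bytes) -> None:
        self._saved[packet_id] = (payload, zlib.crc32(payload))

    def is_consistent(self, packet_id, payload: bytes) -> bool:
        return zlib.crc32(payload) == self._saved[packet_id][1]

    def restore(self, packet_id) -> bytes:
        return self._saved[packet_id][0]


def read_with_retry(read_cache, points, packet_id, max_attempts: int = 3) -> bytes:
    """Re-read a packet up to max_attempts times; stop early on success."""
    for _ in range(max_attempts):
        payload = read_cache(packet_id)
        if points.is_consistent(packet_id, payload):
            return payload  # fewer than n iterations when a read succeeds
    # Assumed fallback: all reads failed, so use the recovery point copy.
    return points.restore(packet_id)


points = RecoveryPoints()
points.save(1, b"hello")

# Hypothetical cache that returns a corrupted copy once, then a good one.
reads = iter([b"hellx", b"hello"])
print(read_with_retry(lambda packet_id: next(reads), points, 1))
```

If the first read had been consistent, the loop would have returned immediately, so in the fault-free case the only cost is the checksum comparison.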
6 Conclusion and Future Works
Approaches to router reliability have been described. The generalized algorithm of fault tolerance was proposed to address the problem of improving reliability in the case of faults in router hardware components. Router hardware is the major obstacle to improving reliability; therefore, it is better to use software support to handle such faults. Flexible real-time fault-tolerant systems apply different steps of the algorithm, which leaves options open in the design. The implementation of the algorithm assumes support for, and coordination of, the hardware checking process, and it comprises three sets of recovery points: for the routing, inbound, and outbound hardware of the router respectively. The recovery procedures include searching for the correct recovery point from which to restart operation; the probability of success of this procedure depends on the quality and consistency of the checking procedures. Recovery actions may be carried out in different router hardware segments, thus reducing the performance degradation of the router as a whole even during recovery. The distributed processing architecture of the router (central route processing and local processing subsystems) makes it possible to perform mutual verification and recovery steps, excluding the core of the router in terms of transparency.
Reference
[1] M. Azam, N. Ioannides, M. H. Rümmeli, and I. Schagaev, “System Software Support for Router
Fault Tolerance,” Networks, no. July 2015, pp. 13–18, 2009.
[2] K. Xu, W. Chen, C. Lin, M. Xu, D. Ma, and Y. Qu, “Toward a practical reconfigurable router: A
software component development approach,” IEEE Netw., vol. 28, no. 5, pp. 74–80, 2014.
[3] A. Runge and Armin, “Fault-tolerant Network-on-Chip based on Fault-aware Flits and Deflection
Routing,” Proc. 9th Int. Symp. Networks-on-Chip - NOCS ’15, no. January, pp. 1–8, 2015.
[4] J. Albrecht, “B 0 → Μ Μ,” vol. 0, pp. 361–366, 2013.
[5] H. S. Castro and O. A. De Lima, “A fault tolerant NoC architecture based upon external router
backup paths,” 2013 IEEE 11th Int. New Circuits Syst. Conf. NEWCAS 2013, 2013.
[6] C. Feng, Z. Lu, A. Jantsch, M. Zhang, and Z. Xing, “Addressing transient and permanent faults in
NoC with efficient fault-tolerant deflection router,” IEEE Trans. Very Large Scale Integr.
Syst., vol. 21, no. 6, pp. 1053–1066, 2013.
[7] S. Y. Jiang, Y. Liu, J. B. Luo, H. Cheng, and G. Luo, “Study of fault-tolerant routing algorithm of
NoC based on 2D-Mesh topology,” 2013 IEEE Int. Conf. Appl. Supercond. Electromagn.
Devices, ASEMD 2013, no. 41301460, pp. 189–193, 2013.
[8] R. Xie, J. Cai, X. Xin, and B. Yang, “Low-cost adaptive and fault-Tolerant routing method for 2D
network-on-chip,” IEICE Trans. Inf. Syst., vol. E100D, no. 4, pp. 910–913, 2017.
[9] Y. Chawathe and E. A. Brewer, “System Support for Scalable and Fault Tolerant,” Manager, no.
12421, pp. 1–34, 1999.
[10] W. Fu, T. Song, S. Wang, and X. Wang, “for Energy Efficient Router,” pp. 139–140, 2012.
[11] T. Meyer, D. Raumer, F. Wohlfart, B. E. Wolfinger, and G. Carle, “Low latency packet
processing in software routers,” Proc. 2014 Int. Symp. Perform. Eval. Comput. Telecommun.
Syst. SPECTS 2014 - Part SummerSim 2014 Multiconference, pp. 556–563, 2014.
[12] W. Cerroni, C. Raffaelli, and M. Savi, “Optical router architecture to enable next generation
network services,” Int. Conf. Transparent Opt. Networks, pp. 1–4, 2011.
[13] V. A. N. D. E. R. Wal, .“, a , : ~ I : I : I : : : : : : I I I I ! I : I : : I : ~ I : : ......... Pyroxenite Layers,”
vol. 14, no. 7, pp. 839–846, 1992.
[14] Y. Kai, Y. Wang, and B. Liu, “GreenRouter: Reducing power by innovating Router’s
architecture,” IEEE Comput. Archit. Lett., vol. 12, no. 2, pp. 51–54, 2013.
[15] K. Li, X. J. Lu, and J. P. Li, “Fast forwarding system for centralized router,” 2008 Int. Conf.
Apperceiving Comput. Intell. Anal. ICACIA 2008, no. Mc, pp. 315–318, 2008.
, [2], [11]–[15], [3]–[10]
9
16
16

More Related Content

PDF
18068 system software suppor t for router fault tolerancelatex ieee journal s...
PDF
Reliable Metrics for Wireless Mesh Network
PDF
E42022125
PDF
Priority based bandwidth allocation in wireless sensor networks
PDF
PDF
Ijariie1170
PDF
Secure routing proposals in manets a review
PDF
Security in ad hoc networks
18068 system software suppor t for router fault tolerancelatex ieee journal s...
Reliable Metrics for Wireless Mesh Network
E42022125
Priority based bandwidth allocation in wireless sensor networks
Ijariie1170
Secure routing proposals in manets a review
Security in ad hoc networks

What's hot (20)

PDF
shashank_spdp1993_00395543
PDF
A COMBINATION OF THE INTRUSION DETECTION SYSTEM AND THE OPEN-SOURCE FIREWALL ...
PDF
Achieving Transmission Fairness in Distributed Medium Access Wireless Mesh Ne...
PDF
DSSS with ISAKMP Key Management Protocol to Secure Physical Layer for Mobile ...
PDF
A cross layer optimized reliable multicast routing protocol in wireless mesh ...
PDF
Design and Implementation of TARF: A Trust-Aware Routing Framework for WSNs
PDF
Performance Enhancement of Intrusion Detection System Using Advance Adaptive ...
PDF
AN ALTERNATE APPROACH TO RESOURCE ALLOCATION STRATEGY USING NETWORK METRICSIN...
PDF
Different Approaches for Secure and Efficient Key Management in Mobile Ad-Hoc...
PDF
PDF
Distributed Computing: An Overview
PDF
Secured client cache sustain for maintaining consistency in manets
PDF
Performance improvement of bottleneck link in red vegas over heterogeneous ne...
PDF
Checkpointing and Rollback Recovery Algorithms for Fault Tolerance in MANETs:...
PPTX
High performance energy efficient multicore embedded computing
DOC
Routing security in ad hoc wireless network
PDF
A New Approach to Improve the Efficiency of Distributed Scheduling in IEEE 80...
PDF
B0781013215
PDF
Distance Based Cluster Formation for Enhancing the Network Life Time in Manets
PDF
DESIGNING AN ENERGY EFFICIENT CLUSTERING IN HETEROGENEOUS WIRELESS SENSOR NET...
shashank_spdp1993_00395543
A COMBINATION OF THE INTRUSION DETECTION SYSTEM AND THE OPEN-SOURCE FIREWALL ...
Achieving Transmission Fairness in Distributed Medium Access Wireless Mesh Ne...
DSSS with ISAKMP Key Management Protocol to Secure Physical Layer for Mobile ...
A cross layer optimized reliable multicast routing protocol in wireless mesh ...
Design and Implementation of TARF: A Trust-Aware Routing Framework for WSNs
Performance Enhancement of Intrusion Detection System Using Advance Adaptive ...
AN ALTERNATE APPROACH TO RESOURCE ALLOCATION STRATEGY USING NETWORK METRICSIN...
Different Approaches for Secure and Efficient Key Management in Mobile Ad-Hoc...
Distributed Computing: An Overview
Secured client cache sustain for maintaining consistency in manets
Performance improvement of bottleneck link in red vegas over heterogeneous ne...
Checkpointing and Rollback Recovery Algorithms for Fault Tolerance in MANETs:...
High performance energy efficient multicore embedded computing
Routing security in ad hoc wireless network
A New Approach to Improve the Efficiency of Distributed Scheduling in IEEE 80...
B0781013215
Distance Based Cluster Formation for Enhancing the Network Life Time in Manets
DESIGNING AN ENERGY EFFICIENT CLUSTERING IN HETEROGENEOUS WIRELESS SENSOR NET...
Ad

Similar to 18068 system software suppor t for router fault tolerance(word 2 column) (20)

PDF
Fpga based highly reliable fault tolerant approach for network on chip (noc)
PDF
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
PDF
High Fault Coverage For On Chip Network Using Priority Based Routing Algorithm
PDF
High Fault Coverage For On Chip Network Using Priority Based Routing Algorithm
PDF
DYNAMIC CURATIVE MECHANISM FOR GEOGRAPHIC ROUTING IN WIRELESS MULTIMEDIA SENS...
PDF
A NOVEL ROBUST ROUTER ARCHITECTURE
PDF
Performance Improved Network on Chip Router for Low Power Applications
PDF
SURVEY OF ENERGY EFFICIENT HIGH PERFORMANCE LOW POWER ROUTER FOR NETWORK ON CHIP
PDF
Design of fault tolerant algorithm for network on chip router using field pr...
PDF
Performance Analysis of Mesh-based NoC’s on Routing Algorithms
PDF
Disadvantages And Disadvantages Of Wireless Networked And...
PDF
Ensuring the Adaptive Path for the Routing in 5g Wireless Network
PDF
PDF
PDF
IRJET- Design of Virtual Channel Less Five Port Network
PDF
Network on Chip Architecture and Routing Techniques: A survey
PDF
Lecture number 5 Theory.pdf(machine learning)
PDF
PDF
PDF
Routing in Networks using Genetic Algorithm
Fpga based highly reliable fault tolerant approach for network on chip (noc)
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
High Fault Coverage For On Chip Network Using Priority Based Routing Algorithm
High Fault Coverage For On Chip Network Using Priority Based Routing Algorithm
DYNAMIC CURATIVE MECHANISM FOR GEOGRAPHIC ROUTING IN WIRELESS MULTIMEDIA SENS...
A NOVEL ROBUST ROUTER ARCHITECTURE
Performance Improved Network on Chip Router for Low Power Applications
SURVEY OF ENERGY EFFICIENT HIGH PERFORMANCE LOW POWER ROUTER FOR NETWORK ON CHIP
Design of fault tolerant algorithm for network on chip router using field pr...
Performance Analysis of Mesh-based NoC’s on Routing Algorithms
Disadvantages And Disadvantages Of Wireless Networked And...
Ensuring the Adaptive Path for the Routing in 5g Wireless Network
IRJET- Design of Virtual Channel Less Five Port Network
Network on Chip Architecture and Routing Techniques: A survey
Lecture number 5 Theory.pdf(machine learning)
Routing in Networks using Genetic Algorithm
Ad

Recently uploaded (20)

PPTX
Geodesy 1.pptx...............................................
PPTX
OOP with Java - Java Introduction (Basics)
PDF
Embodied AI: Ushering in the Next Era of Intelligent Systems
PPT
Mechanical Engineering MATERIALS Selection
PPTX
UNIT-1 - COAL BASED THERMAL POWER PLANTS
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PPTX
Safety Seminar civil to be ensured for safe working.
PDF
PPT on Performance Review to get promotions
PPTX
bas. eng. economics group 4 presentation 1.pptx
PPTX
Sustainable Sites - Green Building Construction
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PPT
Project quality management in manufacturing
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PDF
Unit I ESSENTIAL OF DIGITAL MARKETING.pdf
PDF
Automation-in-Manufacturing-Chapter-Introduction.pdf
PPTX
Construction Project Organization Group 2.pptx
Geodesy 1.pptx...............................................
OOP with Java - Java Introduction (Basics)
Embodied AI: Ushering in the Next Era of Intelligent Systems
Mechanical Engineering MATERIALS Selection
UNIT-1 - COAL BASED THERMAL POWER PLANTS
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
Safety Seminar civil to be ensured for safe working.
PPT on Performance Review to get promotions
bas. eng. economics group 4 presentation 1.pptx
Sustainable Sites - Green Building Construction
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
CYBER-CRIMES AND SECURITY A guide to understanding
Project quality management in manufacturing
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Unit I ESSENTIAL OF DIGITAL MARKETING.pdf
Automation-in-Manufacturing-Chapter-Introduction.pdf
Construction Project Organization Group 2.pptx

18068 system software suppor t for router fault tolerance(word 2 column)

  • 1. i Adama Science and Technology University School of Electrical Engineering and Computing Department of Computer Science Engineering Network Architecture Assignment 2 Title System Software Support for router Fault tolerance Prepared by Ashenafi Workie Id No PGR/18068/11 Submitted to Dr. N. Satheesh Kumar, Network Science SIG Leader Submission date January 27, 2019
  • 2. ii Contents Abstract.............................................................................................................................................1 1. Introduction...............................................................................................................................1 Basic router functionality and architecture ...................................................................................2 2 Literature Review........................................................................................................................2 3 Research Question......................................................................................................................4 4 Objective....................................................................................................................................4 5 Methodology..............................................................................................................................4 Classification of network typical faults are: ...................................................................................4 Generalized Algorithm of Fault Tolerance (GAFT)........................................................................4 6 Conclusion and Future Works......................................................................................................6 Reference..........................................................................................................................................8
  • 3. 1 System Software Support for Router Fault Tolerance Ashenafi W. Dessalgn Adama Science and Technology University Department of computer Science and Engineering, Email: {ashenafiworkie@gmail.com} Abstract Communication Network become shifted from dedicated physical network devices to the high level of functional software component of those network devices. Fault tolerant network has an ability to find or detect any irregular situation that might resulted in temporarily malfunctions or permanent faults. Depends on topologies and protocols used an actual faults does occur multiple alarm alerts could be generated through multiple network elements. Modern network management helps to determine the root cause of faults regardless of its type which may occur and perform fault detection. Routers are the most important traffic maintaining device in a communications network and core network equipment that forward data packets between computer networks. 1. Introduction Communication Network has shifted from common physical network devices to the high level of functional software component of those network utility. The functional software package can be run as a software utility on any hardware cross platform. Such package are expected to give communication through these networks in an efficient and transparent manner. The network efficiency relies on how the above components are able to tolerate a faults which could be malfunction or permanent fault. A system which provides full functionality for its application and can recover transparently from predefined faults is called a fault tolerant system (Schagaev and Zalewski, 2001). Fault tolerant network has an ability to find, detect any unexpected situation that might resulted in temporarily malfunctions (for seconds) to permanent faults (several days). There are two major approach of increasing reliability with respect to faults in a system; namely, 1) fault prevention and 2) fault tolerance. 
Avoiding all faults from a system is in most cases impossible or may cause some problems such as the delay or difficulty in maintenance. So it bases on fault tolerant algorithms to manages faults regarding to the following scenario [5] based on topologies and protocols used when an actual faults does occur multiple alarm alerts could be generated through multiple network elements. This is said to be fault detection phases. Modern network management helps to determine the main cause of faults regardless of its type which may occur and perform fault detection, fault isolation processes and localization simultaneously. Routers is the most important traffic maintaining device in a communications network [2] and core network equipment that forward data packets between computer networks in different way. Besides the fundamental packet routing capability, modern routers incorporate a variety of extended functionalities such as traffic management, packet filtering, and virtual
  • 4. 2 private networks (VPNs). For this reason, router systems are also becoming extremely creating exceedingly and complex high barriers to network innovations. Muhammad Azam, et al argue that the sustainability of router hardware and software will determine the efficiency level of a particular network. Depending on the character and strength of the fault, the communication system crash from a few microseconds to several days. The manifestation and occurrence of a hardware fault may cause errors in data transmission in network communication. The normal way to recover an erroneous reception of data is to request retransmission of the particular packet(s). Although the fault might have happened inside a router during the process of path computation, determination and/or packet encapsulation. In this case, a packet could be corrupted whilst being read from the input buffer, written into the processor cache, processed, or written into output buffers. Thus the overall reliable operation of a router becomes critically important for network efficiency and performance Basic router functionality and architecture Generally, the router performs two major tasks these are: 1) control routines path and Data control path (switching). Routers maintain and manipulate routing tables; they listen for updates and maintain changes in the routing tables to reflect the new network topology. The topology network in the core of the Internet and in organizational networks is largely dynamic and changes very often. Routers also divide packets and perform control actions on the packets; it performs Layer 3 switching and sometimes maintains statistical data on the data-flow. Typically, packets are obtained from inbound network interface; they are then handled by the processing module (CPU), possibly put in the buffering module. 
Figure 1 Conventional router architecture [16]

2 Literature Review

In recent years, several approaches have been investigated to give routers good fault-tolerance support. In [1], improvements to router reliability using the generalized algorithm of fault tolerance (GAFT) are presented, based on time, structural, and informational redundancy. A limitation of that work is that the separate toleration of malfunctions and permanent faults is not well discussed in terms of their impact on system reliability. In [4], Hossam M. A. Fahmy et al. propose a routing algorithm to handle complex faults in multicomputer networks with dimension-order routers. Simple changes to the router structure and routing logic are proposed, but performance in terms of bisection utilization and message latency remains a challenge. In [7], the authors improve single-link failure tolerance through reconfiguration, defining a new deterministic routing algorithm for all routers on a cycle-free path around the faulty link.
The following table shows the literature in detail.

Table 1 Detailed comparative analysis of the literature

| Authors (year) | Techniques / parameters | Advantages | Disadvantages (research gap) |
|---|---|---|---|
| [1] M. Azam, N. Ioannides, M. H. Rümmeli, and I. Schagaev (2009) | Reducing router faults; network efficiency and performance | Improvements to router reliability; router functionality and options to tolerate faults | Difficult to cover every hardware role in software; securing a software router is difficult |
| [2] K. Xu, W. Chen, C. Lin, M. Xu, D. Ma, and Y. Qu (2014) | A reconfigurable routing software platform supporting functional modules and a component development environment | A practical approach to building an open, flexible, modularized, and reconfigurable router | System complexity; most commercial router vendors follow a closed development pattern |
| [3] A. Runge (2015) | Energy consumption; faster and smaller router design | NoCs can be used to tolerate failures; significant energy consumption; allows a faster and smaller router design | Bufferless routers can drop packets on collision; a buffered router is needed every time |
| [4] J. Albrecht (2013) | Applied to current implementations in which a router is partitioned into multiple modules | Handles complex faults in multicomputer networks; high adaptability to faults | Bisection utilization and message latency are challenging |
| [5] H. S. Castro and O. A. De Lima (2013) | Maintaining communication between the network's non-faulty routers | NoCs have fault tolerance mechanisms; control mechanism for backup paths | Backup and control are challenging in some respects |
| [6] C. Feng, Z. Lu, A. Jantsch, M. Zhang, and Z. Xing (2013) | Integrated circuit scaling increases susceptibility to transient and permanent faults | A fault-tolerant solution for a bufferless network-on-chip; mechanism to detect both transient and permanent faults | One input register for each input port; there are no other buffers in the bufferless router |
| [7] S. Y. Jiang, Y. Liu, J. B. Luo, H. Cheng, and G. Luo (2013) | Improved single-link failure tolerance; improvement approach for link failures and router fault tolerance | Does not require virtual channels; power consumption is reduced | Focuses on single links or hops |
| [8] R. Xie, J. Cai, X. Xin, and B. Yang (2017) | Tolerating multiple faults and keeping the network reliable without losing network performance | Tolerates multiple faults and keeps the network efficient without losing performance | Loss of a number of packets |
3 Research Question

This paper investigates the following questions:
1) What types of faults occur in a communication network?
2) What are the major approaches to increasing reliability with respect to faults in a system?
3) What are the basic router architecture and functionality, and what are its mechanisms for fault detection and recovery?
4) How is the generalized algorithm of fault tolerance (GAFT) used for fault detection and recovery, and how does the fault-tolerant (FT) routing table work, as shown in a flow chart?

4 Objective

The objective of the study is to improve the performance of the router by applying the generalized algorithm of fault tolerance (GAFT), which is based on time, structural, and informational redundancy, together with a scheme for improving router reliability using system software recovery points.

5 Methodology

Typical network faults are classified as:
• Line outage: failure of a circuit;
• White noise: caused by thermal energy;
• Impulse noise: burst errors caused by, for example, lightning and poor connections;
• Cross-talk: a circuit picks up the signal of an adjacent circuit;
• Attenuation: loss of signal strength over distance;
• Jitter: caused by variation in frequency modulation and in maximum amplitude;
• Harmonic distortion: incorrect amplification of the input signal.

Such faults need to be tolerated using a general algorithm for fault detection and recovery.
Generalized Algorithm of Fault Tolerance (GAFT)

GAFT comprises fault detection, fault type identification, faulty component location, and hardware reconfiguration to achieve a repairable state, followed by re-establishment of a correct state. It draws on the following types of redundancy:
• HW(I) – hardware informational redundancy that keeps extra information for GAFT purposes, such as a redundant line or a 1-bit register used to check data for errors;
• HW(T) – hardware time redundancy, such as a hardware delay (latch) to avoid malfunctions caused by racing of signals;
• SW(S) – software structural redundancy, such as periodically performed hardware testing procedures;
• SW(I) – software informational redundancy of the program, deliberately applied to recover a system.

Recovery points can be analysed mathematically.
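The 1-bit check register mentioned under HW(I) can be illustrated with a parity bit. The sketch below is an illustration only: it assumes even parity computed in software over a byte string, whereas the source describes a hardware register, and the function names `parity_bit` and `is_consistent` are hypothetical.

```python
def parity_bit(data: bytes) -> int:
    """Return the even-parity bit of data: 1 when the count of 1-bits is odd."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2


def is_consistent(data: bytes, stored_parity: int) -> bool:
    """Check data against the parity bit stored when it was written."""
    return parity_bit(data) == stored_parity


word = b"\x0f\x01"                 # five 1-bits in total
stored = parity_bit(word)          # redundant bit kept alongside the data

single_flip = b"\x0f\x00"          # one bit corrupted
double_flip = b"\x0e\x00"          # two bits corrupted

print(is_consistent(word, stored))         # data unchanged
print(is_consistent(single_flip, stored))  # single-bit error is detected
print(is_consistent(double_flip, stored))  # two-bit error goes unnoticed
```

A single redundant bit detects any odd number of flipped bits but misses an even number, which is one reason informational redundancy is combined with the other redundancy types listed above.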
Figure 2 GAFT router architecture implementer algorithm. Source: distributed processing architecture, Azam, Ioannides, Rümmeli, and Schagaev (2009) [1]. [Flowchart nodes, in order: Fault; Is the fault permanent?; Give the location of the faulty component; Reconfigure the hardware component; Reject the faulty components; Does it affect a software component?; Locate the faulty program; Define the right recovery point (RP); Recover the system from the RP; Continue the operation.]

The distributed processing architecture of the router (central route processing and local processing subsystem) enables mutual checking and recovery procedures to be performed and excludes the core of the router in terms of reliability.
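The steps named in Figure 2 can be sketched as a small decision procedure. This is a sketch under stated assumptions, not the authors' implementation: the branch structure is assumed (hardware steps only for permanent faults, software recovery only when a software component is affected), the most recent recovery point is assumed to be the "right" one, and all names (`Fault`, `Router`, `handle_fault`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Fault:
    component: str          # name of the hardware component that failed
    permanent: bool         # False for a transient malfunction
    affects_software: bool  # True if program state may be corrupted


@dataclass
class Router:
    active: set                                            # components in use
    recovery_points: list = field(default_factory=list)    # saved states, oldest first
    state: dict = field(default_factory=dict)              # current program state


def handle_fault(router: Router, fault: Fault) -> list:
    """Apply the assumed GAFT steps and return the names of the steps taken."""
    steps = ["detect fault"]
    if fault.permanent:
        steps.append("locate faulty component")
        # Reject the faulty component and carry on with the remaining hardware.
        router.active.discard(fault.component)
        steps.append("reconfigure hardware")
    if fault.affects_software:
        steps.append("locate faulty program")
        if router.recovery_points:
            # Assumption: the latest recovery point is the correct one.
            router.state = dict(router.recovery_points[-1])
            steps.append("recover from recovery point")
    steps.append("continue operation")
    return steps


router = Router(active={"cpu0", "cpu1"},
                recovery_points=[{"seq": 1}, {"seq": 2}],
                state={"seq": 99})
print(handle_fault(router, Fault("cpu1", permanent=True, affects_software=True)))
print(router.active, router.state)
```

Returning the list of steps makes the path through the flowchart visible for each fault, which is convenient when comparing transient and permanent cases.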
Again, in the case of any detected inconsistencies, a procedure of re-reading the packet is applied, with n or fewer (if successful) iterations. Finally, for the router outbound segment, a process of checking and repetition is implemented together with the automatic formation of recovery points (referred to as redundant information generation). Note that these two processes differ semantically: the checking and formation of recovery points is synchronous and is performed constantly along the routing process, whereas recovery actions and repeated reads from caches are asynchronous and are activated only when a packet integrity violation is detected.

6 Conclusion and Future Works

The approaches to router reliability have been stated. The generalized algorithm of fault tolerance was proposed to address the problem of improving reliability in the presence of faults in router hardware components. Router hardware is the major obstacle to improving reliability; therefore, it is better to use software support to handle such faults. Flexible real-time fault-tolerant systems apply different steps of the algorithm, which leaves options open to the designer. The implementation of the algorithm assumes support for and coordination of the hardware checking process, and is composed of three sets of recovery points, for the routing, inbound, and outbound hardware of the router respectively. The recovery procedures include searching for the correct recovery point from which to restart operation; the probability of success of this procedure depends on the quality and consistency of the checking procedures. Recovery actions can be carried out in different router hardware segments, thus reducing the performance degradation of the router as a whole even during recovery. The distributed processing architecture of the router (central route processing and local processing subsystem) enables mutual verification and recovery steps to be performed, excluding the core of the router in terms of transparency.
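The bounded re-read procedure from the methodology (re-reading a packet for at most n iterations when an inconsistency is detected) can be sketched as follows. This is a minimal sketch under assumptions: the source does not name an integrity check, so CRC-32 from Python's `zlib` module stands in for it, and `read_with_retries` and the `read_packet` callback are hypothetical names.

```python
import zlib


def read_with_retries(read_packet, expected_crc, n):
    """Re-read a packet until its CRC-32 matches, using at most n reads.

    read_packet is a zero-argument callable returning the packet bytes
    (for example, a read from a cache or buffer). Returns (payload, reads_used)
    on success or (None, n) when every read fails the check.
    """
    for attempt in range(1, n + 1):
        payload = read_packet()
        if zlib.crc32(payload) == expected_crc:
            return payload, attempt
    return None, n


good = b"packet"
crc = zlib.crc32(good)               # stored when the packet was written

reads = iter([b"pacxet", good])      # first read is corrupted, second is clean
payload, used = read_with_retries(lambda: next(reads), crc, n=3)
print(payload, used)
```

Only the recovery side is shown here; in the scheme described above, the integrity value would be produced synchronously while recovery points are formed, and the retry loop would run asynchronously after a failed check.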
Reference

[1] M. Azam, N. Ioannides, M. H. Rümmeli, and I. Schagaev, “System Software Support for Router Fault Tolerance,” Networks, no. July 2015, pp. 13–18, 2009.
[2] K. Xu, W. Chen, C. Lin, M. Xu, D. Ma, and Y. Qu, “Toward a practical reconfigurable router: A software component development approach,” IEEE Netw., vol. 28, no. 5, pp. 74–80, 2014.
[3] A. Runge, “Fault-tolerant Network-on-Chip based on Fault-aware Flits and Deflection Routing,” Proc. 9th Int. Symp. Networks-on-Chip - NOCS ’15, no. January, pp. 1–8, 2015.
[4] J. Albrecht, “B 0 → Μ Μ,” vol. 0, pp. 361–366, 2013.
[5] H. S. Castro and O. A. De Lima, “A fault tolerant NoC architecture based upon external router backup paths,” 2013 IEEE 11th Int. New Circuits Syst. Conf. NEWCAS 2013, 2013.
[6] C. Feng, Z. Lu, A. Jantsch, M. Zhang, and Z. Xing, “Addressing transient and permanent faults in NoC with efficient fault-tolerant deflection router,” IEEE Trans. Very Large Scale Integr. Syst., vol. 21, no. 6, pp. 1053–1066, 2013.
[7] S. Y. Jiang, Y. Liu, J. B. Luo, H. Cheng, and G. Luo, “Study of fault-tolerant routing algorithm of NoC based on 2D-Mesh topology,” 2013 IEEE Int. Conf. Appl. Supercond. Electromagn. Devices, ASEMD 2013, no. 41301460, pp. 189–193, 2013.
[8] R. Xie, J. Cai, X. Xin, and B. Yang, “Low-cost adaptive and fault-Tolerant routing method for 2D network-on-chip,” IEICE Trans. Inf. Syst., vol. E100D, no. 4, pp. 910–913, 2017.
[9] Y. Chawathe and E. A. Brewer, “System Support for Scalable and Fault Tolerant,” Manager, no. 12421, pp. 1–34, 1999.
[10] W. Fu, T. Song, S. Wang, and X. Wang, “for Energy Efficient Router,” pp. 139–140, 2012.
[11] T. Meyer, D. Raumer, F. Wohlfart, B. E. Wolfinger, and G. Carle, “Low latency packet processing in software routers,” Proc. 2014 Int. Symp. Perform. Eval. Comput. Telecommun. Syst. SPECTS 2014 - Part SummerSim 2014 Multiconference, pp. 556–563, 2014.
[12] W. Cerroni, C. Raffaelli, and M. Savi, “Optical router architecture to enable next generation network services,” Int. Conf. Transparent Opt. Networks, pp. 1–4, 2011.
[13] V. A. N. D. E. R. Wal, “... Pyroxenite Layers,” vol. 14, no. 7, pp. 839–846, 1992.
[14] Y. Kai, Y. Wang, and B. Liu, “GreenRouter: Reducing power by innovating Router’s architecture,” IEEE Comput. Archit. Lett., vol. 12, no. 2, pp. 51–54, 2013.
[15] K. Li, X. J. Lu, and J. P. Li, “Fast forwarding system for centralized router,” 2008 Int. Conf. Apperceiving Comput. Intell. Anal. ICACIA 2008, no. Mc, pp. 315–318, 2008.