The document discusses the probabilistic behavior of crash-recovery failure detectors and its impact on the quality of service (QoS) they provide in distributed systems. It argues that QoS measures need to be extended to cover two further quantities: the speed at which recoveries are detected and the proportion of failures detected, the latter of which depends on the dependability metrics of the monitored process. Simulations support the analysis, showing that variations in mean time to failure (MTTF) and mean time to recovery (MTTR) significantly affect the QoS of such failure detectors.
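To make the dependence on MTTF and MTTR concrete, the following is a minimal sketch rather than the document's own simulation. It assumes exponentially distributed recovery times and a simplified detector that notices a crash only if the process stays down longer than a fixed timeout. The function names `availability` and `detected_fraction`, the `timeout` parameter, and the distributional assumption are all illustrative and do not come from the source.

```python
import random


def availability(mttf, mttr):
    """Steady-state fraction of time the monitored process is up."""
    return mttf / (mttf + mttr)


def detected_fraction(mttr, timeout, n_failures=100_000, seed=0):
    """Estimate the proportion of crashes a timeout-based detector notices.

    Assumptions (not from the source): downtime after each crash is
    exponential with mean `mttr`, and a crash is detected only if the
    downtime exceeds `timeout`.
    """
    rng = random.Random(seed)
    detected = sum(
        1 for _ in range(n_failures)
        if rng.expovariate(1.0 / mttr) > timeout
    )
    return detected / n_failures


if __name__ == "__main__":
    # Shorter recoveries (smaller MTTR) leave more crashes undetected.
    for mttr in (1.0, 5.0, 20.0):
        print(mttr, availability(100.0, mttr), detected_fraction(mttr, timeout=5.0))
```

Under these assumptions the detected proportion approaches exp(-timeout / MTTR), so a process that recovers quickly relative to the timeout can crash and restart without the detector ever suspecting it.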