Sunday, 7 August 2016

What is a Fault-Tolerant System?


Fault-Tolerant System :-
Fault tolerance is the property that enables a system to continue operating properly when one or more of its components fail. If its operating quality decreases at all, the decrease is proportional to the severity of the failure, whereas in a naively designed system even a small failure can cause total breakdown. Fault tolerance is particularly sought after in high-availability and life-critical systems. The ability to maintain functionality when portions of a system break down is referred to as graceful degradation.[1]
A fault-tolerant design enables a system to continue its intended operation, possibly at a reduced level, rather than failing completely when some part of the system fails.[2] The term is most commonly used to describe computer systems designed to remain more or less fully operational, with perhaps a reduction in throughput or an increase in response time, in the event of a partial failure. That is, the system as a whole is not stopped by problems in either the hardware or the software. An example from another field is a motor vehicle designed so that it remains drivable if one of its tires is punctured. Similarly, a fault-tolerant structure retains its integrity in the presence of damage from causes such as fatigue, corrosion, manufacturing flaws, or impact.
Within the scope of an individual system, fault tolerance can be achieved by anticipating exceptional conditions and building the system to cope with them, and, in general, by aiming for self-stabilization so that the system converges towards an error-free state. However, if the consequences of a system failure are catastrophic, or the cost of making the system sufficiently reliable is very high, a better solution may be to use some form of duplication. In any case, when the consequences of a failure are that severe, the system must be able to use reversion to fall back to a safe mode. This is similar to roll-back recovery but can be a human action if humans are present in the loop.
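As a rough illustration of duplication combined with a safe-mode fallback, the sketch below runs the same computation on several replicas, skips any replica that raises an error, and accepts a result only when a strict majority of all replicas agree. The names vote, SAFE_MODE, healthy, crashing and corrupt are hypothetical and chosen only for this example; real systems use far more elaborate fault detection.

```python
from collections import Counter

# Hypothetical sentinel meaning "no trustworthy answer, revert to safe mode".
SAFE_MODE = "SAFE_MODE"


def vote(replicas, x):
    """Run every replica on x and return the majority result, or SAFE_MODE."""
    results = []
    for replica in replicas:
        try:
            results.append(replica(x))
        except Exception:
            # A replica that raises is treated as a detected fault and skipped.
            continue
    if results:
        value, count = Counter(results).most_common(1)[0]
        # Accept the value only if a strict majority of ALL replicas agree.
        if count > len(replicas) // 2:
            return value
    return SAFE_MODE


# Three illustrative replicas of the same computation (squaring a number).
def healthy(x):
    return x * x


def crashing(x):
    raise RuntimeError("component failed")


def corrupt(x):
    return x * x + 1  # silently returns a wrong answer


print(vote([healthy, healthy, crashing], 3))  # one crash is tolerated
print(vote([healthy, crashing, corrupt], 3))  # no majority, so safe mode
```

With three replicas, this arrangement masks one faulty component, whether it crashes or returns a wrong value. Once two replicas misbehave, it stops guessing and reverts to the safe mode.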

Optimizing Cauchy Reed-Solomon Codes for Fault-Tolerant Storage Applications :- 

Abstract
In the past few years, all manner of storage systems, ranging from disk array systems to distributed and wide-area systems, have started to grapple with the reality of tolerating multiple simultaneous failures of storage nodes. Unlike the single failure case, which is optimally handled with RAID Level-5 parity, the multiple failure case is more difficult because optimal general purpose strategies are not yet known.
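To make the single failure case concrete, here is a minimal sketch of XOR parity in the style of RAID Level-5. It ignores striping and parity rotation, and the block contents and the helper name xor_blocks are assumptions made for illustration only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three illustrative data blocks, as if stored on three different disks.
data = [b"ABCD", b"EFGH", b"IJKL"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data)

# Simulate losing the second disk: XOR the surviving blocks with the parity.
survivors = [data[0], data[2], parity]
recovered = xor_blocks(survivors)

print(recovered == data[1])
```

A single parity block can rebuild any one missing block. If two blocks are lost at the same time, one XOR equation is not enough to solve for both, which is why multiple failures call for stronger erasure codes.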
Erasure coding is the field of research that deals with these strategies, and this field has blossomed in recent years. Despite this research, the decades-old strategy of Reed-Solomon coding remains the only space-optimal (MDS) code for all but the smallest storage systems. The best performing implementations of Reed-Solomon coding employ a variant called Cauchy Reed-Solomon coding, developed in the mid-1990s [BKK+95].
In this paper, we present an improvement to Cauchy Reed-Solomon coding that is based on optimizing the Cauchy distribution matrix. We detail an algorithm for generating good matrices and then evaluate the performance of encoding using all manners of Reed-Solomon coding, plus the best MDS codes from the literature. The improvements over the original Cauchy Reed-Solomon codes are as much as 83% in realistic scenarios, and average roughly 10% over all cases that we tested.
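For readers unfamiliar with the term, the sketch below builds a small Cauchy matrix over the Galois field GF(2^4). It assumes the usual definition, in which the entry in row i and column j is 1/(x_i + y_j) for two disjoint sets X and Y of field elements, with addition being XOR. The field size, the polynomial x^4 + x + 1, the sets X and Y, and all function names are choices made for this illustration. It does not reproduce the paper's optimization algorithm, which is about choosing X and Y well.

```python
W = 4                 # work in GF(2^4), chosen here only to keep numbers small
PRIM_POLY = 0b10011   # x^4 + x + 1, a primitive polynomial for GF(2^4)


def gf_mul(a, b):
    """Multiply two elements of GF(2^W) using shift-and-XOR with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << W):
            a ^= PRIM_POLY  # reduce modulo the primitive polynomial
    return result


def gf_inv(a):
    """Find the multiplicative inverse by brute force (fine for a tiny field)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^W)")
    for candidate in range(1, 1 << W):
        if gf_mul(a, candidate) == 1:
            return candidate


def cauchy_matrix(xs, ys):
    """Entry (i, j) is 1 / (xs[i] + ys[j]); addition in GF(2^W) is XOR."""
    if len(set(xs) | set(ys)) != len(xs) + len(ys):
        raise ValueError("all elements of X and Y must be distinct")
    return [[gf_inv(x ^ y) for y in ys] for x in xs]


X = [0, 1]        # illustrative: one element per coding device
Y = [2, 3, 4]     # illustrative: one element per data device
matrix = cauchy_matrix(X, Y)

for row in matrix:
    print(row)
```

Because X and Y are disjoint, x_i + y_j is never zero, so every entry is well defined. Different choices of X and Y give different matrices over the same field, and that freedom is what an optimization of the distribution matrix can exploit.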

