Apr 14, 2024








The Failure Trace Archive (FTA) is a centralized public repository of availability traces of distributed systems, together with tools for their analysis. The purpose of the archive is to facilitate the design, validation, and comparison of fault-tolerant models and algorithms.

In particular, the FTA contains the following:

  • availability traces of distributed systems, differing in scale, volatility, and usage
  • a standard format for failure traces
  • scripts and tools for analyzing these traces

The FTA allows the following:

  • the comparison and cross-validation of a fault-tolerant model or algorithm across identical trace data sets
  • the evaluation of the generality of a model or algorithm across different types of resources (in terms of reliability or user base, for example)
  • the evaluation of the generality of a failure trace, i.e., determining whether measurements are biased toward a particular platform or middleware
  • the determination of which trace data set is most interesting or applicable for a given algorithm or model
  • the analysis of the evolution of availability in different systems across long timescales
  • the integration of failure models with other types of models (such as workloads)
  • the incorporation of traces with a common format into fault simulators or emulators for model or algorithm evaluation
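
To make the kind of analysis listed above more concrete, the following is a minimal sketch of computing per-node availability statistics from a trace. It does not use the FTA's standard format or any of its tools: the tuple layout `(node_id, start, end)`, the node names, the `summarize` function, and the `horizon` parameter are all assumptions made up for this illustration.

```python
from statistics import mean

# Hypothetical trace: each tuple is (node_id, start, end), an interval during
# which the node was available. This layout is an assumption for illustration
# and is NOT the FTA's standard trace format.
trace = [
    ("node-a", 0, 40),
    ("node-a", 50, 100),
    ("node-b", 0, 10),
    ("node-b", 30, 60),
    ("node-b", 80, 100),
]


def summarize(trace, horizon):
    """Return per-node availability statistics over an observation window."""
    by_node = {}
    for node, start, end in trace:
        by_node.setdefault(node, []).append((start, end))

    summary = {}
    for node, intervals in by_node.items():
        intervals.sort()
        # Length of each availability interval.
        up = [end - start for start, end in intervals]
        # Gap between the end of one interval and the start of the next.
        down = [nxt[0] - cur[1] for cur, nxt in zip(intervals, intervals[1:])]
        summary[node] = {
            "availability": sum(up) / horizon,
            "mean_uptime": mean(up),
            "mean_downtime": mean(down) if down else 0.0,
        }
    return summary


summary = summarize(trace, horizon=100)
for node, stats in summary.items():
    print(node, stats)
```

Because every trace in this sketch shares one layout, the same function can be run unchanged on data from different systems, which is the practical benefit of a common trace format.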
