Abstract

Real-time control systems (RTCSs) tolerate delay and crash faults by replicating the controller. Each replica computes setpoints and issues them to actuators over a network that might drop or delay messages. Hence, the actuators might receive an inconsistent set of setpoints. Such inconsistency is avoided either by having a single primary replica compute and issue setpoints (in passive replication) or by having a consensus algorithm select one sending replica (in active replication). However, because a perfect failure detector is impossible, passive-replication schemes can have multiple primaries, which causes inconsistency, especially in the presence of intermittent delay faults. Furthermore, the impossibility of bounded-latency consensus causes both schemes to have poor real-time performance. We identify three properties of RTCSs that enable active-replication schemes to agree on the measurements before computing, instead of using traditional consensus. As all computing replicas compute with the same state, the resulting setpoints are guaranteed to be consistent. We present the design of Quarts, an agreement solution for active replication that guarantees consistency and a bounded latency overhead. We prove these guarantees and compare the performance of Quarts with existing solutions through simulation. We show that Quarts provides higher availability than existing solutions, with an improvement of up to 10x with two replicas.