Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ


Choose your language

InfoQ Homepage News IronFleet: A Methodology for Proving Distributed Systems

IronFleet: A Methodology for Proving Distributed Systems

A group of researchers from Microsoft has published the paper “IronFleet: Proving Practical Distributed Systems Correct” (PDF) and made available the accompanying source code demonstrating the use of the methodology in machine proving the correctness of a non-trivial distributed system from a safety and liveliness point of view.

Given the difficulty of building correct distributed systems due to the possibility of bugs generated by the parallel execution of code on multiple machines, a Microsoft research team has developed IronFleet, a methodology for automatically verifying such systems. Going beyond specifying and verifying distributed protocols, IronFleet “guarantees” that the implementation of a distributed system satisfies the demands of a centralized specification, ruling out “race conditions, violations of global invariants, integer overflow, disagreements between packet encoding and decoding, and bugs in rarely exercised code paths such as failure recovery.” The aim is to determine if a distributed system is “correct” and it will behave properly at all times.

IronFleet proves both the safety - the correctness of actions performed - and liveliness – the usefulness of the actions - properties of a distributed system. According to the paper, IronFleet does that by the

comprehensive verification of complex distributed systems via a methodology for structuring and writing proofs about them, as well as a collection of generic verified libraries useful for implementing such systems. Structurally, IronFleet’s methodology uses a concurrency containment strategy that blends two distinct verification styles within the same automated theorem-proving framework, preventing any semantic gaps between them. We use TLA-style state-machine refinement [TLA - Temporal Logic of Actions] to reason about protocol-level concurrency, ignoring implementation complexities, then use Floyd-Hoare-style imperative verification to reason about those complexities while ignoring concurrency. To simplify reasoning about concurrency, we impose a machine checked reduction-enabling obligation on the implementation. Finally, we structure our protocols using always-enabled actions to greatly simplify liveliness proofs.

Such systems are written in Dafny, a language and program verification tool used to determine the functional correctness of a program. After verification, the code can be compiled to C# or assembly code to remove any dependence on .NET or Windows. Dafny could be enhanced to compile the code to other languages such as C++.

IronFleet has been used to demonstrate the correctness of IronRSL – a distributed Paxos state machine – and IronKV – a sharded key-value store-. The source code of IronFleet and of the verified systems is available on GitHub.

Rate this Article