Permissioned ledgers such as Fabric employ consensus to avoid single points of failure without endangering the consistency of the blockchain. Much attention has been given to consensus algorithms and to their scalability, performance, and fault model (crash or Byzantine). Far less attention has been paid to auxiliary protocols, such as state transfer and reconfiguration, that any consensus “engine” must also support.
In this talk, I’ll discuss essential performance and scalability aspects of well-known crash and Byzantine fault-tolerant consensus protocols such as Raft and PBFT, illustrate the limitations of these protocols when recovering crashed nodes and reconfiguring the group, and show how to cope with these limitations.
The last part of the talk will report my team’s experience designing and implementing the first Byzantine fault-tolerant ordering service for Fabric 1.1, showing how the issues above affected our system and pointing out some features that the project should consider in order to avoid them. I’ll close the presentation with some suggestions for making Fabric friendlier to new ordering service implementations.