For folks who care about what’s possible in distributed computing: Peter Alvaro and I wrote an introduction to the CALM Theorem and subsequent work that is now up on arXiv. The CALM Theorem formally characterizes the class of programs that can achieve distributed consistency without the use of coordination.
— Joe Hellerstein (Cross-posted from databeta.wordpress.com.)
I spent a good fraction of my academic life in the last decade working on a deeper understanding of how to program the cloud and other large-scale distributed systems. I was enormously lucky to collaborate with and learn from amazing friends over this period in the BOOM project, and see our work picked up and extended by new friends and colleagues.
Our research was motivated by simple questions, chief among them this:
Q: “What is the hardest thing about distributed systems?”
A: “Coordination and consistency.”
Protocols like Two-Phase Commit, Paxos and their myriad offspring are celebrated for being tricky, and as such form the backbone of academic classes on distributed computing. But trickiness is not a hallmark of good software design. In practice, coordination is the source of much of the complexity and inefficiency of distributed systems, and it is avoided when possible by good engineers.
So we moved to a more fundamental question:
Q: When can we correctly avoid coordination, and when are we absolutely required to use it?
A (circa 2010): Unknown.
Surprisingly, this computability question was one that the pioneers of distributed systems never answered, at least not in any sense of algorithms or program semantics. The discussion in the literature was cast in terms of “memory models” or “storage consistency” guarantees so low down the stack as to be irrelevant and unhelpful to most application designers.
In a keynote talk at PODS 2010, I proposed an answer to this open question. I conjectured—based on my team’s experience with streaming queries and declarative networking—that coordination was needed if and only if you had a computational task that could not be expressed with monotonic logic. I called this idea CALM: Consistency as Logical Monotonicity. Not long thereafter a formalization and proof of the CALM Theorem was provided by Ameloot, Neven and Van den Bussche over at Hasselt University in Belgium. Related work ensued across both sides of the Atlantic on additional theoretical results and practical uses of the idea for program analysis.
I sense that this body of work deserves more attention today, when distributed computing is becoming the norm rather than the exception. CALM provides a formal basis for a myriad of conversations over the last 15 years regarding what is possible to get correct with “eventual consistency”, “noSQL”, “commutativity”, “ACID 2.0”, “CRDTs” and other pragmatics. It provides the nuanced answer to screeds about “beating the CAP Theorem”. It also lays the groundwork for what we did later with the Bloom language: provide a programming model where the really hard issues of distributed programming are first-order concerns of the language and its syntax.
To bring these issues to a wider audience, I sat down with the inimitable Peter Alvaro to write up what we hope is an approachable but sufficiently meaty intro to the CALM Theorem, its implications, and the many open questions remaining. It took a while for this to get to the top of our stacks, but the paper is now up on arXiv.
We’re spinning up a new generation of work on cloud programming here at the RISELab that builds on these lessons. Watch this space!