Immutability dramatically simplifies fault tolerance, straggler mitigation, and data consistency and is an essential part of widely-used distributed batch analytics systems including MapReduce, Dryad, and Spark. However, these systems are increasingly being used for new applications like stream processing and incremental analytics, which often demand fine-grained updates and are seemingly at odds with the essential assumption of immutability. We introduce the persistent adaptive radix tree (PART), a map data structure that supports efficient fine-grained updates without compromising immutability. In addition, PART (1) allows applications to trade off latency for throughput using batching, (2) supports efficient scans using an optimized memory layout and periodic compaction, and (3) achieves efficient fault recovery using incremental checkpoints. PART achieves update performance comparable to a mutable hash table, scan performance within 2x of a B-tree due to compaction, and memory usage within 2x of optimal. Across a variety of distributed applications, PART improves upon the performance of state-of-the-art immutable and even mutable systems by anywhere from 50x to 4 orders of magnitude.
Ankur Dave
Ion Stoica
Joey Gonzalez