Confluo is a system for real-time monitoring and analysis of data streams. Confluo achieves three desirable properties for such systems:
- High-throughput concurrent writes of millions of data points from multiple data streams
- Online queries at millisecond timescales
- Ad-hoc queries using minimal CPU resources

The key technical contribution in Confluo is a new data structure, the Atomic MultiLog, that supports efficiently updating a collection of lock-free concurrent logs as a single atomic operation. Confluo achieves the three properties above by using Atomic MultiLogs to store data, aggregate statistics, and materialized views, along with hardware primitives for efficient atomic updates on Atomic MultiLogs. Confluo supports a wide range of real-time streaming applications, from network monitoring and diagnosis tools to time-series databases.
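
To make the idea of updating several logs as one atomic operation more concrete, here is a minimal C++ sketch. It is an illustration under stated assumptions, not Confluo's implementation or API: the names `ToyMultiLog` and `Record` are hypothetical, the logs have a fixed capacity, and the "hardware primitives" are assumed to be atomic fetch-add and compare-and-swap as exposed by `std::atomic`. Two logs (timestamps and values) share a single write tail and a single read tail, so a record becomes visible in both logs at the same moment.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical toy type for illustration; not part of Confluo's API.
struct Record {
  std::int64_t timestamp;
  std::int64_t value;
};

// Two fixed-capacity logs (timestamps and values) that share one write tail
// and one read tail, so a record becomes visible in both logs at once.
template <std::size_t Capacity>
class ToyMultiLog {
 public:
  // Appends one record to both logs and returns the slot it occupies.
  std::size_t append(std::int64_t timestamp, std::int64_t value) {
    // One atomic fetch-add reserves the same slot in every log.
    const std::size_t slot =
        write_tail_.fetch_add(1, std::memory_order_relaxed);
    assert(slot < Capacity && "toy log is full");

    // Each writer owns its slot exclusively, so plain stores are enough here.
    timestamps_[slot] = timestamp;
    values_[slot] = value;

    // Publish in slot order: advance the read tail from slot to slot + 1.
    // The loop spins until all earlier writers have published.
    std::size_t expected = slot;
    while (!read_tail_.compare_exchange_weak(expected, slot + 1,
                                             std::memory_order_acq_rel,
                                             std::memory_order_relaxed)) {
      expected = slot;
    }
    return slot;
  }

  // Number of records that are fully written and visible to readers.
  std::size_t size() const {
    return read_tail_.load(std::memory_order_acquire);
  }

  // Copies the record at `slot` into `out`; returns false if the slot is
  // not yet visible to readers.
  bool read(std::size_t slot, Record& out) const {
    if (slot >= size()) return false;
    out = Record{timestamps_[slot], values_[slot]};
    return true;
  }

 private:
  std::array<std::int64_t, Capacity> timestamps_{};
  std::array<std::int64_t, Capacity> values_{};
  std::atomic<std::size_t> write_tail_{0};
  std::atomic<std::size_t> read_tail_{0};
};
```

In this sketch, readers only ever look at slots below the read tail, so they never observe a record that has been written to one log but not the other. The trade-off of this simplified publish step is that a writer waits for earlier writers to finish before advancing the read tail; a production design would need to handle that case, as well as growth beyond a fixed capacity, more carefully.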