Confluo: Distributed Monitoring and Diagnosis Stack for High-speed Networks

Anurag Khandelwal

Confluo is an end-host stack that can be integrated with existing network management tools to enable monitoring and diagnosis of network-wide events using telemetry data distributed across end-hosts, even for high-speed networks. Confluo achieves this using a new data structure, Atomic MultiLog, that supports highly concurrent read-write operations by exploiting two properties specific to telemetry data: (1) once processed by the stack, the data is neither updated nor deleted; and (2) each field in the data has a fixed, pre-defined size. Our evaluation results show that, for packet sizes of 128B or larger, Confluo executes thousands of triggers and tens of filters at line rate (for 10Gbps links) using a single core.
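
To illustrate why the two properties above help with concurrency, here is a minimal sketch of an append-only log of fixed-size records. It is not Confluo's Atomic MultiLog: the class name AppendOnlyLog, the record layout (timestamp, source IP, packet size), and the write-tail/read-tail bookkeeping are assumptions made for this example, and a Python lock stands in for the atomic instructions a real implementation would likely use. Because records are never updated or deleted, a reader can safely access any record below the read tail without locking, and because every field has a fixed size, a record's byte offset is simply its index times the record size.

```python
import struct
import threading

# Hypothetical fixed-size record: timestamp (uint64), source IP (uint32),
# packet size (uint16). "<" means little-endian with no padding.
RECORD = struct.Struct("<QIH")


class AppendOnlyLog:
    """Toy append-only log of fixed-size records (illustrative only)."""

    def __init__(self, capacity):
        self._buf = bytearray(capacity * RECORD.size)
        self._capacity = capacity
        self._write_tail = 0            # next slot handed out to a writer
        self._read_tail = 0             # records below this index are readable
        self._done = set()              # slots written but not yet published
        self._lock = threading.Lock()   # stand-in for atomic instructions

    def append(self, timestamp, src_ip, pkt_size):
        # Step 1: reserve a slot (conceptually an atomic fetch-and-add).
        with self._lock:
            idx = self._write_tail
            if idx >= self._capacity:
                raise OverflowError("log is full")
            self._write_tail += 1
        # Step 2: write the record outside the lock; no other writer owns
        # this slot, and readers cannot see it yet.
        RECORD.pack_into(self._buf, idx * RECORD.size,
                         timestamp, src_ip, pkt_size)
        # Step 3: publish by advancing the read tail past every contiguous
        # completed slot.
        with self._lock:
            self._done.add(idx)
            while self._read_tail in self._done:
                self._done.remove(self._read_tail)
                self._read_tail += 1
        return idx

    def get(self, idx):
        # Published records are immutable, so no lock is taken here.
        if not 0 <= idx < self._read_tail:
            raise IndexError("record not yet published")
        return RECORD.unpack_from(self._buf, idx * RECORD.size)

    def __len__(self):
        return self._read_tail
```

In this sketch, several writer threads can call append concurrently while readers call get on any published index. The only shared state that writers contend on is the pair of tail counters, which is the kind of simplification that immutable, fixed-size data makes possible.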

Published On: February 26, 2019

Presented At/In: USENIX Symposium on Networked Systems Design and Implementation (NSDI'19)

Link: https://people.eecs.berkeley.edu/~anuragk/papers/confluo.pdf

Authors: Anurag Khandelwal, Rachit Agarwal, Ion Stoica