RISE Seminar 10/25/19: RADE: Resource-Efficient Supervised Anomaly Detection Using Decision Tree Based Ensemble Methods, a talk by Yaniv Ben-Itzhak (VMware)

October 25, 2019

Title: RADE: Resource-Efficient Supervised Anomaly Detection Using Decision Tree Based Ensemble Methods

Speaker: Yaniv Ben-Itzhak (VMware)

Date and location: Friday, October 25, 11 am – 12 pm, Wozniak Lounge

Abstract: Decision-tree-based ensemble classification methods (DTEMs) are a prevalent tool for supervised anomaly detection. However, as datasets continue to grow, DTEMs suffer from increasing drawbacks: larger memory footprints, longer training times, and higher classification latency with lower throughput. In this paper, we design and evaluate RADE, a DTEM-based anomaly detection framework that augments standard DTEM classifiers and alleviates these drawbacks by relying on two observations: (1) a small (coarse-grained) DTEM model is sufficient to classify the majority of classification queries correctly, where a classification is considered valid only if its confidence level is greater than or equal to a predetermined classification confidence threshold; (2) in the fewer, harder cases where the coarse-grained DTEM model has insufficient confidence in its classification, we can improve the result by forwarding the classification query to one of the expert (fine-grained) DTEM models, each of which is explicitly trained for its particular case. We implement RADE in Python based on scikit-learn and evaluate it over different DTEM methods (RF, XGBoost, AdaBoost, GBDT, and LightGBM) and over three publicly available datasets. Our evaluation on both a powerful AWS EC2 instance and a Raspberry Pi 3 device indicates that RADE offers competitive and often superior anomaly detection capabilities compared to standard DTEM methods, while significantly improving memory footprint (by up to 5.46×), training time (by up to 17.2×), and classification latency (by up to 31.2×).
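The two observations in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the RADE implementation: the threshold value, the model sizes, the synthetic dataset, the helper names fit_cascade and predict_cascade, and the choice to train one expert per coarse-predicted class on the low-confidence training samples are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

THRESHOLD = 0.9  # hypothetical confidence threshold, not a value from the talk


def fit_cascade(X, y, threshold=THRESHOLD, seed=0):
    """Fit a small coarse model plus (assumed) one expert per coarse-predicted class."""
    coarse = RandomForestClassifier(n_estimators=10, max_depth=4, random_state=seed)
    coarse.fit(X, y)
    proba = coarse.predict_proba(X)
    confident = proba.max(axis=1) >= threshold
    guess = coarse.classes_[proba.argmax(axis=1)]
    experts = {}
    for label in coarse.classes_:
        hard = ~confident & (guess == label)
        # Only train an expert when the hard subset contains more than one true class.
        if hard.any() and len(np.unique(y[hard])) > 1:
            expert = RandomForestClassifier(n_estimators=50, random_state=seed)
            experts[label] = expert.fit(X[hard], y[hard])
    return coarse, experts


def predict_cascade(coarse, experts, X, threshold=THRESHOLD):
    """Accept confident coarse predictions; forward the rest to an expert."""
    proba = coarse.predict_proba(X)
    coarse_pred = coarse.classes_[proba.argmax(axis=1)]
    confident = proba.max(axis=1) >= threshold
    pred = coarse_pred.copy()
    forwarded = np.zeros(len(X), dtype=bool)
    for label, expert in experts.items():
        # Route on the coarse prediction, not on already-overwritten labels.
        hard = ~confident & (coarse_pred == label)
        if hard.any():
            pred[hard] = expert.predict(X[hard])
            forwarded |= hard
    return pred, forwarded


# Synthetic, imbalanced stand-in for an anomaly detection dataset (assumption).
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

coarse, experts = fit_cascade(X_train, y_train)
pred, forwarded = predict_cascade(coarse, experts, X_test)
print("fraction forwarded to experts:", forwarded.mean())
print("accuracy:", (pred == y_test).mean())
```

In this sketch, queries that clear the threshold are answered by the small model alone, so the larger expert models are only consulted for the low-confidence remainder. Raising the hypothetical THRESHOLD forwards more queries to the experts.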

(Under submission to the SysML 2020 conference)

Bio: Dr. Yaniv Ben-Itzhak is a senior researcher in the VMware Research Group. His research interests were originally in the computer networks domain, including data-center networks, Software Defined Networks (SDN), network virtualization, and Network Function Virtualization (NFV). Yaniv now works on system enhancements for machine learning, for instance by introducing methods from the computer networks domain into ML models. He received his B.Sc. (magna cum laude), M.Sc., and Ph.D. degrees in Electrical Engineering from the Technion – Israel Institute of Technology.

Link to the video recording of the talk: https://www.youtube.com/watch?v=ScGD5ZBc9gY&feature=youtu.be