Selecting the Best VM across Multiple Public Clouds: A Data-Driven Performance Modeling Approach

Neeraja Yadwadkar

Users of cloud services are presented with a bewildering choice of VM types and the choice of VM can have significant implications on performance and cost. In this paper we address the fundamental problem of accurately and economically choosing the best VM for a given workload and user goals. To address the problem of optimal VM selection, we present PARIS, a data-driven system that uses a novel hybrid offline and online data collection and modeling framework to provide accurate performance estimates with minimal data collection. PARIS is able to predict workload performance for different user-specified metrics, and resulting costs for a wide range of VM types and workloads across multiple cloud providers. When compared to a sophisticated baseline linear interpolation model using measured workload performance on two VM types, PARIS produces significantly better estimates of performance. For instance, it reduces runtime prediction error by a factor of 4 for some workloads on both AWS and Azure. The increased accuracy translates into a 45% reduction in user cost while maintaining performance.

Published On: September 25, 2017

Presented At/In: ACM Symposium on Cloud Computing 2017 (SoCC '17)

Download Paper: https://rise.cs.berkeley.edu/wp-content/uploads/2017/07/paris_socc17.pdf

Link: https://people.eecs.berkeley.edu/~neerajay/paris_socc17.pdf

Authors: Neeraja Yadwadkar, Bharath Hariharan, Joseph Gonzalez, Burton Smith, Randy Katz