Part of the Berkeley tradition, and of the RISELab mission, is to release open source software as part of our research agenda. Six months after launching the lab, we’re excited to announce initial v0.1 releases of three RISELab open-source systems: Clipper, Ground and Ray.
- Clipper is an open-source prediction-serving system. Clipper simplifies deploying models from a wide range of machine learning frameworks by exposing a common REST interface and automatically ensuring low-latency, high-throughput predictions. In the 0.1 release, we focused on reliable support for serving models trained in Spark and Scikit-Learn. In the next release we will introduce support for TensorFlow and Caffe2, as well as online personalization and multi-armed bandits. We are providing active support for early users and will be following GitHub issues closely. To get started with Clipper and learn more about the system, visit our website at http://clipper.ai; the research underlying the system is described in the Clipper paper at NSDI 2017.
- Ground is an open-source, vendor-neutral data context service, developed in collaboration with a community of contributors. We use the phrase data context to refer to the full complement of information surrounding the use of data in an organization. Ground is an effort to enable users and applications to understand what data they have, who is using that data, when and how the data is changing, and where the data is moving to and from. Initial use cases for Ground include an organization-wide Global Data Inventory (in collaboration with our sponsors at Capital One) and lifecycle management for machine learning models (in collaboration with the RISELab Clipper team). To learn more about the design principles behind Ground, please visit the Ground website at http://ground-context.org and read our CIDR 2017 publication.
- Ray is a new distributed framework to enable next-generation AI applications, with a focus on the reinforcement learning (RL) techniques that have recently seen radical success playing Atari games and board games (Google’s AlphaGo), and which are finding their way into medical diagnosis, self-driving cars, and robotics. RL applications need to perform hundreds of thousands of simulations per second to evaluate the current policy, implement complex distributed algorithms to update the policy, and process inputs from different sensors in parallel and in real time to capture the state of the environment. Supporting this functionality requires a cluster computing engine that can handle millions of millisecond-granularity tasks that form a dynamic task graph. We hope that Ray’s initial release takes a first step towards addressing these challenges.
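
To make the idea of a dynamic task graph concrete, here is a minimal sketch in plain Python. It does not use Ray's API: a standard-library thread pool stands in for the cluster engine, and `simulate`, `update_policy`, `train`, and the `spread_threshold` parameter are all hypothetical names invented for this illustration. The point is only the shape of the workload: many small simulation tasks fan out, and whether more tasks are launched depends on the results of earlier ones.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(policy, seed):
    """Hypothetical stand-in for one short simulation (rollout) of a policy."""
    # Deterministic toy reward so the example is reproducible.
    return policy * ((seed % 5) + 1)


def update_policy(policy, rewards):
    """Hypothetical policy update: nudge the policy by the mean reward."""
    return policy + 0.1 * (sum(rewards) / len(rewards))


def train(initial_policy, rounds, rollouts_per_round, spread_threshold=3.0):
    """Run a toy training loop; return (final_policy, tasks_launched)."""
    policy = initial_policy
    tasks_launched = 0
    with ThreadPoolExecutor(max_workers=4) as pool:
        for round_index in range(rounds):
            base = round_index * rollouts_per_round
            # Fan out: many small, independent simulation tasks.
            futures = [
                pool.submit(simulate, policy, base + i)
                for i in range(rollouts_per_round)
            ]
            rewards = [f.result() for f in futures]
            tasks_launched += len(futures)

            # Dynamic step: the graph grows only if the results call for it.
            if max(rewards) - min(rewards) > spread_threshold:
                extra = [
                    pool.submit(simulate, policy, 1000 + base + i)
                    for i in range(rollouts_per_round)
                ]
                rewards += [f.result() for f in extra]
                tasks_launched += len(extra)

            # Fan in: a single task consumes all rollout results.
            policy = update_policy(policy, rewards)
    return policy, tasks_launched


if __name__ == "__main__":
    final_policy, launched = train(1.0, rounds=2, rollouts_per_round=5)
    print(final_policy, launched)
```

Because the number of tasks is decided at run time, the full graph cannot be laid out before execution starts, which is the property that distinguishes this kind of workload from a static dataflow job.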
We will be following up with posts from each of these teams soon.
These three emerging projects represent a cross-section of systems efforts in the RISELab, and a taste of things to come. Like all RISELab projects, they are available as open source on GitHub, and we eagerly invite you to try them out and give us feedback.