We’re excited to be releasing v0.1 of the Ground project! Ground is a data context service. It is a central repository for all the information surrounding the use of data in an organization. Ground concerns itself with what data an organization has, where that data is, who (both human beings and software systems) is touching that data, and how that data is being modified and described. Above all, Ground aims to be an open-source, vendor neutral system that provides users an unopinionated metamodel and set of APIs that allow them to think about and interact with data context generated in their organization. Ground has many use cases, but we’re focused on two specific ones at present: Data Inventory: large organizations…
RISELab Announces 3 Open Source Releases
Part of the Berkeley tradition—and the RISELab mission—is to release open source software as part of our research agenda. Six months after launching the lab, we’re excited to announce initial v0.1 releases of three RISElab open-source systems: Clipper, Ground and Ray. Clipper is an open-source prediction-serving system. Clipper simplifies deploying models from a wide range of machine learning frameworks by exposing a common REST interface and automatically ensuring low-latency and high-throughput predictions. In the 0.1 release, we focused on reliable support for serving models trained in Spark and Scikit-Learn. In the next release we will be introducing support for TensorFlow and Caffe2 as well as online-personalization and multi-armed bandits. We are providing active support for early users and will be following Github issues…
Metadata Megafail: Messing up Your Data Strategy in 3 Easy Steps
A key aspect of the RISELab agenda is to aggressively harness data—lots of it, both historical and live. Of course bits in computers don’t provide value on their own. We need a broader context for data: where it came from, what it represents, and how it gets used. Traditionally, people called this metadata: the data about our data. Requirements for metadata have changed drastically in recent years in response to technology trends. There’s an emerging groundswell to address these new requirements and explore new opportunities. This includes our work on the broader notion of data context in the Ground system. How should data-driven organizations respond to these changing requirements? In the tradition of Berkeley advice like how to build a bad research center and…