This blog post was originally posted here. View the code on Gist.
This post was originally posted here. Robert Nishihara and Philipp Moritz are graduate students in the RISElab at UC Berkeley. This post elaborates on the integration between Ray and Apache Arrow. The main problem this addresses is data serialization. From Wikipedia, serialization is … the process of translating data structures or object state into a format that can be stored … or transmitted … and reconstructed later (possibly in a different computer environment). Why is any translation necessary? Well, when you create a Python object, it may have pointers to other Python objects, and these objects are all allocated in different regions of memory, and all of this has to make sense when unpacked by another process on another machine. Serialization and deserialization …
This was originally posted on the Ray blog. We are pleased to announce the Ray 0.2 release. This release includes the following: substantial performance improvements to the Plasma object store an initial Jupyter notebook based web UI the start of a scalable reinforcement learning library fault tolerance for actors Plasma Since the last release, the Plasma object store has moved out of the Ray codebase and is now being developed as part of Apache Arrow (see the relevant documentation), so that it can be used as a standalone component by other projects to leverage high-performance shared memory. In addition, our Arrow-based serialization libraries have been moved into pyarrow (see the relevant documentation). In 0.2, we’ve increased the write throughput of the object store …
Ray 0.2 has been released: https://ray-project.github.io/2017/09/30/ray-0.2-release.html