RISE Seminar 1/25/18: Tiny functions for codecs, compilation, and (maybe) soon everything, talk by Keith Winstein
January 25, 2018
Title: Tiny functions for codecs, compilation, and (maybe) soon everything
By: Keith Winstein
Affiliation: Professor, Stanford Computer Science
Where/When: Thursday Jan 25 noon-1pm Wozniak Lounge (430 Soda Hall)
Abstract: Networks, applications, and media codecs frequently treat one another as strangers. By expressing large systems as compositions of small, pure functions, we’ve found it’s possible to achieve tighter couplings between these components, improving performance without giving up modularity or the ability to debug. I’ll discuss our experience with systems that demonstrate this basic idea:
ExCamera (NSDI 2017) parallelizes video encoding into thousands of tiny tasks, each handling a fraction of a second of video (much shorter than the interval between key frames) and executing in parallel on AWS Lambda. This was the first system to demonstrate “burst-parallel” thousands-way computation on functions-as-a-service infrastructure.
Salsify (NSDI 2018) is a low-latency network video system that uses a purely functional video codec to explore execution paths of the encoder without committing to them, allowing it to closely match the capacity estimates from a video-aware transport protocol. This architecture outperforms more loosely coupled applications (Skype, FaceTime, Hangouts, WebRTC) in delay and visual quality, and suggests that while improvements in video codecs may have reached the point of diminishing returns, video systems still have low-hanging fruit.
Lepton (NSDI 2017) uses a purely functional JPEG/VP8 transcoder to compress images in parallel across a distributed network filesystem with arbitrary block boundaries. This free-software system is in production at Dropbox and has compressed more than 200 petabytes of user JPEGs by 23%.
Based on our experience, we propose a general abstraction for outsourced morsels of computation, called cloud “thunks” — stateless closures that describe their data dependencies by content-hash. We have created a tool that uses this abstraction to capture off-the-shelf Makefiles and other build systems, letting the user treat a FaaS service like an outsourced build farm with global memoization of results. The bottom line: expressing systems and protocols as compositions of small, pure functions will lead to a new wave of “general-purpose” lambda computing, permitting us to transform many time-consuming operations into large numbers of functions executing with massive parallelism for short durations in the cloud.
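To make the "thunk" idea concrete, here is a minimal Python sketch, not the speaker's tool. It assumes a thunk is just a function name plus the content hashes of its inputs, and that results are memoized under the hash of the thunk itself. The names `BLOBS`, `RESULTS`, `FUNCTIONS`, `put_blob`, `make_thunk`, and `force` are hypothetical, and the in-memory dictionaries stand in for what would be remote storage and a FaaS invocation in a real deployment.

```python
import hashlib
import json

# Hypothetical in-memory stand-ins for a content-addressed blob store
# and a global result cache (assumptions, not the speaker's design).
BLOBS = {}    # content hash -> bytes
RESULTS = {}  # thunk hash -> content hash of the output
CALLS = {"count": 0}  # counts real executions, to show memoization


def put_blob(data: bytes) -> str:
    """Store data under its SHA-256 digest and return that digest."""
    digest = hashlib.sha256(data).hexdigest()
    BLOBS[digest] = data
    return digest


def make_thunk(function_name, input_hashes):
    """A thunk names a pure function and its inputs by content hash only."""
    return {"function": function_name, "inputs": list(input_hashes)}


def thunk_hash(thunk) -> str:
    """Hash a canonical encoding of the thunk so equal thunks share a key."""
    encoded = json.dumps(thunk, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Illustrative pure functions; each maps a list of input blobs to one blob.
FUNCTIONS = {
    "concat": lambda blobs: b"".join(blobs),
    "upper": lambda blobs: blobs[0].upper(),
}


def force(thunk) -> str:
    """Evaluate a thunk, reusing any previously memoized result."""
    key = thunk_hash(thunk)
    if key in RESULTS:
        return RESULTS[key]
    inputs = [BLOBS[h] for h in thunk["inputs"]]
    CALLS["count"] += 1
    output = FUNCTIONS[thunk["function"]](inputs)
    RESULTS[key] = put_blob(output)
    return RESULTS[key]


# Usage: chain two thunks; the second depends on the first's output hash.
a = put_blob(b"hello ")
b = put_blob(b"world")
joined = force(make_thunk("concat", [a, b]))
shouted = force(make_thunk("upper", [joined]))
```

Because a thunk refers to its inputs only by hash and the functions are pure, any worker holding the named blobs could evaluate it, and an identical thunk submitted later resolves from the cache without running again.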
Bio: Keith Winstein is an assistant professor of computer science and, by courtesy, of law at Stanford University. His research group designs networked systems that cross traditional abstraction boundaries, using statistical and functional techniques. He and his colleagues made the Mosh (mobile shell) tool, the Sprout and Remy systems for computer-generated congestion control, the Mahimahi network emulator, and the Lepton JPEG-recompression tool. Winstein has received the USENIX NSDI Community Award (2017), a Google Faculty Research Award (2017, 2015), a Facebook Faculty Award (2016), the ACM SIGCOMM Doctoral Dissertation Award (2015), and the Applied Networking Research Prize (2013). Winstein previously served as a staff reporter at The Wall Street Journal (2007-2009).