RISE Seminar 2/1/18 : Yangqing Jia – Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective

February 1, 2018

Title: Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective

By: Yangqing Jia

Affiliation: Facebook AI

Where/When: Thursday, Feb 1, noon-1pm, Wozniak Lounge (430 Soda Hall)

Abstract: Machine learning sits at the core of many essential products and services at Facebook. This paper describes the hardware and software infrastructure that supports machine learning at global scale. Facebook’s machine learning workloads are extremely diverse: services require many different types of models in practice. This diversity has implications at all layers in the system stack. In addition, a sizable fraction of all data stored at Facebook flows through machine learning pipelines, presenting significant challenges in delivering data to high-performance distributed training flows. Computational requirements are also intense, leveraging both GPU and CPU platforms for training and abundant CPU capacity for real-time inference. Addressing these and other emerging challenges continues to require diverse efforts that span machine learning algorithms, software, and hardware design.

Bio: Yangqing Jia obtained his PhD from Berkeley in 2013 in Prof. Trevor Darrell’s group. He is currently a research scientist manager leading Facebook’s AI platform team. The team serves as the backbone of Facebook AI products, such as ranking, computer vision, natural language processing, speech recognition, mobile AI, and AR. He has worked on a series of deep learning frameworks over the years, including Caffe and TensorFlow. Before Facebook, Yangqing was a research scientist at Google Brain.