Parametrized Hierarchical Procedures for Neural Programming

Roy Fox Deep Learning, Human in the Loop, Theoretical ML

Neural programs are highly accurate and structured policies that perform algorithmic tasks by controlling the behavior of a computation mechanism. Despite the potential to increase the interpretability and the compositionality of the behavior of artificial agents, it remains difficult to learn from demonstrations neural networks that represent computer programs. The main challenges that set algorithmic domains apart from other imitation learning domains are the need for high accuracy, the involvement of specific structures of data, and the extremely limited observability. To address these challenges, we propose to model programs as Parametrized Hierarchical Procedures (PHPs). A PHP is a sequence of conditional operations, using a program counter along with the observation to select between taking an elementary action, invoking another PHP as a sub-procedure, and returning to the caller. We develop an algorithm for training PHPs from a set of supervisor demonstrations, only some of which are annotated with the internal call structure, and apply it to efficient level-wise training of multi-level PHPs. We show in two benchmarks, NanoCraft and long-hand addition, that PHPs can learn neural programs more accurately from smaller amounts of both annotated and unannotated demonstrations.

Published On: April 30, 2018

Presented At/In: International Conferences on Learning Representations (ICLR) 2018

Link: https://openreview.net/pdf?id=rJl63fZRb

Authors: Roy Fox, Eui Chul (Richard) Shin, Sanjay Krishnan, Kenneth Goldberg, Dawn song, Ion Stoica