Robotics: Science and Systems X

Tell Me Dave: Context-Sensitive Grounding of Natural Language to Manipulation Instructions

Dipendra Kumar Misra, Jaeyong Sung, Kevin Lee, Ashutosh Saxena


We consider performing a sequence of mobile manipulation tasks with instructions given in natural language (NL). Given a new environment, even a simple task such as boiling water would be performed quite differently depending on the presence, location and state of the objects. We start by collecting a dataset of task descriptions in free-form natural language and the corresponding grounded task-logs of the tasks performed in an online robot simulator. We then build a library of verb-environment-instructions that represents the possible instructions for each verb in that environment; these may or may not be valid for a different environment and task context. We present a model that takes into account the variations in natural language and the ambiguities in grounding them to robotic instructions with appropriate environment context and task constraints. Our model also handles incomplete or noisy NL instructions. Our model is based on an energy function that encodes such properties in a form isomorphic to a conditional random field. In evaluation, we show that our model produces sequences that perform the task successfully in a simulator and also significantly outperforms the state-of-the-art. We also verify this by executing our output instruction sequences on a PR2 robot.



    AUTHOR    = {Dipendra Kumar Misra AND Jaeyong Sung AND Kevin Lee AND Ashutosh Saxena}, 
    TITLE     = {Tell Me Dave: Context-Sensitive Grounding of Natural Language to Manipulation Instructions}, 
    BOOKTITLE = {Proceedings of Robotics: Science and Systems}, 
    YEAR      = {2014}, 
    ADDRESS   = {Berkeley, USA}, 
    MONTH     = {July},
    DOI       = {10.15607/RSS.2014.X.005}