Robotics: Science and Systems XVI

Kernel Taylor-Based Value Function Approximation for Continuous-State Markov Decision Processes

Junhong Xu, Kai Yin, Lantao Liu


We propose a principled kernel-based policy iteration algorithm to solve the continuous-state Markov Decision Processes (MDPs). In contrast to most decision-theoretic planning frameworks, which assume fully known state transition models, we design a method that eliminates such a strong assumption which is oftentimes extremely difficult to engineer in reality. To achieve this, we first apply the second-order Taylor expansion of the kernelized value function. The Bellman equation is then approximated by a partial differential equation, which only relies on the first and second moments of the transition model. By combining the kernel representation of value function, we then design an efficient policy iteration algorithm whose policy evaluation step can be represented as a linear system of equations evaluated at a finite set of supporting states. We have validated the proposed method through extensive simulations on both simplified and realistic planning scenarios, and the experiments show that our proposed approach leads to a much superior performance over several baseline methods.



    AUTHOR    = {Junhong Xu AND Kai Yin AND Lantao Liu}, 
    TITLE     = {{Kernel Taylor-Based Value Function Approximation for Continuous-State Markov Decision Processes}}, 
    BOOKTITLE = {Proceedings of Robotics: Science and Systems}, 
    YEAR      = {2020}, 
    ADDRESS   = {Corvalis, Oregon, USA}, 
    MONTH     = {July}, 
    DOI       = {10.15607/RSS.2020.XVI.050}