Boundary Extension Features for Width-Based Planning with Simulators on Continuous-State Domains


Width-based planning algorithms have been shown to be competitive with state-of-the-art heuristic search and SAT-based approaches while requiring only access to a black-box simulator, rather than a model of action effects and preconditions. The search of width-based planners is guided by a measure of the novelty of states, which requires observations of simulator states to be given as a set of features. This paper proposes agnostic feature mapping mechanisms that define the features online, as exploration progresses and the domain of the continuous state variables is revealed. We demonstrate the effectiveness of these features on the OpenAI Gym “classical control” suite of benchmarks. We compare our online planners with state-of-the-art deep reinforcement learning algorithms, and show that width-based planners using our features can find policies of the same quality with significantly fewer computational resources.
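To make the idea of novelty over online-defined features concrete, the following is a minimal illustrative sketch and not the feature mapping proposed in the paper. It assumes a hypothetical fixed bin width, anchors the bins at the first observed state so that earlier features stay stable as the observed range of each variable grows, and applies a width-1 novelty test (a state is novel if it produces at least one feature not seen before). The names `OnlineDiscretizer`, `NoveltyTable`, and `bin_width` are assumptions introduced for this example.

```python
import math


class OnlineDiscretizer:
    """Maps continuous states to (variable, bin) features without known bounds.

    Hypothetical sketch: bins have a fixed width and are anchored at the first
    observed state, so the observed range can extend without relabelling bins.
    """

    def __init__(self, bin_width):
        self.bin_width = bin_width
        self.origin = None  # first observed state, used as the bin anchor
        self.low = None     # lowest value observed so far, per variable
        self.high = None    # highest value observed so far, per variable

    def features(self, state):
        if self.origin is None:
            self.origin = list(state)
            self.low = list(state)
            self.high = list(state)
        atoms = set()
        for i, x in enumerate(state):
            # Extend the known boundaries of variable i as exploration proceeds.
            self.low[i] = min(self.low[i], x)
            self.high[i] = max(self.high[i], x)
            # Bin index relative to the anchor; may be negative.
            atoms.add((i, math.floor((x - self.origin[i]) / self.bin_width)))
        return atoms


class NoveltyTable:
    """Width-1 novelty: a state is novel if it adds at least one unseen feature."""

    def __init__(self):
        self.seen = set()

    def is_novel(self, atoms):
        new_atoms = atoms - self.seen
        self.seen |= atoms
        return bool(new_atoms)


if __name__ == "__main__":
    disc = OnlineDiscretizer(bin_width=0.5)
    table = NoveltyTable()
    for s in ([0.0, 0.0], [0.1, 0.2], [0.1, 0.7], [-0.3, 0.0]):
        print(s, table.is_novel(disc.features(s)))
```

In a search loop, a planner of this kind would prune generated states for which `is_novel` returns `False`; the choice of bin width trades off how aggressively the search prunes against how finely it distinguishes states.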

International Joint Conference on Artificial Intelligence (IJCAI)