Integrating Natural Language in Sequential Decision Problems

Abstract

Natural language (NL) offers an intuitive interface for humans to specify tasks for AI systems, enabling lay users to program agents without formal coding. However, NL’s lack of precision introduces critical challenges for sequential decision-making (SDM) systems: noise can steer agents toward undesired behaviors, while ambiguity leads to inconsistent interpretations and invalid decisions. This thesis investigates how NL can be reliably integrated into SDM systems across two dominant paradigms: model-free reinforcement learning (RL), where agents learn policies through trial and error, and model-based automated planning, where agents reason over explicit representations of the problem to generate solutions. The research exposes critical limitations in existing approaches, proposes novel solutions including the BiMI reward function for addressing false positive rewards, and re-evaluates assumptions about large language model (LLM)-based SDM systems to clarify their capabilities and constraints.

Publication
The University of Melbourne
