This work explores the use of Reinforcement Learning as a tool for designing policies for systems under uncertainty. We investigate the efficacy of Reinforcement Learning for policy design, and how the Dynamic Adaptive Policy Pathways (DAPP) framework can improve the quality of Reinforcement Learning-derived policies in uncertain systems. The Victorian electricity market is used as a case study, in which policies are designed to support the transition to an environmentally sustainable future. We propose a novel integration of the DAPP framework into Reinforcement Learning algorithms to bolster the efficacy and robustness of Reinforcement Learning-derived policies. Experimentation is also performed to better understand the Multi-Objective Evolutionary Algorithms (MOEAs) that the DAPP framework uses to computationally design its policies. Our discussion of MOEAs evaluates the strengths they provide the DAPP framework for developing robust policies, and their implications for our proposed DAPP-Reinforcement Learning method. A comparative analysis is conducted between policies designed using only Reinforcement Learning techniques, policies designed using our DAPP-Reinforcement Learning method, and various baseline policies. Our results show that policies designed by the DAPP-Reinforcement Learning method increase Victoria's renewable electricity utilisation by 23% on average and decrease household greenhouse gas emissions by 28%, compared to policies derived via Reinforcement Learning alone. Through critical analysis of these results, this work demonstrates how the strengths of the DAPP framework can be combined with Reinforcement Learning to develop more robust policies for systems under uncertainty.