Navigating Uncertainty: A Reinforcement Learning Approach to Solving the Frozen Lake Challenge
College:
College of Business and Public Administration
Major:
Economics
Faculty Research Advisor(s):
Israel Curbelo
Abstract:
This project explores the application of Reinforcement Learning (RL) techniques to solve the Frozen Lake environment from OpenAI Gym, a popular platform for developing and evaluating RL algorithms. The Frozen Lake environment tasks an agent with navigating across a grid of frozen and hole tiles to reach a goal, without prior knowledge of the environment's dynamics. Our approach leverages the principles of Markov Decision Processes (MDPs) to model the environment, enabling the agent to learn optimal policies through interaction and feedback.
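To make the MDP modeling concrete, the sketch below builds a transition table for the standard 4x4 Frozen Lake map under the usual slippery dynamics (the intended move and the two perpendicular moves each occur with probability 1/3). This is a minimal illustrative reconstruction, not the project's actual implementation; the tuple convention `(probability, next_state, reward, done)` mirrors the one used by Gym's tabular environments.

```python
# Hypothetical sketch: the 4x4 Frozen Lake map as an explicit MDP.
# P[s][a] is a list of (probability, next_state, reward, done) tuples.

MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]   # S=start, F=frozen, H=hole, G=goal
N = 4
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3

def step_from(s, a):
    """Deterministic move from state s in direction a, clipped to the grid."""
    r, c = divmod(s, N)
    if a == LEFT:  c = max(c - 1, 0)
    if a == DOWN:  r = min(r + 1, N - 1)
    if a == RIGHT: c = min(c + 1, N - 1)
    if a == UP:    r = max(r - 1, 0)
    return r * N + c

def build_transitions():
    P = {}
    for s in range(N * N):
        P[s] = {}
        tile = MAP[s // N][s % N]
        for a in range(4):
            if tile in "GH":            # goal and hole states are absorbing
                P[s][a] = [(1.0, s, 0.0, True)]
                continue
            outcomes = []
            # Slippery ice: the intended direction and the two
            # perpendicular directions each occur with probability 1/3.
            for slip in [(a - 1) % 4, a, (a + 1) % 4]:
                ns = step_from(s, slip)
                nt = MAP[ns // N][ns % N]
                outcomes.append((1 / 3, ns, float(nt == "G"), nt in "GH"))
            P[s][a] = outcomes
    return P

P = build_transitions()
```

A table in this form exposes exactly the four MDP ingredients the abstract names: states (grid cells), actions (four moves), rewards (1 only on reaching the goal, illustrating sparsity), and stochastic transition probabilities.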
We begin by formalizing the Frozen Lake problem as an MDP, characterizing the environment's states, actions, rewards, and transition probabilities. To solve the MDP, we implement several RL algorithms, including Value Iteration, Policy Iteration, and Q-Learning, and evaluate their performance in terms of efficiency, effectiveness, and the ability to learn under uncertainty. Our implementation pays particular attention to the challenges of sparse rewards and stochastic transitions inherent to the Frozen Lake environment.
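The dynamic-programming side of the comparison can be sketched with a short Value Iteration routine. The toy 3-state MDP below is purely illustrative (it stands in for the full Frozen Lake table and is not the project's configuration), but the Bellman-backup loop and greedy policy extraction are the standard algorithm.

```python
# Minimal Value Iteration sketch on a toy 3-state MDP with a slippery
# step (illustrative only). P[s][a] lists (prob, next_state, reward, done).

GAMMA, THETA = 0.99, 1e-8

P = {
    0: {0: [(1.0, 0, 0.0, False)],
        1: [(0.8, 1, 0.0, False), (0.2, 0, 0.0, False)]},
    1: {0: [(0.8, 0, 0.0, False), (0.2, 1, 0.0, False)],
        1: [(0.8, 2, 1.0, True), (0.2, 1, 0.0, False)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},  # terminal
}

def value_iteration(P, gamma=GAMMA, theta=THETA):
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Bellman optimality backup: best one-step lookahead value.
            v = max(
                sum(p * (r + gamma * (0.0 if done else V[ns]))
                    for p, ns, r, done in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            break
    # Extract the greedy policy from the converged value function.
    pi = {s: max(P[s], key=lambda a: sum(
              p * (r + gamma * (0.0 if done else V[ns]))
              for p, ns, r, done in P[s][a])) for s in P}
    return V, pi

V, pi = value_iteration(P)
```

Policy Iteration differs only in structure: it alternates full policy evaluation with greedy improvement, which is why the two methods reach the same optimal policy at different computational cost.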
Our results demonstrate the comparative strengths and limitations of each algorithm within this context. Value Iteration and Policy Iteration, while guaranteeing convergence to an optimal policy, differ significantly in computational efficiency and practicality for real-time decision-making. Q-Learning, an off-policy temporal difference learning algorithm, exhibits robustness in learning from sparse and delayed rewards, making it particularly suited for the Frozen Lake challenge.
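The off-policy temporal-difference update at the heart of Q-Learning can be sketched on a toy sparse-reward task. The 5-state slippery corridor below is an illustrative stand-in (the states, slip probability, and hyperparameters are assumptions, not the project's settings); the epsilon-greedy exploration and greedy bootstrap target are the standard algorithm.

```python
import random

# Minimal tabular Q-Learning sketch on a toy 5-state corridor with a
# slippery step; reward is sparse (1 only at the final state).

N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON, EPISODES = 0.1, 0.99, 0.1, 5000

def step(s, a, rng):
    """Actions: 0 = left, 1 = right; with prob 0.2 the agent slips and stays."""
    if rng.random() < 0.2:
        ns = s
    else:
        ns = max(s - 1, 0) if a == 0 else min(s + 1, N_STATES - 1)
    done = ns == GOAL
    return ns, float(done), done

def q_learning(episodes=EPISODES, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection balances exploration/exploitation.
            if rng.random() < EPSILON:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            ns, r, done = step(s, a, rng)
            # Off-policy TD update toward the greedy bootstrap target.
            target = r + (0.0 if done else GAMMA * max(Q[ns]))
            Q[s][a] += ALPHA * (target - Q[s][a])
            s = ns
    return Q

Q = q_learning()
policy = [0 if q[0] > q[1] else 1 for q in Q]
```

Because the update bootstraps from `max(Q[ns])` rather than the action actually taken, the agent learns the greedy policy even while exploring, which is what makes the method robust to sparse, delayed rewards.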
The project not only showcases the adaptability and potential of RL in navigating complex, uncertain environments but also contributes to the broader understanding of how different RL strategies can be optimized and applied to specific challenges. Through extensive experimentation and analysis, this work highlights the critical aspects of algorithm selection, parameter tuning, and policy evaluation in the pursuit of autonomous decision-making in uncertain and dynamic environments.