Navigating Uncertainty: A Reinforcement Learning Approach to Solving the Frozen Lake Challenge


Qiutong Liu

CoPIs:
Sihan Fu, Weixun Xie, Junyang Li, Zhirui Chen

College:
College of Business and Public Administration

Major:
Economics

Faculty Research Advisor(s):
Israel Curbelo

Abstract:
This project explores the application of Reinforcement Learning (RL) techniques to the Frozen Lake environment from OpenAI Gym, a popular platform for developing and evaluating RL algorithms. Frozen Lake tasks an agent with navigating a grid of ice and water tiles to reach a goal, without prior knowledge of the environment's dynamics. Our approach leverages the principles of Markov Decision Processes (MDPs) to model the environment, enabling the agent to learn optimal policies through interaction and feedback.
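The stochastic dynamics mentioned above can be sketched as an explicit transition model. The following is a minimal illustration, not the poster's actual code: it assumes the standard 4x4 map ("SFFF", "FHFH", "FFFH", "HFFG") and Gym's default "slippery" behavior, in which the intended move and each of the two perpendicular moves occur with probability 1/3.

```python
# Minimal sketch of FrozenLake's "slippery" MDP transitions (an
# illustrative assumption, not the poster's implementation).
LAKE = ["SFFF", "FHFH", "FFFH", "HFFG"]  # S=start, F=frozen, H=hole, G=goal
N = 4
DELTA = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right
PERP = {0: (2, 3), 1: (2, 3), 2: (0, 1), 3: (0, 1)}      # slip directions

def transitions(s, a):
    """Return P(s', r | s, a) as a list of (prob, next_state, reward, done)."""
    r, c = divmod(s, N)
    if LAKE[r][c] in "HG":                    # holes and goal are absorbing
        return [(1.0, s, 0.0, True)]
    out = []
    for move in (a, *PERP[a]):                # intended move plus two slips
        dr, dc = DELTA[move]
        nr, nc = min(max(r + dr, 0), N - 1), min(max(c + dc, 0), N - 1)
        ns, tile = nr * N + nc, LAKE[nr][nc]
        out.append((1.0 / 3.0, ns, 1.0 if tile == "G" else 0.0, tile in "HG"))
    return out
```

This tabular form of the dynamics is exactly what the planning algorithms below consume: each state-action pair maps to a small distribution over successor states.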

We begin by formalizing the Frozen Lake problem as an MDP, characterizing the environment's states, actions, rewards, and transition probabilities. To solve the MDP, we implement several RL algorithms, including Value Iteration, Policy Iteration, and Q-Learning, and evaluate their performance in terms of efficiency, effectiveness, and the ability to learn under uncertainty. Our implementation pays particular attention to the challenges of sparse rewards and stochastic transitions inherent to the Frozen Lake environment.
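Of the algorithms listed, Value Iteration is the most compact to illustrate. The sketch below is not the poster's code: it assumes the standard 4x4 map and, for clarity, deterministic ("non-slippery") moves; the discount factor and tolerance are illustrative choices.

```python
import numpy as np

# Illustrative Value Iteration on a deterministic 4x4 FrozenLake
# (a simplifying assumption; the poster's environment is stochastic).
LAKE = ["SFFF", "FHFH", "FFFH", "HFFG"]
N = 4
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(s, a):
    """Deterministic transition: returns (next_state, reward, done)."""
    r, c = divmod(s, N)
    if LAKE[r][c] in "HG":                  # holes and goal are absorbing
        return s, 0.0, True
    dr, dc = MOVES[a]
    nr, nc = min(max(r + dr, 0), N - 1), min(max(c + dc, 0), N - 1)
    ns, tile = nr * N + nc, LAKE[nr][nc]
    return ns, (1.0 if tile == "G" else 0.0), tile in "HG"

def value_iteration(gamma=0.99, tol=1e-8):
    """Sweep Bellman optimality backups until the value function converges."""
    V = np.zeros(N * N)
    while True:
        Q = np.zeros((N * N, 4))
        for s in range(N * N):
            for a in range(4):
                ns, rew, done = step(s, a)
                Q[s, a] = rew + (0.0 if done else gamma * V[ns])
        new_V = Q.max(axis=1)
        if np.max(np.abs(new_V - V)) < tol:
            return new_V, Q.argmax(axis=1)   # optimal values and greedy policy
        V = new_V
```

Policy Iteration differs only in alternating a full policy-evaluation solve with a greedy policy-improvement step, which is why the two methods reach the same optimal policy at different computational cost.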

Our results demonstrate the comparative strengths and limitations of each algorithm within this context. Value Iteration and Policy Iteration, while guaranteeing convergence to an optimal policy, differ significantly in computational efficiency and practicality for real-time decision-making. Q-Learning, an off-policy temporal difference learning algorithm, exhibits robustness in learning from sparse and delayed rewards, making it particularly suited for the Frozen Lake challenge.

The project not only showcases the adaptability and potential of RL in navigating complex, uncertain environments but also contributes to the broader understanding of how different RL strategies can be optimized and applied to specific challenges. Through extensive experimentation and analysis, this work highlights the critical aspects of algorithm selection, parameter tuning, and policy evaluation in the pursuit of autonomous decision-making in uncertain and dynamic environments.

