Complete reinforcement learning pipeline with Q-learning agent and real-time visualization

Timeline
2024
Category
Reinforcement Learning
Platform
Desktop
Overview
This project implements a complete reinforcement learning pipeline combining algorithm development, environment simulation, and interactive visualization. The agent employs Q-learning with an epsilon-greedy exploration strategy to learn optimal navigation policies in a dynamic grid world.
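The epsilon-greedy strategy described above balances exploration and exploitation: with probability epsilon the agent tries a random action, otherwise it takes the action with the highest learned Q-value. A minimal sketch of that selection step, assuming a four-action grid world (the function and parameter names here are illustrative, not the project's actual API):

```cpp
#include <array>
#include <random>

// Epsilon-greedy action selection over a 4-action grid world.
// With probability epsilon the agent explores (uniform random action);
// otherwise it exploits the action with the highest Q-value.
int selectAction(const std::array<double, 4>& qValues, double epsilon,
                 std::mt19937& rng) {
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    if (coin(rng) < epsilon) {
        std::uniform_int_distribution<int> pick(0, 3);
        return pick(rng);  // explore: random action
    }
    int best = 0;          // exploit: argmax over Q-values
    for (int a = 1; a < 4; ++a)
        if (qValues[a] > qValues[best]) best = a;
    return best;
}
```

Decaying epsilon over episodes is a common refinement, shifting the agent from exploration early in training toward exploitation once the Q-table stabilizes.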
Key Features
- Q-Learning Algorithm: Implements an epsilon-greedy exploration strategy for optimal policy learning
- Reward Shaping: Proximity-based bonuses and out-of-bounds penalties guide learning
- Real-Time Visualization: SFML-based UI displays grid layout, agent trajectory, and reward heatmaps
- Training Analytics: Live monitoring of Q-value distributions and episode metrics
- Interactive Controls: Pause/resume training, manual agent control, and adjustable training speed
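The reward-shaping feature combines a proximity bonus with an out-of-bounds penalty. One way to sketch that idea, with illustrative constants rather than the project's actual values:

```cpp
#include <cmath>

// Shaped reward sketch: a fixed penalty for leaving the grid, a terminal
// reward at the goal, and a proximity bonus that grows as the agent nears
// the goal. Constants and names here are illustrative assumptions.
double shapedReward(int x, int y, int goalX, int goalY, int gridSize) {
    if (x < 0 || y < 0 || x >= gridSize || y >= gridSize)
        return -10.0;                       // out-of-bounds penalty
    if (x == goalX && y == goalY)
        return 10.0;                        // terminal goal reward
    double dist = std::abs(x - goalX) + std::abs(y - goalY);  // Manhattan distance
    return -0.1 + 1.0 / (1.0 + dist);      // small step cost + proximity bonus
}
```

The small per-step cost discourages wandering, while the proximity term gives the agent a denser learning signal than a sparse goal-only reward.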
Technical Implementation
Built with C++ and SFML for high-performance rendering and real-time interaction. The Q-learning implementation uses temporal difference learning with customizable hyperparameters, and the visualization system shows how the agent's policy evolves across episodes.
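The temporal-difference update at the heart of tabular Q-learning is Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). A minimal sketch of that update, again assuming four actions; the learning rate alpha and discount factor gamma are the customizable hyperparameters mentioned above:

```cpp
#include <array>

// Tabular Q-learning (temporal-difference) update for one transition.
// qState holds Q-values for the current state s, qNext for the next state s'.
// Function and parameter names are illustrative, not the project's actual API.
void qUpdate(std::array<double, 4>& qState,
             const std::array<double, 4>& qNext,
             int action, double reward,
             double alpha, double gamma) {
    double maxNext = qNext[0];              // max_a' Q(s', a')
    for (int a = 1; a < 4; ++a)
        if (qNext[a] > maxNext) maxNext = qNext[a];
    double tdTarget = reward + gamma * maxNext;
    qState[action] += alpha * (tdTarget - qState[action]);
}
```

Because the target uses the max over next-state actions regardless of which action the agent actually takes next, this is an off-policy update, which is what lets Q-learning learn the greedy policy while exploring with epsilon-greedy.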
Learning Outcomes
This integrated system demonstrates practical applications of Q-learning in a visually engaging environment, making reinforcement learning concepts tangible through interactive exploration and real-time feedback.