Designed and developed reinforcement learning agents that successfully tackled multiple challenges, ultimately contributing to the overall victory of my team.
Organised by the University of Glasgow, the Code Olympics is a 24-hour hackathon in which a variety of companies¹ present challenges to be completed in groups of four. With over 50 groups in attendance, the event serves as a vibrant platform for collaboration and innovation in coding and technology.
This challenge involved creating an agent to play the dice game Pig with an optimal strategy. At each step, the agent can either roll again or hold and end its turn. With the problem phrased this way, I developed and implemented a Q-learning agent to play the game. The agent uses an epsilon-greedy approach to balance exploring new strategies against exploiting known, high-reward actions. After training against an opponent that played a random strategy, my agent’s Q-table gave the best learned move for every possible position.
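A minimal sketch of tabular Q-learning with epsilon-greedy action selection, as described above. It is illustrative rather than the code used at the event: the state encoding (my score, opponent score, turn total), the reward of 1.0 and the values of alpha and gamma are all assumptions.

```python
import random
from collections import defaultdict

ACTIONS = ("roll", "hold")  # the two choices available at each step of Pig


def epsilon_greedy(q, state, epsilon, rng=random):
    """Explore at random with probability epsilon, otherwise exploit the best known action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=1.0):
    """One Q-learning step: move Q(s, a) towards reward + gamma * max_a' Q(s', a')."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


q = defaultdict(float)   # unseen (state, action) pairs start at 0
state = (0, 0, 12)       # assumed encoding: (my score, opponent score, turn total)

# Hypothetical terminal transition: holding here ends the game with a reward of 1.
q_update(q, state, "hold", reward=1.0, next_state=None, done=True)
```

With epsilon set to 0 the agent always picks the action with the highest Q-value, so after the update above it would choose to hold in this state.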
I improved my agent slightly by incorporating a schedule in the style of simulated annealing, in which the exploration of sub-optimal moves is gradually reduced over time, allowing a more refined strategy to emerge.
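One common way to reduce exploration over time is to decay epsilon as training progresses. The sketch below assumes an exponential schedule; the function name and the start, end and decay values are hypothetical and were not taken from the original agent.

```python
def decayed_epsilon(episode, start=1.0, end=0.05, decay=0.999):
    """Exploration rate for a given episode: decays exponentially from start, floored at end."""
    return max(end, start * decay ** episode)


# Early episodes explore almost always; late episodes mostly exploit.
schedule = [decayed_epsilon(e) for e in (0, 1_000, 10_000)]
```

The floor keeps a small amount of exploration so the agent can still correct Q-values that were estimated poorly early in training.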
My agent was tested against other people’s implementations in a tournament-style game, eventually winning the competition².
This challenge involved creating an agent (Jake) capable of escaping a 2D maze in the shortest possible time with no prior information about its surroundings. After trialling some BFS algorithms, I developed a Q-learning solution in which, at each step, the agent can move up, down, left or right. After training, the agent can successfully navigate the maze and escape quickly.
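The sketch below shows how Q-learning can be applied to a small grid maze with the four moves described above. It is a toy illustration under stated assumptions, not the competition solution: the maze layout, the reward of -1 per step and the hyperparameters are all invented for the example.

```python
import random

# Assumed toy maze: S = start, E = exit, # = wall, . = open cell.
MAZE = [
    "S.#.",
    ".##.",
    "...E",
]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(pos, action):
    """Apply a move; walking into a wall or off the grid leaves the agent in place."""
    r, c = pos
    dr, dc = MOVES[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])) or MAZE[nr][nc] == "#":
        nr, nc = r, c
    done = MAZE[nr][nc] == "E"
    return (nr, nc), (0.0 if done else -1.0), done  # -1 per step rewards short escapes


def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Learn a Q-table by repeatedly attempting the maze from the start cell."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        pos = (0, 0)
        for _ in range(200):  # cap episode length
            if rng.random() < epsilon:
                action = rng.choice(list(MOVES))
            else:
                action = max(MOVES, key=lambda a: q.get((pos, a), 0.0))
            nxt, reward, done = step(pos, action)
            best_next = 0.0 if done else max(q.get((nxt, a), 0.0) for a in MOVES)
            old = q.get((pos, action), 0.0)
            q[(pos, action)] = old + alpha * (reward + gamma * best_next - old)
            pos = nxt
            if done:
                break
    return q


def greedy_path(q, start=(0, 0), limit=50):
    """Follow the highest-valued action from each cell until the exit or the step limit."""
    pos, path = start, [start]
    for _ in range(limit):
        action = max(MOVES, key=lambda a: q.get((pos, a), 0.0))
        pos, _, done = step(pos, action)
        path.append(pos)
        if done:
            break
    return path


path = greedy_path(train())
```

The agent only ever observes its own position and the reward it receives, so it learns the layout through trial and error rather than from a map.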
My agent was tested against other people’s implementations in a series of unseen mazes. With the quickest time to escape, my agent won the competition³.
List of awards received for this hackathon.