A team of machine learning researchers from Oxford University showcased their AI research at the AI Summit on the Microsoft stand. They brought their work on deep reinforcement learning and their use of Microsoft Azure to a fun and engaging demonstration that brings human players and AI together to collaborate in fighting against StarCraft bots.
StarCraft is a military science fiction media franchise created by Chris Metzen and James Phinney, and owned by Blizzard Entertainment. The series, set in the beginning of the 26th century, centers on a galactic struggle for dominance between four species—the adaptable and mobile Terrans, the ever-evolving insectoid Zerg, the powerfully enigmatic Protoss, and the “god-like” Xel’Naga creator race—in a distant part of the Milky Way galaxy known as the Koprulu Sector. The series debuted with the video game StarCraft in 1998. Since then it has grown to include a number of other games as well as eight novelizations, two Amazing Stories articles, a board game, and other licensed merchandise such as collectible statues and toys.
This work is being carried out by the Whiteson Research Lab (http://whirl.cs.ox.ac.uk/) in collaboration with PhD students from the Engineering department at Oxford. They are using StarCraft as a platform for developing and testing novel methods in deep multi-agent reinforcement learning. The effort is based on the TorchCraft library, an open-source platform that allows Torch code to interact with the StarCraft game Brood War. The Oxford team plans to release their codebase to the public after the publication of their next paper, which has been submitted to NIPS. Unlike work undertaken by other research institutes using StarCraft, their effort is focused on decentralised execution, meaning that each unit has to make independent decisions during the game based on local and incomplete observations. The project was initially developed on onsite servers; however, the shift to Azure was straightforward and has allowed Oxford to massively expand the number of experiments undertaken and the scope of the research as a whole.
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
“Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A key stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep RL relies. This paper proposes two methods that address this problem: 1) conditioning each agent’s value function on a fingerprint that disambiguates the age of the data sampled from the replay memory and 2) using a multi-agent variant of importance sampling to naturally decay obsolete data. Results on a challenging decentralised variant of StarCraft unit micromanagement confirm that these methods enable the successful combination of experience replay with multi-agent RL.”
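The fingerprint idea from the abstract can be illustrated with a short sketch. This is not the authors' code: the class and parameter names below (`FingerprintReplayBuffer`, the choice of training iteration and exploration rate epsilon as the fingerprint) are illustrative assumptions. The point is simply that each stored observation is tagged with quantities describing when it was collected, so a value function conditioned on them can distinguish fresh data from stale data in the replay memory.

```python
import random
from collections import deque

class FingerprintReplayBuffer:
    """Replay buffer that augments observations with a fingerprint.

    Hypothetical sketch: the fingerprint here is (training iteration,
    epsilon), appended to each observation vector before storage.
    """

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, iteration, epsilon):
        # Tag both observations with the training state at collection time.
        fp = [float(iteration), float(epsilon)]
        self.buffer.append((obs + fp, action, reward, next_obs + fp))

    def sample(self, batch_size):
        # Uniform sampling; the fingerprint travels with each transition.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

# Usage: store transitions collected under a decaying exploration schedule.
buf = FingerprintReplayBuffer()
for it in range(5):
    eps = max(0.05, 1.0 - 0.2 * it)  # assumed epsilon schedule
    buf.add(obs=[0.1, 0.2], action=1, reward=0.5,
            next_obs=[0.3, 0.4], iteration=it, epsilon=eps)

batch = buf.sample(3)
```

Because old transitions carry an old fingerprint, a Q-network taking the augmented observation as input can implicitly account for the nonstationarity that the other agents' changing policies introduce.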
A summary of their research can be found here, and undergraduate machine learning modules at http://www.cs.ox.ac.uk/teaching/courses/2016-2017/ml/
- See the paper: https://arxiv.org/abs/1702.08887