
Indie Machine Learning and Video Game Dev

29 January 2019

How to set up OpenAI Baselines + Retro

by Mathieu Poliquin

This is a highlight from SuperMarioBros-Nes Level 2-1, after 120M timesteps of PPO2 training:

This is a quick intro to getting started with machine learning on retro games (Atari, NES, SNES, Gameboy, Master System, Genesis). I find it's a great way to start learning about TensorFlow and machine learning in general.

Currently the easiest way is to use OpenAI's baselines and gym-retro.

As mentioned on their GitHub page, OpenAI baselines is meant to serve as a reference for high-quality implementations of various RL algorithms. For example, their implementation of PPO2 (Proximal Policy Optimization) can be applied to thousands of games, ranging from Atari Pong and Sonic The Hedgehog on the Genesis to Super Mario Bros on the NES.
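The core idea behind PPO is easy to see in isolation. Here is a minimal numpy sketch of its clipped surrogate objective (my own illustration with a made-up function name, not the baselines implementation):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, clip_range=0.2):
    """Clipped surrogate objective from the PPO paper.

    ratio = pi_new(a|s) / pi_old(a|s), the probability ratio between the
    updated and old policies; advantage estimates how much better the
    action was than average. Both are numpy arrays over a batch.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_range, 1.0 + clip_range) * advantage
    # PPO maximizes the minimum of the two terms, which removes the
    # incentive to move the new policy far away from the old one.
    return np.minimum(unclipped, clipped).mean()
```

The clipping is what makes the updates "proximal": even if the optimizer wants a large policy change, the objective stops rewarding it beyond the clip range.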

You can find an intro and installation guide at their GitHub page. It's decent, but if you want extra details on how to get started, read on :)

Step 1 - Installation on Ubuntu 18.04

I would recommend a fresh install of Ubuntu 18.04. You can also install on macOS or Windows as stated in their readme.md, but Ubuntu is definitely the smoothest way. Execute these commands in the terminal; they cover mostly everything OpenAI baselines needs that is not included in a default Ubuntu install, including Python 3 and TensorFlow.

sudo apt-get update
sudo apt-get --assume-yes install python3 python3-pip git zlib1g-dev libopenmpi-dev ffmpeg

pip3 install --timeout 1000 opencv-python cmake anyrl gym-retro joblib atari-py tensorflow

git clone https://github.com/openai/baselines.git
cd baselines
pip3 install -e .

Step 2 - If you want to use a GPU

Install the TensorFlow GPU build that matches your setup. In the terminal:

If you have CUDA 10.0:

pip3 install tensorflow-gpu

If you have CUDA 9.0, TensorFlow 1.12 is the last version that supports it:

pip3 install tensorflow-gpu==1.12

If you installed ROCm (for AMD GPUs):

pip3 install tensorflow-rocm

Step 3 - Test your setup

Make sure everything runs well by training on Atari Pong:

python3 -m baselines.run --alg=ppo2 --env=PongNoFrameskip-v4 --num_timesteps=2e7

The output should look like this:

On a GTX 1060 you should get around 1000 fps of training. If you get much less than that, TensorFlow is probably not using your GPU.
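One quick way to check, assuming a TF 1.x-era install, is to list the devices TensorFlow can see (`device_lib` is a semi-internal module, but it is the usual way to do this in TF 1.x):

```python
# Lists every device TensorFlow can use; if no '/device:GPU:0' entry
# shows up, training is running on the (much slower) CPU.
from tensorflow.python.client import device_lib

devices = [d.name for d in device_lib.list_local_devices()]
print(devices)
```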

Example: PPO2 on Super Mario Bros Nes

First you need to import the SuperMarioBros rom. Unzip the rom and run this command in the directory containing it. gym-retro looks for a specific version of the rom (the US one, I think), so if the import doesn't work, try other versions. To save time, I would recommend getting a rom pack of multiple games and importing everything at once.

python3 -m retro.import .

Then train on Level 3-1 for 10M frames with PPO2:

python3 -m baselines.run --alg=ppo2 --env=SuperMarioBros-Nes --gamestate=Level3-1.state --num_timesteps=1e7

Parameters:

--alg: the algorithm to use (ppo2 here)
--env: the gym-retro environment (game and console)
--gamestate: the saved state each episode starts from
--num_timesteps: how long to train (1e7 = 10M timesteps)

You can see the whole training from 0M to 10M here:

It's the same process for other console games, although you will likely need to add the game to a list in run.py first:

# reading benchmark names directly from retro requires
# importing retro here, and for some reason that crashes tensorflow
# in ubuntu
_game_envs['retro'] = {
    'BubbleBobble-Nes',
    'SuperMarioBros-Nes',
    'TwinBee3PokoPokoDaimaou-Nes',
    'SpaceHarrier-Nes',
    'SonicTheHedgehog-Genesis',
    'Vectorman-Genesis',
    'FinalFight-Snes',
    'SpaceInvaders-Snes',
}

Play/Load/Save Neural Networks

Save

--save_path=./PATH_TO_MODEL

Load

--load_path=./PATH_TO_MODEL

Play

Note that num_timesteps is 0; if you set it to a larger number, it will train for that many timesteps before playing.

python3 -m baselines.run --alg=ppo2 --env=PongNoFrameskip-v4 --num_timesteps=0 --play --load_path=./PATH_TO_MODEL

Record video

--save_video_interval=1 --save_video_length=NUM_TIMESTEPS

Debug Info + Metrics with Tensorboard

Before running the experiment you need to set the OPENAI_LOG_FORMAT variable:

export OPENAI_LOG_FORMAT='stdout,log,csv,tensorboard'

You can also set the log directory with:

export OPENAI_LOGDIR=[PATH_TO_LOGDIR]

You can launch tensorboard with this:

tensorboard --logdir=[PATH_TO_LOGDIR]/tb

Note: currently only some scalars are accessible in tensorboard; if you don't see graphs and other useful metrics, don't panic, it's just not integrated yet. Someone has opened an issue about it.

Most likely the most important scalar for you is the reward. Here is the reward graph for Level 3-1 after 10M frames of training on PPO2:
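Since 'csv' is in the log format above, baselines also writes a progress.csv you can inspect without tensorboard. A small sketch, assuming the column names from a typical ppo2 run (eprewmean and misc/total_timesteps; they vary by algorithm and version, which is why both are parameters):

```python
import csv

def read_reward_curve(path, reward_col='eprewmean', step_col='misc/total_timesteps'):
    """Read (timesteps, mean episode reward) pairs from a baselines progress.csv.

    Rows where the reward column is empty (no finished episode yet) are skipped.
    """
    steps, rewards = [], []
    with open(path) as f:
        for row in csv.DictReader(f):
            if row.get(reward_col) not in (None, '', 'nan'):
                steps.append(float(row[step_col]))
                rewards.append(float(row[reward_col]))
    return steps, rewards

def smooth(values, alpha=0.9):
    """Exponential moving average, like tensorboard's smoothing slider."""
    out, avg = [], None
    for v in values:
        avg = v if avg is None else alpha * avg + (1 - alpha) * v
        out.append(avg)
    return out
```

This gives you the same curve tensorboard shows, as plain lists you can plot or compare across runs.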

Integration tool

The integration tool is really handy to integrate new games that are not in the list, create new game states, mine additional information from the rom such as enemy positions, etc.

Here is how to compile it from source on Ubuntu/Linux:

# First install some required libs
sudo apt-get install capnproto libcapnp-dev libqt5opengl5-dev qtbase5-dev
# Get the latest retro source code
git clone https://github.com/openai/retro.git
cd retro
cmake . -DBUILD_UI=ON -UPYLIB_DIRECTORY
make -j$(grep -c ^processor /proc/cpuinfo)

Next launch the tool:

./gym-retro-integration

Select 'Game->Load game…' from the menu, then load the mario rom, usually located in your python site-packages, something similar to this:

/home/[YOUR USER NAME]/.local/lib/python3.6/site-packages/retro/data/stable/SuperMarioBros-Nes/rom.nes

Select ‘Game->Load state…’ from the menu, then load mario level 3-1 state:

/home/[YOUR USER NAME]/.local/lib/python3.6/site-packages/retro/data/stable/SuperMarioBros-Nes/Level3-1.state

You should see something like this:

More details: There is already a very good guide on how to use the tool: Integration tool guide

Also this detailed blog post

tags: machine learning - ppo2 - openai - baselines - retro games