diff --git a/README.md b/README.md
index e382a8b..e4f8697 100644
--- a/README.md
+++ b/README.md
@@ -110,7 +110,7 @@ python -m baselines.run --alg=ppo2 --env=PongNoFrameskip-v4 --num_timesteps=0 --
 *NOTE:* At the moment Mujoco training uses VecNormalize wrapper for the environment which is not being saved correctly; so loading the models trained on Mujoco will not work well if the environment is recreated. If necessary, you can work around that by replacing RunningMeanStd by TfRunningMeanStd in [baselines/common/vec_env/vec_normalize.py](baselines/common/vec_env/vec_normalize.py#L12). This way, mean and std of environment normalizing wrapper will be saved in tensorflow variables and included in the model file; however, training is slower that way - hence not including it by default
 ## Loading and vizualizing learning curves and other training metrics
-See [here](docs/viz/viz.md) for instructions on how to load and display the training data.
+See [here](docs/viz/viz.ipynb) for instructions on how to load and display the training data.
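The *NOTE* above concerns running mean/std statistics of the normalizing wrapper not surviving a save/load cycle. As background on what such a wrapper tracks, here is a minimal, self-contained sketch of streaming mean/variance tracking (an illustration of the idea only, not the actual `RunningMeanStd` class from baselines; class and method names here are hypothetical):

```python
import numpy as np

class StreamingMeanStd:
    """Illustrative streaming mean/variance tracker, in the spirit of an
    observation-normalizing wrapper. Not the baselines implementation."""

    def __init__(self, shape=(), epsilon=1e-4):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = epsilon  # small prior count avoids division by zero

    def update(self, batch):
        batch = np.asarray(batch, dtype=np.float64)
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = batch.shape[0]
        # Combine the two moment estimates (parallel-update formulas)
        delta = batch_mean - self.mean
        total = self.count + batch_count
        new_mean = self.mean + delta * batch_count / total
        m2 = (self.var * self.count + batch_var * batch_count
              + delta ** 2 * self.count * batch_count / total)
        self.mean, self.var, self.count = new_mean, m2 / total, total

    def normalize(self, x):
        return (x - self.mean) / np.sqrt(self.var + 1e-8)

rms = StreamingMeanStd()
rms.update(np.array([1.0, 2.0, 3.0, 4.0]))
```

If statistics like these live only in a plain Python object, they are invisible to a TensorFlow-variable-based saver, which is why the workaround above moves them into tensorflow variables at the cost of slower training.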
 ## Subpackages
diff --git a/baselines/common/plot_util.py b/baselines/common/plot_util.py
index 1d105c8..8009295 100644
--- a/baselines/common/plot_util.py
+++ b/baselines/common/plot_util.py
@@ -332,7 +332,7 @@ def plot_results(
             xys = gresults[group]
             if not any(xys):
                 continue
-            color = COLORS[groups.index(group)]
+            color = COLORS[groups.index(group) % len(COLORS)]
             origxs = [xy[0] for xy in xys]
             minxlen = min(map(len, origxs))
             def allequal(qs):
diff --git a/docs/viz/viz.ipynb b/docs/viz/viz.ipynb
new file mode 100644
index 0000000..6eb0cf0
--- /dev/null
+++ b/docs/viz/viz.ipynb
@@ -0,0 +1,808 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "colab_type": "text",
+    "id": "Ynb-laSwmpac"
+   },
+   "source": [
+    "# Loading and visualizing results ([open in colab](https://colab.research.google.com/github/openai/baselines/blob/master/docs/viz.ipynb))\n",
+    "In order to compare performance of algorithms, we often would like to visualize learning curves (reward as a function of time steps), or some other auxiliary information about learning aggregated into a plot. Baselines repo provides tools for doing so in several different ways, depending on the goal."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "colab_type": "text",
+    "id": "yreoV7OClzYG"
+   },
+   "source": [
+    "## Preliminaries / TensorBoard\n",
+    "First, let us install baselines repo from github"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "colab": {},
+    "colab_type": "code",
+    "id": "r4Aul2Qujlg9"
+   },
+   "outputs": [],
+   "source": [
+    "!pip install git+https://github.com/openai/baselines > ~/pip_install_baselines.log"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "colab_type": "text",
+    "id": "1n7XAyVWniRp"
+   },
+   "source": [
+    "For all algorithms in baselines summary data is saved into a folder defined by logger. By default, a folder $TMPDIR/openai--
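As a side note on the `plot_util.py` change above: indexing the palette with `groups.index(group) % len(COLORS)` makes the color assignment cycle instead of raising `IndexError` once there are more groups than colors. A minimal illustration (the palette and group names here are hypothetical, not the actual `COLORS` list from baselines):

```python
# Hypothetical three-color palette and five groups, to show the wraparound.
COLORS = ['blue', 'green', 'red']
groups = ['g0', 'g1', 'g2', 'g3', 'g4']

# Without `% len(COLORS)`, index 3 and 4 would be out of range;
# with it, colors repeat once the palette is exhausted.
assigned = [COLORS[groups.index(g) % len(COLORS)] for g in groups]
```

The fourth and fifth groups simply reuse the first two colors, which is the intended behavior of the fix.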