In order to compare performance of algorithms, we often would like to vizualise learning curves (reward as a function of timesteps), or some other auxiliary information about learining
aggregated into a plot. Baselines repo provides tools for doing so in several different ways, depending on the goal.
## Preliminaries
For all algorithms in baselines directory in which summary data is saved is defined by logger. By default, a folder `$TMPDIR/openai-<date>-<time>` is used;
you can see the location of logger directory at the beginning of the training in the message like this:
```
Logging to /var/folders/mq/tgrn7bs17s1fnhlwt314b2fm0000gn/T/openai-2018-10-29-15-03-13-537078
```
The location can be changed by changing `OPENAI_LOGDIR` environment variable; for instance:
One of the most straightforward ways to visualize data is to use [TensorBoard](https://www.tensorflow.org/guide/summaries_and_tensorboard). Baselines logger can dump data in tensorboard-compatible format; to
set that up, set environment variables `OPENAI_LOG_FORMAT`
```bash
export OPENAI_LOG_FORMAT='stdout,log,csv,tensorboard' # formats are comma-separated, but for tensorboard you only really need the last one
If the summary overview provided by tensorboard is not sufficient, and you would like to either access to raw environment episode data, or use complex post-processing notavailable in tensorboard, you can load results into python as [pandas](https://pandas.pydata.org/) dataframes.
- monitor: pandas.DataFrame - raw episode data (length, episode reward, timestamp). Available if environment wrapped with [Monitor](../../baselines/bench/monitor.py) wrapper
Once results are loaded, they can be plotted in all conventional means. For example:
```python
import matplotlib.pyplot as plt
import numpy as np
r = results[0]
plt.plot(np.cumsum(r.monitor.l), r.monitor.r)
```
will print a (very noisy learing curve) for CartPole (assuming we ran the training command for CartPole above). Note the cumulative sum trick to get convert length of the episode into number of timsteps taken so far.
But how do we plot all 12 of them in a sensible manner? `baselines.common.plot_util` module provides `plot_results` function to do just that:
```
results = results[1:]
pu.plot_results(results)
```
(note that now the length of the results list is 13, due to the data from the previous run stored directly in `~/logs/cartpole-ppo`; we discard first element for the same reason)
The results are split into two groups based on batch size and are plotted on a separate graph. More specifically, by default `plot_results` considers digits after dash at the end of the directory name to be seed id and groups the runs that differ only by those together.