spelling (using default vim spellchecker and ignoring things like dataframe, docstring, etc.)
@@ -156,7 +156,7 @@ def load_results(root_dir_or_dirs, enable_progress=True, enable_monitor=True, ve
enable_progress: bool - if True, will attempt to load data from progress.csv files (data saved by logger). Default: True

- enable_monitor: bool - if True, will attepmt to load data from monitor.csv files (data saved by Monitor environment wrapper). Default: True
+ enable_monitor: bool - if True, will attempt to load data from monitor.csv files (data saved by Monitor environment wrapper). Default: True

verbose: bool - if True, will print out list of directories from which the data is loaded. Default: False
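A minimal sketch of how these flags might be used, assuming the `~/logs/cartpole-ppo` directory from the examples further down; the call signature follows the hunk header above:

```python
from baselines.common import plot_util as pu

# Load every run found under the directory, reading both progress.csv (logger output)
# and monitor.csv (Monitor wrapper output); verbose=True prints each directory it loads from.
results = pu.load_results('~/logs/cartpole-ppo',
                          enable_progress=True,
                          enable_monitor=True,
                          verbose=True)
```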
@@ -251,9 +251,9 @@ def plot_results(
xy_fn: function Result -> x,y - function that converts results objects into tuple of x and y values.
By default, x is cumsum of episode lengths, and y is episode rewards

- split_fn: function Result -> hashable - function that converts results objects into keys to split curves into subpanels by.
- That is, the results r for which split_fn(r) is different will be put on different subpanels.
- By default, the portion of r.dirname between last / and -<digits> is returned. The subpanels are
+ split_fn: function Result -> hashable - function that converts results objects into keys to split curves into sub-panels by.
+ That is, the results r for which split_fn(r) is different will be put on different sub-panels.
+ By default, the portion of r.dirname between last / and -<digits> is returned. The sub-panels are
stacked vertically in the figure.

group_fn: function Result -> hashable - function that converts results objects into keys to group curves by.
@@ -265,11 +265,11 @@ def plot_results(
shaded region around corresponding to the standard deviation, and darker shaded region corresponding to
the error of mean estimate (that is, standard deviation over square root of number of samples)

- figsize: tuple or None - size of the resulting figure (including subpanels). By default, width is 6 and height is 6 times number of
- subpanels.
+ figsize: tuple or None - size of the resulting figure (including sub-panels). By default, width is 6 and height is 6 times number of
+ sub-panels.

- legend_outside: bool - if True, will place the legend outside of the subpanels.
+ legend_outside: bool - if True, will place the legend outside of the sub-panels.

resample: int - if not zero, size of the uniform grid in x direction to resample onto. Resampling is performed via symmetric
EMA smoothing (see the docstring for symmetric_ema).
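A hedged sketch of how the parameters documented in these two hunks might be combined; the lambdas and values below are illustrative only and mirror the defaults described in the docstring rather than prescribing them:

```python
import re
from baselines.common import plot_util as pu

# One sub-panel per experiment name (dirname with the trailing -<digits> stripped),
# one curve per run directory, resampled onto a 512-point uniform grid.
pu.plot_results(
    results,
    xy_fn=lambda r: (r.monitor.l.cumsum(), r.monitor.r),  # x = cumulative episode lengths, y = episode rewards
    split_fn=lambda r: re.sub(r'-\d+$', '', r.dirname.split('/')[-1]),
    group_fn=lambda r: r.dirname.split('/')[-1],
    resample=512,
    figsize=(6, 6),
    legend_outside=True,
)
```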
@@ -1,5 +1,5 @@
# Loading and visualizing results
- In order to compare performance of algorithms, we often would like to vizualise learning curves (reward as a function of timesteps), or some other auxiliary information about learining
+ In order to compare performance of algorithms, we often would like to visualize learning curves (reward as a function of time steps), or some other auxiliary information about learning
aggregated into a plot. Baselines repo provides tools for doing so in several different ways, depending on the goal.

## Preliminaries
@@ -12,7 +12,7 @@ Logging to /var/folders/mq/tgrn7bs17s1fnhlwt314b2fm0000gn/T/openai-2018-10-29-15
The location can be changed by changing `OPENAI_LOGDIR` environment variable; for instance:
```bash
export OPENAI_LOGDIR=$HOME/logs/cartpole-ppo
- python -m baselines.run --alg=ppo2 --env=CartPole-v0 --num_timesteps=30000 --nsteps=128
+ python -m baselines.run --alg=ppo2 --env=CartPole-v0 --num_time steps=30000 --nsteps=128
```
will log data to `~/logs/cartpole-ppo`.
@@ -55,7 +55,7 @@ import numpy as np
r = results[0]
plt.plot(np.cumsum(r.monitor.l), r.monitor.r)
```
- will print a (very noisy learing curve) for CartPole (assuming we ran the training command for CartPole above). Note the cumulative sum trick to get convert length of the episode into number of timsteps taken so far.
+ will print a (very noisy learning curve) for CartPole (assuming we ran the training command for CartPole above). Note the cumulative sum trick to get convert length of the episode into number of time steps taken so far.

<img src="https://storage.googleapis.com/baselines/assets/viz/Screen%20Shot%202018-10-29%20at%204.44.46%20PM.png" width="500">
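Putting the pieces of this hunk together, a self-contained version of the snippet might look like the following (the log directory is assumed from the earlier CartPole example):

```python
import matplotlib.pyplot as plt
import numpy as np
from baselines.common import plot_util as pu

results = pu.load_results('~/logs/cartpole-ppo')
r = results[0]
# cumulative sum of episode lengths (r.monitor.l) turns episode index into timesteps elapsed so far
plt.plot(np.cumsum(r.monitor.l), r.monitor.r)
plt.show()
```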
@@ -68,7 +68,7 @@ plt.plot(np.cumsum(r.monitor.l), pu.smooth(r.monitor.r, radius=10))

We can also get a similar curve by using logger summaries (instead of raw episode data in monitor.csv):
```python
- plt.plot(r.progress.total_timesteps, r.progress.eprewmean)
+ plt.plot(r.progress.total_time steps, r.progress.eprewmean)
```

<img src="https://storage.googleapis.com/baselines/assets/viz/Screen%20Shot%202018-10-29%20at%205.04.31%20PM.png" width="730">
@@ -81,14 +81,14 @@ While the loading and the plotting functions described above in principle give y
sometimes it is necessary to plot and compare many training runs (multiple algorithms, multiple seeds for random number generator),
and usage of the functions above can get tedious and messy. For that case, `baselines.common.plot_util` provides convenience function
`plot_results` that handles multiple Result objects that need to be routed in multiple plots. Consider the following bash snippet that
- runs ppo2 with cartpole with 6 different seeds for 30k timesteps, first with batch size 32, and then with batch size 128:
+ runs ppo2 with cartpole with 6 different seeds for 30k time steps, first with batch size 32, and then with batch size 128:

```bash
for seed in $(seq 0 5); do
- OPENAI_LOGDIR=$HOME/logs/cartpole-ppo/b32-$seed python -m baselines.run --alg=ppo2 --env=CartPole-v0 --num_timesteps=3e4 --seed=$seed --nsteps=32
+ OPENAI_LOGDIR=$HOME/logs/cartpole-ppo/b32-$seed python -m baselines.run --alg=ppo2 --env=CartPole-v0 --num_time steps=3e4 --seed=$seed --nsteps=32
done
for seed in $(seq 0 5); do
- OPENAI_LOGDIR=$HOME/logs/cartpole-ppo/b128-$seed python -m baselines.run --alg=ppo2 --env=CartPole-v0 --num_timesteps=3e4 --seed=$seed --nsteps=128
+ OPENAI_LOGDIR=$HOME/logs/cartpole-ppo/b128-$seed python -m baselines.run --alg=ppo2 --env=CartPole-v0 --num_time steps=3e4 --seed=$seed --nsteps=128
done
```
These 12 runs can be loaded just as before:
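A sketch of what that loading and plotting could look like, grouping the 12 runs by batch size; `average_group` is assumed from the library (it is not shown in the excerpt above) and the lambda is illustrative:

```python
from baselines.common import plot_util as pu

results = pu.load_results('~/logs/cartpole-ppo')  # picks up all b32-* and b128-* runs

# One averaged curve per batch size: dirnames look like .../cartpole-ppo/b32-3,
# so strip the trailing seed suffix to get the group key ('b32' or 'b128').
pu.plot_results(results,
                average_group=True,
                group_fn=lambda r: r.dirname.split('/')[-1].rsplit('-', 1)[0])
```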
@@ -109,7 +109,7 @@ Showing all seeds on the same plot may be somewhat hard to comprehend and analys
<img src="https://storage.googleapis.com/baselines/assets/viz/Screen%20Shot%202018-11-02%20at%204.42.52%20PM.png" width="720">

The lighter shade shows the standard deviation of data, and darker shade -
- error in estimate of the mean (that is, standard deviation divided by sqare root of number of seeds)
+ error in estimate of the mean (that is, standard deviation divided by square root of number of seeds)
Note that averaging over seeds requires resampling to a common grid, which, in turn, requires smoothing
(using language of signal processing, we need to do low-pass filtering before resampling to avoid aliasing effects).
You can change the amount of smoothing by adjusting `resample` and `smooth_step` arguments to achieve desired smoothing effect
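For instance, a hedged illustration of adjusting those two arguments; the values are arbitrary, and the direction of the effect (larger `smooth_step` giving heavier smoothing) is assumed rather than taken from the excerpt above:

```python
# resample: number of points in the common x grid; smooth_step: EMA smoothing step
# (assumed: larger values smooth more heavily).
pu.plot_results(results, average_group=True, resample=256, smooth_step=2.0)
```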