Fix bug in DDPG parameter space noise adaptation (#306)

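The training loop used the rollout step counter `t` rather than the training step counter `t_train` to decide when to adapt the scale of the parameter space noise. Check `t_train` against `param_noise_adaption_interval` instead, so that the adaptation schedule follows the training steps.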
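
A minimal sketch of why the choice of counter matters, assuming `t` is not updated inside the training loop (the hunk context suggests this, but the full function is not shown here). The helper `adaptation_steps` and its `counter` argument are hypothetical and exist only for this illustration; the `memory.nb_entries >= batch_size` check is left out.

```python
def adaptation_steps(nb_train_steps, interval, counter=None):
    """Return the training-step indices at which adaptation would fire.

    counter=None  -> test the loop index t_train (behaviour after this commit).
    counter=<int> -> test a value that stays fixed during the loop, standing in
                     for an outer-loop variable such as `t` (assumed behaviour
                     before this commit).
    """
    fired = []
    for t_train in range(nb_train_steps):
        step = t_train if counter is None else counter
        if step % interval == 0:
            fired.append(t_train)
    return fired


# A fixed counter that is not a multiple of the interval never triggers adaptation.
stale_never = adaptation_steps(10, 5, counter=7)

# A fixed counter that is a multiple of the interval triggers it on every step.
stale_always = adaptation_steps(10, 5, counter=10)

# The loop index triggers adaptation once per interval.
fixed = adaptation_steps(10, 5)

print(stale_never, stale_always, fixed)
```

Under that assumption, the old condition was all-or-nothing within a block of training steps, whereas the new one fires once every `param_noise_adaption_interval` training steps.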
2018-03-01 09:00:34 -08:00
parent 6bdf2f55a2
commit f49a9c3d85
1 changed file with 1 addition and 1 deletion
--- a/baselines/ddpg/training.py
+++ b/baselines/ddpg/training.py
@@ -109,7 +109,7 @@ def train(env, nb_epochs, nb_epoch_cycles, render_eval, reward_scale, render, pa
                epoch_adaptive_distances = []
                for t_train in range(nb_train_steps):
                    # Adapt param noise, if necessary.
-                    if memory.nb_entries >= batch_size and t % param_noise_adaption_interval == 0:
+                    if memory.nb_entries >= batch_size and t_train % param_noise_adaption_interval == 0:
                        distance = agent.adapt_param_noise()
                        epoch_adaptive_distances.append(distance)