Fix bug in DDPG parameter space noise adaptation (#306)

The training loop used the rollout step variable `t` rather than the
training step variable `t_train` to decide when to adapt the scale of
the parameter space noise.

Author: Daniel Ziegler
Date: 2018-03-01 09:00:34 -08:00
Committed by: Matthias Plappert
Parent: 6bdf2f55a2
Commit: f49a9c3d85

@@ -109,7 +109,7 @@ def train(env, nb_epochs, nb_epoch_cycles, render_eval, reward_scale, render, pa
 epoch_adaptive_distances = []
 for t_train in range(nb_train_steps):
     # Adapt param noise, if necessary.
-    if memory.nb_entries >= batch_size and t % param_noise_adaption_interval == 0:
+    if memory.nb_entries >= batch_size and t_train % param_noise_adaption_interval == 0:
         distance = agent.adapt_param_noise()
         epoch_adaptive_distances.append(distance)
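
To illustrate why the choice of variable matters, here is a minimal sketch rather than the project's actual training loop. It assumes that `t` stays constant while the inner training loop runs, and that the replay memory already holds at least `batch_size` entries. The helper `count_adaptations` and the numeric values are hypothetical; a counter stands in for `agent.adapt_param_noise()`.

```python
def count_adaptations(t, nb_train_steps, interval, use_train_step):
    """Count how often the adaptation branch fires during one training phase.

    `t` is the (assumed constant) rollout step counter at the time training
    starts. The memory-size check is assumed to pass, and the call to
    agent.adapt_param_noise() is replaced by a simple counter.
    """
    adaptations = 0
    for t_train in range(nb_train_steps):
        # Old condition tests the stale `t`; new condition tests `t_train`.
        step = t_train if use_train_step else t
        if step % interval == 0:
            adaptations += 1
    return adaptations


# Illustrative values only: 50 training steps, adaptation interval of 50.
# Old condition: the outcome depends entirely on the stale value of `t`.
never = count_adaptations(t=99, nb_train_steps=50, interval=50, use_train_step=False)
always = count_adaptations(t=100, nb_train_steps=50, interval=50, use_train_step=False)

# New condition: the outcome no longer depends on `t`.
fixed_a = count_adaptations(t=99, nb_train_steps=50, interval=50, use_train_step=True)
fixed_b = count_adaptations(t=100, nb_train_steps=50, interval=50, use_train_step=True)

print(never, always, fixed_a, fixed_b)
```

Under these assumptions the old condition adapts either on every training step or on none of them, depending on where the rollout counter happened to stop, while the new condition adapts at the configured interval within the training loop.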