Fix bug in DDPG parameter space noise adaptation (#306)
The training loop used the rollout step variable `t` rather than the training step variable `t_train` to decide when to adapt the scale of the parameter space noise.
Committed by Matthias Plappert
parent 6bdf2f55a2
commit f49a9c3d85
@@ -109,7 +109,7 @@ def train(env, nb_epochs, nb_epoch_cycles, render_eval, reward_scale, render, pa
 epoch_adaptive_distances = []
 for t_train in range(nb_train_steps):
     # Adapt param noise, if necessary.
-    if memory.nb_entries >= batch_size and t % param_noise_adaption_interval == 0:
+    if memory.nb_entries >= batch_size and t_train % param_noise_adaption_interval == 0:
         distance = agent.adapt_param_noise()
         epoch_adaptive_distances.append(distance)
 
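To illustrate why the choice of counter matters, here is a minimal sketch of the loop structure. It is not the actual training code: the function `count_adaptations`, its parameters, and the numeric values are hypothetical, the `memory.nb_entries >= batch_size` check is omitted, and it assumes that `t` only advances during rollouts and stays constant during the training steps of a cycle.

```python
def count_adaptations(nb_cycles, nb_rollout_steps, nb_train_steps,
                      interval, use_train_step):
    """Count noise adaptations per cycle in a simplified rollout/train loop.

    Hypothetical helper for illustration only; it mimics the loop shape,
    not the real agent, memory, or environment.
    """
    t = 0  # rollout step counter, advanced only while collecting experience
    per_cycle = []
    for _ in range(nb_cycles):
        # Rollout phase: t grows by one per environment step.
        for _ in range(nb_rollout_steps):
            t += 1

        # Training phase: t is frozen here, only t_train changes.
        adaptations = 0
        for t_train in range(nb_train_steps):
            step = t_train if use_train_step else t
            if step % interval == 0:
                adaptations += 1
        per_cycle.append(adaptations)
    return per_cycle


# Old condition (uses t): all-or-nothing within a cycle.
print(count_adaptations(5, 30, 100, 50, use_train_step=False))
# New condition (uses t_train): evenly spaced within every cycle.
print(count_adaptations(5, 30, 100, 50, use_train_step=True))
```

Under these assumptions, the old condition either adapts on every training step of a cycle or on none of them, depending on where `t` happened to stop. The new condition adapts once every `interval` training steps.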