From b1644157d60a384c544cf3d2e80bf24dc356faf4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dar=C3=ADo=20Here=C3=B1=C3=BA?=
Date: Mon, 1 Apr 2019 19:41:52 -0300
Subject: [PATCH] Fixed typo on #092 (#824)

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 23487b6..9b6500b 100644
--- a/README.md
+++ b/README.md
@@ -89,7 +89,7 @@ python -m baselines.run --alg=ppo2 --env=Humanoid-v2 --network=mlp --num_timeste
 will set entropy coefficient to 0.1, and construct fully connected network with 3 layers with 32 hidden units in each, and create a separate network for value function estimation (so that its parameters are not shared with the policy network, but the structure is the same)
 
 See docstrings in [common/models.py](baselines/common/models.py) for description of network parameters for each type of model, and
-docstring for [baselines/ppo2/ppo2.py/learn()](baselines/ppo2/ppo2.py#L152) for the description of the ppo2 hyperparamters.
+docstring for [baselines/ppo2/ppo2.py/learn()](baselines/ppo2/ppo2.py#L152) for the description of the ppo2 hyperparameters.
 
 ### Example 2. DQN on Atari
 DQN with Atari is at this point a classics of benchmarks. To run the baselines implementation of DQN on Atari Pong:
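
Note: the hunk header above truncates the README's example command at `--num_timeste`. A plausible reconstruction of the full invocation, inferred from the prose inside the hunk (the flag values past the truncation point are assumptions, not taken from this patch), is sketched below:

```bash
# Sketch of the truncated ppo2 command, reconstructed from the hunk's
# description: entropy coefficient 0.1, a 3-layer MLP with 32 hidden
# units, and a separate (non-shared) value network. The --num_timesteps
# value and exact flag spellings past the truncation are assumptions.
python -m baselines.run --alg=ppo2 --env=Humanoid-v2 --network=mlp \
    --num_timesteps=2e7 --ent_coef=0.1 --num_hidden=32 --num_layers=3 \
    --value_network=copy
```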