updated docstring for deepq

This commit is contained in:
Peter Zhokhov
2018-10-19 17:50:54 -07:00
parent d0cc325e14
commit bd390c2ade


@@ -124,16 +124,12 @@ def learn(env,
     -------
     env: gym.Env
         environment to train on
-    q_func: (tf.Variable, int, str, bool) -> tf.Variable
-        the model that takes the following inputs:
-            observation_in: object
-                the output of observation placeholder
-            num_actions: int
-                number of actions
-            scope: str
-            reuse: bool
-                should be passed to outer variable scope
-        and returns a tensor of shape (batch_size, num_actions) with values of every action.
+    network: string or a function
+        neural network to use as a q function approximator. If string, has to be one of the names of registered models in baselines.common.models
+        (mlp, cnn, conv_only). If a function, should take an observation tensor and return a latent variable tensor, which
+        will be mapped to the Q function heads (see build_q_func in baselines.deepq.models for details on that)
+    seed: int or None
+        prng seed. The runs with the same seed "should" give the same results. If None, no seeding is used.
     lr: float
         learning rate for adam optimizer
     total_timesteps: int
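
The string-or-callable dispatch the new `network` docstring describes can be sketched as below. This is a hedged, self-contained illustration of the idea behind the model registry in `baselines.common.models`; the registry dict and the stand-in builder bodies here are hypothetical, not the library's actual implementation.

```python
# Sketch of resolving `network` (string or function) to a latent-builder,
# mirroring the behavior described in the docstring. The registry contents
# and builder bodies are illustrative stand-ins, not baselines internals.

def mlp(obs):
    # stand-in: a real builder would return a latent tensor for the observation
    return ("mlp_latent", obs)

def cnn(obs):
    return ("cnn_latent", obs)

# hypothetical registry of named models (the real one includes mlp, cnn, conv_only)
REGISTERED_MODELS = {"mlp": mlp, "cnn": cnn}

def get_network_builder(network):
    """Resolve `network` to a function mapping an observation to a latent."""
    if callable(network):
        # a user-supplied function is used directly
        return network
    if network in REGISTERED_MODELS:
        return REGISTERED_MODELS[network]
    raise ValueError("unknown network type: {}".format(network))

# string form: look up a registered model by name
builder = get_network_builder("mlp")
print(builder("obs")[0])  # mlp_latent

# function form: pass a custom latent-builder directly
custom = lambda obs: ("custom_latent", obs)
print(get_network_builder(custom)("obs")[0])  # custom_latent
```

In the real `learn`, the resolved latent-builder's output is then mapped to the Q-function heads (see `build_q_func` in `baselines.deepq.models`).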