updated docstring for deepq
@@ -124,16 +124,12 @@ def learn(env,
     -------
     env: gym.Env
         environment to train on
-    q_func: (tf.Variable, int, str, bool) -> tf.Variable
-        the model that takes the following inputs:
-            observation_in: object
-                the output of observation placeholder
-            num_actions: int
-                number of actions
-            scope: str
-            reuse: bool
-                should be passed to outer variable scope
-        and returns a tensor of shape (batch_size, num_actions) with values of every action.
+    network: string or a function
+        neural network to use as a q function approximator. If string, has to be one of the names of registered models in baselines.common.models
+        (mlp, cnn, conv_only). If a function, should take an observation tensor and return a latent variable tensor, which
+        will be mapped to the Q function heads (see build_q_func in baselines.deepq.models for details on that)
+    seed: int or None
+        prng seed. The runs with the same seed "should" give the same results. If None, no seeding is used.
     lr: float
         learning rate for adam optimizer
     total_timesteps: int
|
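The new `network` parameter accepts either a registered model name or a custom function. A minimal sketch of that string-or-callable dispatch is below; the names `REGISTERED` and `get_network` are hypothetical stand-ins, not baselines' actual implementation (which lives in baselines.common.models and build_q_func).

```python
# Sketch of string-or-callable dispatch for a `network` argument.
# REGISTERED and get_network are hypothetical; real model builders in
# baselines return network-building functions, not these toy lambdas.
REGISTERED = {
    'mlp': lambda obs: [sum(obs)],       # stand-in for an MLP builder
    'cnn': lambda obs: [max(obs)],       # stand-in for a CNN builder
    'conv_only': lambda obs: list(obs),  # stand-in for conv_only
}

def get_network(network):
    """Resolve `network`: a registered model name or a custom function."""
    if isinstance(network, str):
        if network not in REGISTERED:
            raise ValueError('unknown model name: %s' % network)
        return REGISTERED[network]
    if callable(network):
        # A custom function maps an observation to a latent representation.
        return network
    raise TypeError('network must be a string or a function')
```

Either form resolves to a callable, so `get_network('mlp')` and `get_network(lambda obs: obs[:1])` are used identically afterward.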