* exported rl-algs
* more stuff from rl-algs
* run slow tests
* re-exported rl_algs
* re-exported rl_algs - fixed problems with serialization test and test_cartpole
* replaced atari_arg_parser with common_arg_parser
* run.py can run algos from both baselines and rl_algs
* added approximate humanoid reward with ppo2 into the README for reference
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* very dummy commit to RUN BENCHMARKS
* serialize variables as a dict, not as a list
* running_mean_std uses tensorflow variables
* fixed import in vec_normalize
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* flake8 complaints
* save all variables to make sure we save the vec_normalize normalization
* benchmarks on ppo2 only RUN BENCHMARKS
* make_atari_env compatible with mpi
* run ppo_mpi benchmarks only RUN BENCHMARKS
* hardcode names of retro environments
* add defaults
* changed default ppo2 lr schedule to linear RUN BENCHMARKS
* non-tf normalization benchmark RUN BENCHMARKS
* use ncpu=1 for mujoco sessions - gives a bit of a performance speedup
* reverted running_mean_std to user property decorators for mean, var, count
* reverted VecNormalize to use RunningMeanStd (no tf)
* reverted VecNormalize to use RunningMeanStd (no tf)
* profiling wip
* use VecNormalize with regular RunningMeanStd
* added acer runner (missing import)
* flake8 complaints
* added a note in README about TfRunningMeanStd and serialization of VecNormalize
* dummy commit to RUN BENCHMARKS
* merged benchmarks branch
57 lines
1.6 KiB
Python
import tensorflow as tf
from gym.spaces import Discrete, Box


def observation_placeholder(ob_space, batch_size=None, name='Ob'):
    '''
    Create a placeholder for feeding observations, sized to match the observation space.

    Parameters:
    ----------

    ob_space: gym.Space     observation space

    batch_size: int         size of the batch to be fed into the input. Can be left as None in most cases.

    name: str               name of the placeholder

    Returns:
    -------

    tensorflow placeholder tensor
    '''

    assert isinstance(ob_space, (Discrete, Box)), \
        'Can only deal with Discrete and Box observation spaces for now'

    # Prepend the (possibly unknown) batch dimension to the per-observation shape.
    return tf.placeholder(shape=(batch_size,) + ob_space.shape, dtype=ob_space.dtype, name=name)


def observation_input(ob_space, batch_size=None, name='Ob'):
    '''
    Create a placeholder for feeding observations, sized to match the observation space,
    and add an input encoder of the appropriate type.

    Returns a (placeholder, encoded observation tensor) pair.
    '''

    placeholder = observation_placeholder(ob_space, batch_size, name)
    return placeholder, encode_observation(ob_space, placeholder)


def encode_observation(ob_space, placeholder):
    '''
    Encode the input in the way that is appropriate to the observation space.

    Parameters:
    ----------

    ob_space: gym.Space             observation space

    placeholder: tf.placeholder     observation input placeholder
    '''
    if isinstance(ob_space, Discrete):
        # Discrete observations are integer indices: one-hot encode them as floats.
        return tf.to_float(tf.one_hot(placeholder, ob_space.n))

    elif isinstance(ob_space, Box):
        # Box observations are already arrays: only cast them to float.
        return tf.to_float(placeholder)
    else:
        raise NotImplementedError
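
The two branches of `encode_observation` can be illustrated without a TensorFlow graph. The following is a minimal NumPy sketch of the same idea, not part of this module: the helper names `encode_discrete` and `encode_box` are hypothetical, and float32 is assumed as the target dtype. One known difference: `tf.one_hot` yields an all-zero row for an out-of-range index, whereas the NumPy indexing used here raises an error or wraps negative indices.

```python
import numpy as np


def encode_discrete(obs, n):
    """Hypothetical NumPy analogue of the Discrete branch: one-hot rows as float32."""
    obs = np.asarray(obs, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for index i.
    return np.eye(n, dtype=np.float32)[obs]


def encode_box(obs):
    """Hypothetical NumPy analogue of the Box branch: cast to float32, shape unchanged."""
    return np.asarray(obs, dtype=np.float32)


# A batch of three observations from a Discrete space with n=3.
discrete_batch = encode_discrete([0, 2, 1], n=3)
# A batch of two observations from a Box space with shape (2,) and dtype uint8.
box_batch = encode_box(np.array([[1, 2], [3, 4]], dtype=np.uint8))

print(discrete_batch.shape, discrete_batch.dtype)
print(box_batch.shape, box_batch.dtype)
```

In both cases the leading axis is the batch dimension, which corresponds to the `batch_size` entry that `observation_placeholder` prepends to `ob_space.shape`.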