* exported rl-algs
* more stuff from rl-algs
* run slow tests
* re-exported rl_algs
* re-exported rl_algs - fixed problems with serialization test and test_cartpole
* replaced atari_arg_parser with common_arg_parser
* run.py can run algos from both baselines and rl_algs
* added approximate humanoid reward with ppo2 into the README for reference
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* very dummy commit to RUN BENCHMARKS
* serialize variables as a dict, not as a list
* running_mean_std uses tensorflow variables
* fixed import in vec_normalize
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* flake8 complaints
* save all variables to make sure we save the vec_normalize normalization
* benchmarks on ppo2 only RUN BENCHMARKS
* make_atari_env compatible with mpi
* run ppo_mpi benchmarks only RUN BENCHMARKS
* hardcode names of retro environments
* add defaults
* changed default ppo2 lr schedule to linear RUN BENCHMARKS
* non-tf normalization benchmark RUN BENCHMARKS
* use ncpu=1 for mujoco sessions - gives a bit of a performance speedup
* reverted running_mean_std to user property decorators for mean, var, count
* reverted VecNormalize to use RunningMeanStd (no tf)
* reverted VecNormalize to use RunningMeanStd (no tf)
* profiling wip
* use VecNormalize with regular RunningMeanStd
* added acer runner (missing import)
* flake8 complaints
* added a note in README about TfRunningMeanStd and serialization of VecNormalize
* dummy commit to RUN BENCHMARKS
* merged benchmarks branch
35 lines
1.0 KiB
Python
import argparse

import numpy as np

from baselines import deepq
from baselines.common import retro_wrappers


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--env', help='environment ID', default='SuperMarioBros-Nes')
    parser.add_argument('--gamestate', help='game state to load', default='Level1-1')
    parser.add_argument('--model', help='model pickle file from ActWrapper.save', default='model.pkl')
    args = parser.parse_args()

    env = retro_wrappers.make_retro(game=args.env, state=args.gamestate, max_episode_steps=None)
    env = retro_wrappers.wrap_deepmind_retro(env)
    act = deepq.load(args.model)

    while True:
        obs, done = env.reset(), False
        episode_rew = 0
        while not done:
            env.render()
            action = act(obs[None])[0]
            env_action = np.zeros(env.action_space.n)
            env_action[action] = 1
            obs, rew, done, _ = env.step(env_action)
            episode_rew += rew
        print('Episode reward', episode_rew)


if __name__ == '__main__':
    main()