1491 Commits

Author SHA1 Message Date
Jie Tang
15c76be797 Bump version 2016-11-01 11:14:47 -07:00
Jie Tang
8c6bd5f9ce Add readme line about change 2016-10-31 23:56:37 -07:00
Jie Tang
ae95553ca7 Supply default algorithm name in sample upload code 2016-10-31 23:56:37 -07:00
Jie Tang
71bb5f8563 Stop seeding in monitor / uploading seeds to scoreboard 2016-10-31 23:56:37 -07:00
Jie Tang
a62bba3bff Only save stats on completed episodes 2016-10-31 23:56:37 -07:00
Jie Tang
533d21a63a Disallow reset()s unless we've reached a done state while monitor is running 2016-10-31 23:56:37 -07:00
Jie Tang
bc2c12a36e Sort env specs when running tests (this allows for repeating a specific test using an index) 2016-10-31 23:55:56 -07:00
Jie Tang
8293b4bda8 Add benchmarks to scoreboard registry 2016-10-31 18:34:00 -07:00
Greg Brockman
03d40e9422 Bump version 2016-10-31 11:37:10 -07:00
Greg Brockman
6db689a0ca Expand env-id format 2016-10-31 11:36:45 -07:00
Jie Tang
03af6391e9 Tweak to client side compute 2016-10-28 11:52:59 -07:00
Jie Tang
c3283adda0 Fix broken benchmark scoring when handling eval episodes, add a test 2016-10-28 11:48:49 -07:00
Jie Tang
7c8f2b767c local utility benchmark scoring function 2016-10-28 02:08:36 -07:00
Jie Tang
9255e2264c Bring back total environment wall time 2016-10-27 22:49:16 -07:00
Jie Tang
1cc33eb081 Fix bug shadowing initial timesteps, update tests 2016-10-27 22:25:54 -07:00
Jie Tang
271ef783c6 Fix fencepost error in scoring, make unit test actually catch this 2016-10-27 21:35:23 -07:00
Jie Tang
9244bd5001 Properly compute total time 2016-10-27 20:22:49 -07:00
Jie Tang
44ce715dfa Add total reward scoring, tests, propagate solved 2016-10-27 20:22:26 -07:00
Jie Tang
6037456a14 Comment scoring rule 2016-10-27 20:22:22 -07:00
Jie Tang
9347e0611b Error if no evaluations found 2016-10-27 20:22:22 -07:00
Jie Tang
71af1191e0 Fix some bugs with new partial benchmark scoring 2016-10-27 12:09:49 -07:00
Jie Tang
f7a45f6953 py2 numerical compatibility 2016-10-26 16:57:26 -07:00
Jie Tang
3ebe82f69c Fix tests after refactor 2016-10-26 12:12:50 -07:00
Colin
b5f774fe18 Remove blacklist of broken envs from rollout generation (#391) 2016-10-26 11:45:31 -07:00
Colin
7bd39ef62c Redirect stderr to avoid avconv version spam (#393) 2016-10-26 11:44:00 -07:00
Jie Tang
3c341c279d Move / rename benchmark scoring function 2016-10-25 21:55:54 -07:00
Jie Tang
53cde23ece Fix bug in max_seconds scoring. Refactor null_score, add tests for it all 2016-10-25 21:55:54 -07:00
Jie Tang
7513f6e2bd Allow mismatched uploads 2016-10-25 21:55:54 -07:00
Jie Tang
859144868f Implement benchmark scoring on gym side 2016-10-25 21:55:50 -07:00
Jie Tang
2126be9844 Fix error handling bug in benchmark.task_specs, store set of env ids 2016-10-24 23:37:02 -07:00
Jie Tang
d83e8f8702 Bump version 2016-10-24 13:16:34 -07:00
rafal
235237595e v3 benchmarks 2016-10-24 11:56:46 -04:00
Greg Brockman
acb6946b90 Print out env_id when stepping too far 2016-10-23 14:05:42 -07:00
Greg Brockman
9940212f5d Fix test return type 2016-10-23 13:39:22 -07:00
Greg Brockman
59b58602d4 Fix determinism test 2016-10-23 13:09:44 -07:00
Greg Brockman
b3caa3f147 Bump version 2016-10-23 10:56:41 -07:00
Greg Brockman
a3e75eaf43 Allow for autoreset envs 2016-10-23 10:56:30 -07:00
Colin
6d51f72630 Some tweaks to json rollout generation (#386)
* Some tweaks to json rollout generation:

- sort keys (to make diffs easier to read)
- don't open and close json file so much
- command line flag to allow overwriting hashes

* Use a single rollout file, rather than one for each python version.
2016-10-21 19:00:36 -07:00
Colin
e84bd0ffe1 Algorithmic refactor (#383)
* Refactor/document algorithmic environments and add tests.

* test for 3 row addition

* Fix failing rollout test by reinserting quirk in reversedAddition env

* todo regarding addition3-v0

* Fix python 3 division issues

* typo fix

* Re-generate python3 rollout file to account for ReversedAddition bug fix
2016-10-21 16:06:48 -07:00
Jie Tang
bee6be5632 Typo in source indexes 2016-10-20 22:57:33 -07:00
Jie Tang
2dba05ac0a Minor bug computing sources 2016-10-20 22:50:13 -07:00
Jie Tang
1f7c6464b7 Thread data_source and initial_reset_timestamps through to scoreboard 2016-10-20 22:19:39 -07:00
Jie Tang
c46008bfae Fix up benchmark runner to use new benchmark format 2016-10-20 22:19:03 -07:00
Jie Tang
a780b75556 Update upload to respect new gym benchmark spec format 2016-10-20 21:10:34 -07:00
Jie Tang
a8e4734cbc Fix evaluation scoring bug (numpy casts generator to a single-element array containing the generator, which is a truthy object) 2016-10-20 19:39:20 -07:00
Greg Brockman
88f94587a2 Update benchmark spec (#385)
* Update benchmark spec

* Update format of benchmark again

* Add support for max_seconds to benchmark

* Bump version
2016-10-20 17:25:29 -07:00
John Schulman
21f8fe57be fix to previous commit 2016-10-20 11:41:23 -07:00
John Schulman
3546f06bc3 prevents race condition in multiprocessing setting 2016-10-20 11:40:20 -07:00
John Schulman
50be1426be fix number of seeds in mujoco benchmark 2016-10-20 11:39:03 -07:00
Jie Tang
a8aa301202 Bump version 2016-10-20 01:34:33 -07:00