Jie Tang
|
77568accd7
|
Thread episode lengths through when scoring, add tests
|
2017-02-13 12:29:11 -08:00 |
|
Jie Tang
|
5ca80a3141
|
Fix format of solves for TotalReward
|
2016-12-21 20:52:38 -08:00 |
|
Jie Tang
|
b33fc9fb85
|
Fix a bug when the benchmark result is empty
|
2016-12-13 22:05:29 -08:00 |
|
Jie Tang
|
54ead345dc
|
Bunch of refactoring since TotalReward and RewardPerTime scoring are quite similar
|
2016-12-13 22:01:50 -08:00 |
|
Jie Tang
|
f63bb2e1aa
|
Add RewardPerTime scoring function and tests
|
2016-12-13 22:01:01 -08:00 |
|
Jie Tang
|
5dba36c68d
|
Fix benchmark score compute when a monitor file is empty
|
2016-11-30 22:38:11 -08:00 |
|
Jie Tang
|
c3283adda0
|
Fix broken benchmark scoring when handling eval episodes, add a test
|
2016-10-28 11:48:49 -07:00 |
|
Jie Tang
|
9255e2264c
|
Bring back total environment wall time
|
2016-10-27 22:49:16 -07:00 |
|
Jie Tang
|
1cc33eb081
|
Fix bug shadowing initial timesteps, update tests
|
2016-10-27 22:25:54 -07:00 |
|
Jie Tang
|
271ef783c6
|
Fix fencepost error in scoring, make unit test actually catch this
|
2016-10-27 21:35:23 -07:00 |
|
Jie Tang
|
9244bd5001
|
Properly compute total time
|
2016-10-27 20:22:49 -07:00 |
|
Jie Tang
|
44ce715dfa
|
Add total reward scoring, tests, propagate solved
|
2016-10-27 20:22:26 -07:00 |
|
Jie Tang
|
6037456a14
|
Comment scoring rule
|
2016-10-27 20:22:22 -07:00 |
|
Jie Tang
|
71af1191e0
|
Fix some bugs with new partial benchmark scoring
|
2016-10-27 12:09:49 -07:00 |
|
Jie Tang
|
f7a45f6953
|
py2 numerical compatibility
|
2016-10-26 16:57:26 -07:00 |
|
Jie Tang
|
3c341c279d
|
Move / rename benchmark scoring function
|
2016-10-25 21:55:54 -07:00 |
|
Jie Tang
|
53cde23ece
|
Fix bug in max_seconds scoring. Refactor null_score, add tests for it all
|
2016-10-25 21:55:54 -07:00 |
|
Jie Tang
|
859144868f
|
Implement benchmark scoring on gym side
|
2016-10-25 21:55:50 -07:00 |
|
Jie Tang
|
bee6be5632
|
Typo in source indexes
|
2016-10-20 22:57:33 -07:00 |
|
Jie Tang
|
2dba05ac0a
|
Minor bug computing sources
|
2016-10-20 22:50:13 -07:00 |
|
Greg Brockman
|
88f94587a2
|
Update benchmark spec (#385)
* Update benchmark spec
* Update format of benchmark again
* Add support for max_seconds to benchmark
* Bump version
|
2016-10-20 17:25:29 -07:00 |
|
Greg Brockman
|
45038020ae
|
Assign floor for any missing episodes
|
2016-09-23 02:08:11 -07:00 |
|
Greg Brockman
|
2b3f965faa
|
Fix scoring when fewer episodes are provided
|
2016-09-23 01:47:42 -07:00 |
|
Greg Brockman
|
934b2acbb7
|
Add benchmark support (#338)
* Warn if seed doesn't return a list
* Add preliminary BenchmarkRun support
* Add experimental benchmark registration
* Flesh out interface
* Add preliminary BenchmarkRun support
* Warn if seed doesn't return a list
* Add experimental benchmark registration
* Flesh out interface
* Make benchmarkrun upload recursive
* Add evaluation episodes
* Add benchmark scoring
* Tweak reward locations
* Tweak scoring
* Clear default metadata in Wrapper
* Improve scoring
* Expose registry; fix test
* Add initial_reset_timestamp
* Add back algorithm; fix tests
|
2016-09-23 01:04:26 -07:00 |
|