Gymnasium/gym/monitoring/stats_recorder.py

import json
import os
import time

from gym import error
from gym.utils import atomic_write

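class StatsRecorder(object):
    """Record per-episode lengths, rewards, types and completion timestamps,
    and write them to '<file_prefix>.stats.json' inside `directory`."""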
    def __init__(self, directory, file_prefix, autoreset=False, env_id=None):
        self.autoreset = autoreset
        self.env_id = env_id

        self.initial_reset_timestamp = None
        self.directory = directory
        self.file_prefix = file_prefix
        self.episode_lengths = []
        self.episode_rewards = []
        self.episode_types = [] # experimental addition
        self._type = 't'
        self.timestamps = []
        self.steps = None
        self.rewards = None

        self.done = None
        self.closed = False

        filename = '{}.stats.json'.format(self.file_prefix)
        self.path = os.path.join(self.directory, filename)

    @property
    def type(self):
        return self._type

    @type.setter
    def type(self, type):
        if type not in ['t', 'e']:
            raise error.Error('Invalid episode type {}: must be t for training or e for evaluation'.format(type))
        self._type = type

    def before_step(self, action):
        assert not self.closed

        if self.done:
            raise error.ResetNeeded("Trying to step environment which is currently done. While the monitor is active for {}, you cannot step beyond the end of an episode. Call 'env.reset()' to start the next episode.".format(self.env_id))
        elif self.steps is None:
            raise error.ResetNeeded("Trying to step an environment before reset. While the monitor is active for {}, you must call 'env.reset()' before taking an initial step.".format(self.env_id))

    def after_step(self, observation, reward, done, info):
        self.steps += 1
        self.rewards += reward

        if done:
            # Only completed episodes are saved to the stats lists.
            self.save_complete()
            self.done = True
            if self.autoreset:
                self.before_reset()
                self.after_reset(observation)

    def before_reset(self):
        assert not self.closed

        if self.done is not None and not self.done:
            raise error.Error("Tried to reset environment which is not done. While the monitor is active for {}, you cannot call reset() unless the episode is over.".format(self.env_id))

        self.done = False
        if self.initial_reset_timestamp is None:
            self.initial_reset_timestamp = time.time()

    def after_reset(self, observation):
        self.steps = 0
        self.rewards = 0
        # We write the type at the beginning of the episode. If a user
        # changes the type, it's more natural for it to apply next
        # time the user calls reset().
        self.episode_types.append(self._type)

    def save_complete(self):
        if self.steps is not None:
            self.episode_lengths.append(self.steps)
            self.episode_rewards.append(self.rewards)
            self.timestamps.append(time.time())

    def close(self):
        self.flush()
        self.closed = True

    def flush(self):
        if self.closed:
            return

        with atomic_write.atomic_write(self.path) as f:
            json.dump({
                'initial_reset_timestamp': self.initial_reset_timestamp,
                'timestamps': self.timestamps,
                'episode_lengths': self.episode_lengths,
                'episode_rewards': self.episode_rewards,
                'episode_types': self.episode_types,
            }, f)
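
The JSON written by flush() can be read back with the standard library alone. The following is a minimal sketch, not part of gym: `summarize_stats` is a hypothetical helper, and the sample values are made up purely to mirror the keys passed to json.dump() above. It assumes the file contains exactly those keys.

```python
import json
import os
import tempfile


def summarize_stats(path):
    """Return {episode_type: (count, mean_reward, mean_length)} from a stats file.

    Hypothetical helper for illustration; it is not provided by gym.
    """
    with open(path) as f:
        stats = json.load(f)

    totals = {}
    # episode_types is appended at reset time, so it may hold one more entry
    # than the per-episode lists when the last episode never finished.
    # zip() stops at the shortest list, which skips that unfinished episode.
    for ep_type, reward, length in zip(stats['episode_types'],
                                       stats['episode_rewards'],
                                       stats['episode_lengths']):
        count, reward_sum, length_sum = totals.get(ep_type, (0, 0.0, 0))
        totals[ep_type] = (count + 1, reward_sum + reward, length_sum + length)

    return {t: (n, r / n, l / n) for t, (n, r, l) in totals.items()}


# Made-up sample data, shaped like the dict that flush() serializes.
sample = {
    'initial_reset_timestamp': 0.0,
    'timestamps': [1.0, 2.0, 3.0],
    'episode_lengths': [10, 20, 30],
    'episode_rewards': [1.0, 3.0, 5.0],
    'episode_types': ['t', 't', 'e', 't'],  # fourth episode was never completed
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.stats.json')
    with open(path, 'w') as f:
        json.dump(sample, f)
    summary = summarize_stats(path)
```

Grouping by episode type keeps training ('t') and evaluation ('e') episodes separate, matching the two values accepted by the `type` setter.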