
Conversation

xinchen-yang (Contributor)

No description provided.

@RyanNavillus RyanNavillus (Owner) left a comment

Great work! Left some suggestions

idx = info.index(item)
# Extract the MultiProcessingSyncWrapper envs (which expose the task id) from the VecNormalize wrapper.
multiprocessing_sync_wrapper_envs = envs.venv.venv.envs
episode_task = multiprocessing_sync_wrapper_envs[idx]._latest_task
curriculum.update_on_episode(item["episode"]["r"], item["episode"]["l"], episode_task, args.env_id)
RyanNavillus (Owner):

This isn't necessary; the curriculum sync wrapper will call this automatically with the correct data.

def __init__(self, task_space: TaskSpace):
"""Initialize the StatRecorder"""

self.write_path = '/Users/allisonyang/Downloads'
RyanNavillus (Owner):

Make this an initializer argument and let people configure it when they create their curriculum
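
A rough sketch of what I mean (the argument name and default path here are just placeholders, not tested):

# Hypothetical sketch: write_path becomes a constructor argument instead of a hard-coded personal path.
class StatRecorder:
    def __init__(self, task_space, write_path="./stat_logs"):
        """Initialize the StatRecorder."""
        self.task_space = task_space
        self.write_path = write_path  # users can point this anywhere when they create their curriculum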

self.num_tasks = self.task_space.num_tasks

self.records = {task: [] for task in self.tasks}
self.stats = {task: {} for task in self.tasks}
RyanNavillus (Owner):

Instead of tracking the full list, track these efficiently. If we train for 10M episodes then it would be impossible to take the average of these lists. You can look up the running-mean formulas; I think it's running_mean = ((running_mean * N) + new_value) / (N + 1), or something like that. There should be a similar update for the variance too.
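
A sketch of what that update could look like (Welford's online algorithm, which covers the variance as well; not tested against this class):

# Running mean/variance in O(1) memory; count, mean, and m2 all start at 0 for each task.
def running_update(count, mean, m2, new_value):
    count += 1
    delta = new_value - mean
    mean += delta / count
    m2 += delta * (new_value - mean)  # sum of squared deviations seen so far
    variance = m2 / count
    return count, mean, m2, variance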

RyanNavillus (Owner):

Alternatively, it would be good to provide an option for only saving the past N episodes, so that rather than taking the average over all of training, it's an average over the past N episodes. I think some normalization schemes prefer that method because returns change during training. It would be good to provide both options (let the user choose and configure each).
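
A rough sketch of how both options could fit together (keep_last_n is a hypothetical argument name; not tested):

from collections import deque

# keep_last_n=None -> running mean over all of training (O(1) memory);
# keep_last_n=N    -> mean over only the most recent N episodes.
class EpisodeReturns:
    def __init__(self, keep_last_n=None):
        self.keep_last_n = keep_last_n
        self.recent = deque(maxlen=keep_last_n) if keep_last_n else None
        self.count = 0
        self.running_mean = 0.0

    def add(self, episode_return):
        if self.recent is not None:
            self.recent.append(episode_return)
        self.count += 1
        self.running_mean += (episode_return - self.running_mean) / self.count

    def mean(self):
        if self.recent:
            return sum(self.recent) / len(self.recent)
        return self.running_mean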

@RyanNavillus RyanNavillus (Owner) left a comment

I think you can simplify this code quite a bit, but it looks more efficient now!

writer.add_scalar(f"stats_per_task/task_{idx}_episode_return_mean", 0, step)
writer.add_scalar(f"stats_per_task/task_{idx}_episode_return_var", 0, step)
writer.add_scalar(f"stats_per_task/task_{idx}_episode_length_mean", 0, step)
writer.add_scalar(f"stats_per_task/task_{idx}_episode_length_var", 0, step)
RyanNavillus (Owner):

Please simplify this code; you shouldn't need all these if statements and repeated code. Also, I think you can ignore the task_names feature for now; it's sort of half implemented, and I'll need to fix it at some point.
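
For example, something along these lines would replace the repeated calls (a sketch; it assumes self.stats[task] keeps the four values keyed by name):

for task, stats in self.stats.items():
    for key, value in stats.items():  # e.g. episode_return_mean, episode_return_var, ...
        writer.add_scalar(f"stats_per_task/task_{task}_{key}", value, step)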

else:
N_past = len(self.records[episode_task])

self.stats[episode_task]['mean_r'] = round((self.stats[episode_task]['mean_r'] * N_past + episode_return) / (N_past + 1), 4)
RyanNavillus (Owner):

We probably shouldn't round the saved values; we should only round them when logging or printing.
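
i.e. keep the exact value in self.stats and only round or format at the point of display, e.g. (a sketch):

self.stats[episode_task]['mean_r'] = (self.stats[episode_task]['mean_r'] * N_past + episode_return) / (N_past + 1)
print(f"task {episode_task} mean return: {self.stats[episode_task]['mean_r']:.4f}")  # rounding happens only here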

"l": episode_length,
"env_id": env_id
})
self.records[episode_task] = self.records[episode_task][-self.calc_past_N:]
RyanNavillus (Owner):

It would be more efficient to implement this with a separate queue for returns, lengths, and ids: https://docs.python.org/3/library/collections.html#collections.deque
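
A sketch of what that could look like, reusing the names already in this file (maxlen handles the trimming for you):

from collections import deque

# In __init__: one bounded deque per quantity, per task.
self.episode_returns = {task: deque(maxlen=self.calc_past_N) for task in self.tasks}
self.episode_lengths = {task: deque(maxlen=self.calc_past_N) for task in self.tasks}
self.env_ids = {task: deque(maxlen=self.calc_past_N) for task in self.tasks}

# In record(): just append; no slicing or manual trimming needed.
self.episode_returns[episode_task].append(episode_return)
self.episode_lengths[episode_task].append(episode_length)
self.env_ids[episode_task].append(env_id)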

writer.add_scalar(f"stats_per_task/task_{self.task_space.task_name(idx)}_episode_return_mean", 0, step)
writer.add_scalar(f"stats_per_task/task_{self.task_space.task_name(idx)}_episode_return_var", 0, step)
writer.add_scalar(f"stats_per_task/task_{self.task_space.task_name(idx)}_episode_length_mean", 0, step)
writer.add_scalar(f"stats_per_task/task_{self.task_space.task_name(idx)}_episode_length_var", 0, step)
RyanNavillus (Owner):

Simplify this if/else; there's a lot of repeated code. Also, why are we logging 0 if there are no stats? We can probably just skip logging in that case.
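
e.g. guard once and skip tasks with no data instead of writing zeros (a sketch, assuming self.stats maps each task to its dict of values):

for task, stats in self.stats.items():
    if not stats:
        continue  # nothing recorded for this task yet, so log nothing
    for key, value in stats.items():
        writer.add_scalar(f"stats_per_task/task_{self.task_space.task_name(task)}_{key}", value, step)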

self.stats = {task: {} for task in self.tasks}

def record(self, episode_return: float, episode_length: int, episode_task, env_id=None):
"""
RyanNavillus (Owner):

If you use defaultdicts for the stats you can cut out a lot of code. Change:
self.stats = {task: {} for task in self.tasks}
to:
self.stats = {task: defaultdict(float) for task in self.tasks}
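
With defaultdict(float), a task seen for the first time just reads back 0.0, so the "does this key exist yet?" branches go away, e.g.:

from collections import defaultdict

stats = defaultdict(float)   # missing keys read as 0.0
episode_return, n = 1.0, 0   # example values: first episode for this task
stats['mean_r'] = (stats['mean_r'] * n + episode_return) / (n + 1)  # no special case for the first episode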

RyanNavillus (Owner):

from collections import defaultdict

@RyanNavillus RyanNavillus (Owner) left a comment

Thanks for simplifying the code, looks much better!

self.episode_lengths[episode_task].append(episode_length)
self.env_ids[episode_task].append(env_id)

self.stats[episode_task]['mean_r'] = np.mean(list(self.episode_returns[episode_task])[-self.calc_past_N:])  # I am not sure whether there is a more efficient way to slice a deque; I temporarily convert it to a list and then slice it, which should cost O(n)
RyanNavillus (Owner):

You shouldn't need to slice a deque at all; it should automatically drop elements once it grows past keep_last_n. Check the documentation for it.
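
i.e. because the deque was created with maxlen, it already holds only the most recent calc_past_N entries, so record() can consume it directly (sketch):

# deque(maxlen=self.calc_past_N) in __init__ drops old entries automatically,
# so no list() conversion or slicing is needed here:
self.episode_returns[episode_task].append(episode_return)
self.stats[episode_task]['mean_r'] = np.mean(self.episode_returns[episode_task])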

@RyanNavillus RyanNavillus changed the base branch from main to stat-recorder April 26, 2024 07:27