
Conversation

@sdpkjc (Collaborator) commented Mar 2, 2025

Description

Upgrade gymnasium to 1.0.0

  • gymnasium classic control

    • c51.py
    • c51_jax.py
    • dqn.py
    • dqn_jax.py
    • ppo.py
    • pqn.py
  • gymnasium mujoco

    • ddpg_continuous_action.py
    • ddpg_continuous_action_jax.py
    • td3_continuous_action.py
    • td3_continuous_action_jax.py
    • sac_continuous_action.py
    • ppo_continuous_action.py
    • rpo_continuous_action.py
  • gymnasium atari (EpisodicLifeEnv conflicts with gymnasium v1.0.0's RecordEpisodeStatistics and will be fixed later.)

    • c51_atari.py
    • c51_atari_jax.py
    • dqn_atari.py
    • dqn_atari_jax.py
    • qdagger_dqn_atari_impalacnn.py
    • qdagger_dqn_atari_jax_impalacnn.py
    • sac_atari.py
    • ppo_atari.py
    • ppo_atari_lstm.py
    • ppo_atari_multigpu.py
  • envpool

    • ppo_rnd_envpool.py
    • pqn_atari_envpool_lstm.py
    • pqn_atari_envpool.py
    • ppo_atari_envpool.py
    • ppo_atari_envpool_xla_jax.py
    • ppo_atari_envpool_xla_jax_scan.py
  • other

    • ppg_procgen.py
    • ppo_pettingzoo_ma_atari.py
    • ppo_procgen.py
    • ppo_trxl.py
    • ppo_continuous_action_isaacgym.py

Types of changes

  • Bug fix
  • New feature
  • New algorithm
  • Documentation

Checklist:

  • I've read the CONTRIBUTION guide (required).
  • I have ensured pre-commit run --all-files passes (required).
  • I have updated the tests accordingly (if applicable).
  • I have updated the documentation and previewed the changes via mkdocs serve.
    • I have explained note-worthy implementation details.
    • I have explained the logged metrics.
    • I have added links to the original paper and related papers.

If you need to run benchmark experiments for a performance-impacting change:

  • I have contacted @vwxyzjn to obtain access to the openrlbenchmark W&B team.
  • I have used the benchmark utility to submit the tracked experiments to the openrlbenchmark/cleanrl W&B project, optionally with --capture_video.
  • I have performed RLops with python -m openrlbenchmark.rlops.
    • For new feature or bug fix:
      • I have used the RLops utility to understand the performance impact of the changes and confirmed there is no regression.
    • For new algorithm:
      • I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
    • I have added the learning curves generated by the python -m openrlbenchmark.rlops utility to the documentation.
    • I have added links to the tracked experiments in W&B, generated by python -m openrlbenchmark.rlops ....your_args... --report, to the documentation.


@pseudo-rnd-thoughts (Collaborator) left a comment

Thanks for starting this @sdpkjc

Ale-py should be updated to v0.10.1

And the autoreset mode of the vector environment should be updated.

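For context, here is a minimal sketch of the behaviour change being referenced (illustrative only, not code from this PR): gymnasium v1.0's vector environments autoreset in next-step mode, so the terminating step() already returns the true final observation and the reset happens on the following call, whereas pre-1.0 vector environments reset in the same step and exposed the final observation through infos.

# Illustrative sketch (not from this PR): next-step autoreset in gymnasium v1.0.
import gymnasium as gym
import numpy as np

envs = gym.vector.SyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(2)])
obs, _ = envs.reset(seed=0)
for _ in range(200):
    actions = np.array([envs.single_action_space.sample() for _ in range(envs.num_envs)])
    obs, rewards, terminations, truncations, infos = envs.step(actions)
    # v1.0 (next-step): on a terminating step, obs already holds the true final
    # observation; the sub-env resets on the NEXT step() call instead.
    # <1.0 (same-step): the sub-env reset immediately, and the final observation
    # had to be recovered from infos["final_observation"].
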
pyproject.toml (Outdated)

- stable-baselines3 = "2.0.0"
- gymnasium = ">=0.28.1"
+ stable-baselines3 = ">=2.4.0"
+ gymnasium = ">=1.0.0"
Collaborator: Can this be specified as v1.1.0? If sb3 is the limitation, then I can see if I can update it.

Collaborator (Author): Yes, sb3 depends on gymnasium <=1.0.0.

@pseudo-rnd-thoughts (Collaborator) commented Mar 3, 2025

If I remember correctly, SB3 is used for the replay buffer and the Atari wrappers. IMO, those features can probably be shifted into cleanrl_utils directly; however, that should happen in a separate, later PR.

@sdpkjc changed the title from "Upgrade gymnasium to 1.1.0" to "Upgrade gymnasium to 1.0.0" on Mar 4, 2025


# Only for gymnasium v1.0.0
class SameModelSyncVectorEnv(gym.vector.SyncVectorEnv):
Collaborator: This should probably be called SameStepModeSyncVectorEnv, or we just shift to gymnasium v1.1.0.
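For illustration, a rough sketch of how such a wrapper could restore same-step semantics on top of v1.0's next-step SyncVectorEnv (an assumption-laden reconstruction, not the PR's actual implementation; the _autoreset_envs attribute is assumed to be SyncVectorEnv's internal next-step bookkeeping in v1.0):

# Rough sketch, not the PR's code: restore same-step autoreset on top of
# gymnasium v1.0's next-step SyncVectorEnv.
import gymnasium as gym
import numpy as np

class SameStepModeSyncVectorEnv(gym.vector.SyncVectorEnv):
    def step(self, actions):
        obs, rewards, terminations, truncations, infos = super().step(actions)
        dones = np.logical_or(terminations, truncations)
        if dones.any():
            infos["final_obs"] = obs.copy()  # keep the true final observations
            infos["_final_obs"] = dones
            for idx in np.where(dones)[0]:
                # Reset immediately (same-step) instead of on the next call.
                reset_obs, _ = self.envs[idx].reset()
                obs[idx] = reset_obs
                # Assumed v1.0 internal flag; cleared to avoid a double reset.
                self._autoreset_envs[idx] = False
        return obs, rewards, terminations, truncations, infos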


[tool.poetry.dependencies]
- python = ">=3.8,<3.11"
+ python = ">=3.9,<3.11"
Collaborator: What is limiting us from increasing this?

torch = ">=1.12.1"
- stable-baselines3 = "2.0.0"
- gymnasium = ">=0.28.1"
+ stable-baselines3 = "^2.4.0"
Collaborator: I suspect we will have a new release of sb3 with support for gymnasium v1.1.0, as no changes seem to be required on their end (DLR-RM/stable-baselines3#2095).

@ghost commented Mar 20, 2025

Hey, with the updates in gymnasium 1.1, would it not be easier to simply use the 'Same-Step Mode', or am I missing something?

Does it have to do with the support for the other wrappers that are only supported in the 'Next step' mode?

@pseudo-rnd-thoughts (Collaborator)

@MarcusBinderDTU it is more about minimising implementation changes.
The old implementation used same-step; therefore, as this is a module update, we are trying to minimise code changes.
Moving to next-step (which could be beneficial for the PPO implementations) would be a separate PR.

@ghost commented Mar 20, 2025

> @MarcusBinderDTU it is more about minimising implementation changes. The old implementation used same-step; therefore, as this is a module update, we are trying to minimise code changes. Moving to next-step (which could be beneficial for the PPO implementations) would be a separate PR.

Thanks for the fast reply!

I agree, but I don't understand why we don't go directly to gymnasium 1.1 and then use autoreset_mode=gym.vector.AutoresetMode.SAME_STEP for environment creation.

Would that not be the easiest way of doing it?
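For reference, a hedged sketch of what that route would look like, assuming gymnasium v1.1's documented AutoresetMode API:

# Hedged sketch, assuming gymnasium v1.1's AutoresetMode API:
import gymnasium as gym
from gymnasium.vector import AutoresetMode, SyncVectorEnv

envs = SyncVectorEnv(
    [lambda: gym.make("CartPole-v1") for _ in range(4)],
    autoreset_mode=AutoresetMode.SAME_STEP,  # restores the pre-1.0 same-step behaviour
)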

@sdpkjc (Collaborator, Author) commented Mar 20, 2025

Thanks for your suggestion. However, since the currently released version of sb3 depends on gymnasium < 1.1, we can’t upgrade to 1.1 directly. Once #505 is merged, we'll remove the sb3 dependency and then update to gymnasium 1.1, which will allow us to use autoreset_mode=gym.vector.AutoresetMode.SAME_STEP for environment creation. So this PR is on hold until #505 goes in.

@ghost commented Mar 20, 2025

Ahh, now I see! Thanks for clarifying it, that makes sense :)

@jugheadjones10 commented May 8, 2025

@sdpkjc
For this line:
real_next_obs[idx] = infos["final_observation"][idx]
I think "final_observation" should be "final_obs".
Running it, "final_observation" raises a KeyError, while "final_obs" does not.
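A minimal sketch of the corrected lookup in the usual CleanRL replay-buffer pattern (key names as reported in this thread, not verified against the final diff):

# Sketch of the fix in the usual CleanRL pattern (key name per this thread):
real_next_obs = next_obs.copy()
for idx, trunc in enumerate(truncations):
    if trunc:
        real_next_obs[idx] = infos["final_obs"][idx]  # was infos["final_observation"]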

@jugheadjones10

@sdpkjc I noticed that the scripts in cleanrl_utils/evals also still use the old way of getting episodic returns:

if "final_info" in infos:
    for info in infos["final_info"]:
        if "episode" not in info:

So we might need to update all the files in this directory too!
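For comparison, a hedged sketch of the v1.0-style check with the vector RecordEpisodeStatistics wrapper, which reports episode statistics as arrays plus a boolean mask rather than per-env info dicts:

# Hedged sketch of the gymnasium v1.0-style logging (the vector
# RecordEpisodeStatistics wrapper returns arrays plus an "_episode" mask):
if "episode" in infos:
    for i in range(envs.num_envs):
        if infos["_episode"][i]:
            print(f"episodic_return={infos['episode']['r'][i]}")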

@varadVaidya

Do you think it's a good idea to let users know that the logging for gymnasium >= 1.0 is different until the PR is merged? I ran into some puzzling errors, and it took more time than it should have, since the change in reward logging is not mentioned prominently in the gymnasium changelogs.

@pseudo-rnd-thoughts mentioned this pull request on Jul 4, 2025
@pseudo-rnd-thoughts (Collaborator)
Closing in favour of #516
