[Bug Report] Gin Rummy environment seed / pay-off inconsistency for RLCard #1312

@Nikelroid

Describe the bug

I’m using the gin_rummy_v4 environment from PettingZoo’s classic set (which wraps the RLCard implementation), and I’ve found two related issues:

  1. Seeding does not seem to follow the usual env.reset(seed=…) semantics in PettingZoo. The underlying RLCard game appears to be created via rlcard.make(…), and the seed passed to PettingZoo is not consistently propagated to the RLCard backend.
  2. The payoff/reward defaults appear inconsistent with the defaults documented in PettingZoo. The documentation for the Gin Rummy environment (wrapped by RLCard) says the default knock reward is 0.5, but the underlying RLCard defaults use 0.2. As a result, I see “knock amount = 0.2” instead of 0.5 with default settings, and even with custom settings.

Together, these two issues (seed not respected, inconsistent reward default) cause reproducibility problems and unexpected payoff behavior when the environment is used for research or benchmarking.

Code example

from pettingzoo.classic import gin_rummy_v4

seed = 42  # any fixed integer; the value itself is not significant
env = gin_rummy_v4.env()
env.reset(seed=seed)
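
A slightly fuller reproduction might compare the opening deal of two fresh environments reset with the same seed. The following is only a sketch: the helper names (`opening_deal_digest`, `same_seed_same_deal`, `check_gin_rummy`) are made up for illustration, and it assumes the observation returned by `env.last()` is a dict whose "observation" entry supports `.tobytes()`, as a NumPy array does.

```python
import hashlib


def opening_deal_digest(make_env, seed):
    """Reset a fresh env with `seed` and hash the first agent's observation."""
    env = make_env()
    env.reset(seed=seed)
    observation, _reward, _terminated, _truncated, _info = env.last()
    env.close()
    # Assumption: the observation is a dict whose "observation" entry
    # exposes .tobytes() (true for NumPy arrays).
    return hashlib.sha256(observation["observation"].tobytes()).hexdigest()


def same_seed_same_deal(make_env, seed=42):
    """True when two fresh envs reset with the same seed start identically."""
    return opening_deal_digest(make_env, seed) == opening_deal_digest(make_env, seed)


def check_gin_rummy(seed=42):
    # Imported lazily so the helpers above load without PettingZoo installed.
    from pettingzoo.classic import gin_rummy_v4

    return same_seed_same_deal(gin_rummy_v4.env, seed)
```

If seeding is propagated correctly, `check_gin_rummy()` should return True for any fixed seed. A False result would point at the behavior described in this report.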

System info

  • PettingZoo version: (pettingzoo[classic]>=1.24.3)
  • RLCard version: (ray[rllib]>=2.31.0)
  • Python version: (e.g., 3.11.14)
  • OS: (macOS 26, Linux 24.04)

Additional context

Suggested Fix / Ask:

  1. Ensure that in the gin_rummy wrapper (and any other RLCard-wrapped environment) the reset(seed=…) value is passed properly to RLCard’s make or internal RNG so that the seed is effective and reproducible.
  2. Force the default knock_reward and gin_reward to the documented values (0.5 and 1.0 respectively) regardless of RLCard’s internal defaults – or at minimum document clearly when RLCard defaults differ.
  3. Add automated tests to verify that the seed parameter produces deterministic results (e.g., fixed draws, shuffle order) and that reward defaults match documentation.
  4. Optionally, update the documentation to more clearly state that because RLCard is used under the hood, some internal defaults (like 0.2) may still apply unless overridden, or update the wrapper to guarantee the override.
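
The automated tests suggested in item 3 could look roughly like the sketch below. All helper names are hypothetical. It assumes observations carry an "action_mask" entry, and it assumes `gin_rummy_v4.raw_env` accepts `knock_reward` and `gin_reward` keyword arguments. The defaults check only inspects the wrapper's declared signature; it does not confirm which payoff RLCard actually applies.

```python
import inspect


def rollout_trace(make_env, seed, max_steps=200):
    """Play the lowest-index legal action each turn; record (agent, action, reward)."""
    env = make_env()
    env.reset(seed=seed)
    trace = []
    for agent in env.agent_iter(max_iter=max_steps):
        observation, reward, terminated, truncated, _info = env.last()
        if terminated or truncated:
            action = None  # finished agents must step with None
        else:
            # Assumption: observations carry an "action_mask" of 0/1 flags.
            mask = observation["action_mask"]
            action = next(i for i, legal in enumerate(mask) if legal)
        trace.append((agent, action, float(reward)))
        env.step(action)
    env.close()
    return trace


def assert_seed_reproducible(make_env, seed=0):
    """Two fresh envs reset with the same seed should produce identical traces."""
    assert rollout_trace(make_env, seed) == rollout_trace(make_env, seed)


def declared_defaults(factory, names):
    """Read default values for `names` from a callable's signature."""
    params = inspect.signature(factory).parameters
    return {name: params[name].default for name in names}


def run_gin_rummy_checks():
    # Imported lazily so the helpers above load without PettingZoo installed.
    from pettingzoo.classic import gin_rummy_v4

    assert_seed_reproducible(gin_rummy_v4.env)
    # Assumption: gin_rummy_v4.raw_env takes knock_reward and gin_reward as
    # keyword arguments. This only checks the wrapper's declared defaults,
    # not the payoff RLCard actually applies.
    defaults = declared_defaults(gin_rummy_v4.raw_env, ("knock_reward", "gin_reward"))
    assert defaults == {"knock_reward": 0.5, "gin_reward": 1.0}
```

A fixed "first legal action" policy keeps the rollout deterministic on the agent side, so any difference between the two traces would come from the environment's own randomness.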

Checklist

  • I have checked that there is no similar issue in the repo

Labels: bug
