MISSING pre-processing / training scripts for code generation #58

@ashwin296

Hey authors, thanks for open-sourcing the project.

In the arXiv paper, you mention that the outcome-based reward for code is set to the unit test pass rate, and you describe the processing of code problems (Sec. B.1).

However, this part appears to be missing from the current repository.

Would you please consider releasing the relevant code so the results can be reproduced? Or, if that part is overclaimed, removing the code-ORM-related content from the paper?

I am raising this issue because, in practice, directly using the pass rate without discretization is unlikely to provide a good reward signal when training code LLMs.
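
To make the distinction concrete, here is a minimal sketch of the two reward styles I have in mind. This is my own illustration, not code from the paper or the repository: the function names, the `threshold` parameter, and the all-tests-must-pass default are assumptions.

```python
from typing import Sequence


def pass_rate_reward(results: Sequence[bool]) -> float:
    """Raw reward: fraction of unit tests passed (0.0 when there are no tests)."""
    if not results:
        return 0.0
    return sum(results) / len(results)


def discretized_reward(results: Sequence[bool], threshold: float = 1.0) -> float:
    """Binary reward: 1.0 only if the pass rate reaches `threshold`.

    The default threshold (all tests must pass) is a hypothetical choice.
    """
    if not results:
        return 0.0
    return 1.0 if pass_rate_reward(results) >= threshold else 0.0


# One failing test out of four: the raw reward still gives partial credit,
# while the discretized reward treats the solution as incorrect.
results = [True, True, False, True]
print(pass_rate_reward(results))    # 0.75
print(discretized_reward(results))  # 0.0
```

With the raw pass rate, a program that fails one test still gets most of the reward, which is the behavior I am unsure about.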

I doubt whether this part was really implemented, and I would love to reproduce it if you could provide more details. It looks non-trivial to fit that into the verl pipeline.

Thank you!
