MISSING pre-processing / training scripts for code generation

hey authors, thanks for open-sourcing the project.

in the arXiv paper, you mention that the outcome-based reward for code is the unit-test pass rate, and you describe the processing of code problems (Sec. B.1).

however, this part appears to be missing from the current repository.

would you please consider releasing the relevant code so the results can be reproduced? or, if that part is overclaimed, removing the code-ORM-related content from the paper?

i am raising this issue because directly using the pass rate without discretization is unlikely to provide a good reward signal when training code LLMs in practice.
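
to make the concern concrete, here is a minimal sketch of the two reward shapes i have in mind. the function names and the all-tests-must-pass threshold are my own assumptions for illustration; they are not taken from the paper or from verl.

```python
from typing import Sequence


def pass_rate_reward(results: Sequence[bool]) -> float:
    """Continuous reward: fraction of unit tests that passed."""
    if not results:
        return 0.0  # no tests available, so give no reward
    return sum(results) / len(results)


def discretized_reward(results: Sequence[bool], threshold: float = 1.0) -> float:
    """Binary reward: 1.0 only if the pass rate reaches `threshold`.

    `threshold=1.0` (all tests must pass) is a hypothetical choice.
    """
    if not results:
        return 0.0
    return 1.0 if pass_rate_reward(results) >= threshold else 0.0


partial = [True, True, True, False]  # 3 of 4 unit tests pass
full = [True, True, True, True]      # all unit tests pass

print(pass_rate_reward(partial), discretized_reward(partial))
print(pass_rate_reward(full), discretized_reward(full))
```

with the raw pass rate, a program that fails one of four tests still earns 0.75, so partially correct code is rewarded almost as much as a fully correct solution; the binary variant gives it 0. which of the two the paper actually used is exactly what i would like to confirm.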

i doubt whether this part was really implemented, and would love to reproduce it if you could provide more details. it looks non-trivial to support in the verl pipeline.

thank you!

MISSING pre-processing / training scripts for code generation #58
